Archive for the Tips and Tricks Category

ODI Hidden Gems – Unique temporary object names

Posted in Gems, ODI, Tips and Tricks with tags , , on March 11, 2021 by Rodrigo Radtke de Souza

Hi all, I was not going to write about this one because I thought that this “hidden” gem was already known to every single ODI 12 developer out there, but I still get questions on why sometimes some specific data loads fails when they run in parallel, and they work fine when they run in serial. Most of the times, those are related to how ODI handles the temporary objects that it creates to do ETL (like C$, I$, E$ tables).

Let us see one example, which is the default in ODI. I have created one very simple mapping that has one source and one target table.

This mapping is loading data from two databases that resides in different data servers, so it will need to create a C$ table to be able to transfer the data. If we look at ODI Operator, we will notice the following:

The C$ table that it created is named as C$_0SECTIONTYPE. By default, ODI will create this name based on the source component that the data was generated from, so in this case it was a table called SECTIONTYPE. The “0” in front of it is an incremental number that would increase if you had another source with the same name in the mapping. For example, if you had SECTIONTYPE mapped twice as source tables, one would be loaded as C$_0SECTIONTYPE and the other C$_1SECTIONTYPE. ODI does that so we do not have a clash between names within the same ODI mapping.

However, what would happen if you tried to run the same mapping or another mapping that also contains SECTIONTYPE as source at the same time? As you may imagine, one mapping would interfere in the other, since both C$ tables would be called C$_0SECTIONTYPE and both mappings would be trying to load/read/drop it at the same time, which would cause a failure (in a good scenario) or wrong data (in a bad scenario).

To avoid this kind of issue to happen, ODI 10/11 developers were very creative in the past and would create some Java variables and some tweaks in some ODI KMs to make the temp table names dynamic. However, ODI 12 introduces something way simpler to handle this kind of situation, however, its not default and it is kind of “hidden”. If you go back to the mapping, click on the Physical tab, and scroll all the way down. You will notice a check box that says “Use Unique Temporary Object Names” that is unchecked. If you check this one and run it again, you will see the difference.

ODI now created a table named C$_0SECTIONTYPEAIHLNQMAVPK7Q1FR66UM225DF3, which is totally unique, and it will never clash with another mapping running in parallel. But then, another question arises: should I go to each mapping to check this option, if I want all to be unique? Well, the answer is no. There is another “hidden” gem that you can use.

Go to ODI Topology and double click your ODI agent, then go to Properties.

Luckily, you may enable the use of unique temporary objects at an ODI agent level, so you do not need to go back to each mapping and changing them.

That is it for today. See ya!

Advertisement

Generate JSON objects with ODI in a very easy way

Posted in ODI, Tips and Tricks with tags , on November 9, 2020 by Rodrigo Radtke de Souza

Hi all. I came across a requirement to create Json objects from a set of Oracle tables. We have ODI, so it was natural that the solution would be created there. However, working with JSON objects in ODI is not that easy. For those who already worked with “Complex File” technology knows what I am talking about. Too much setup and any misconfiguration will cause an error that is generally very hard to troubleshoot.

In the past, I did several mappings that would read from Complex Files (including Json), but this time it was an outbound process, so I would need to create the Json objects, not read from them. I tried to search on the internet, but nothing was clear. I was not sure if the Complex File technology would work to create outbound files and I was not in the mood to play with XSD files this time, so I needed to find some other solution.

Talking to a friend/co-worker of mine, he asked me if we could not leverage the Oracle’s JSON_OBJECT function somehow in ODI. First, I did not know what that function was about, so I researched about it:

The SQL/JSON function JSON_OBJECT takes as its input one or more property key-value pairs. It returns a JSON object that contains an object member for each of those key-value pairs.

It looked very interesting. So, I give it a try in SQL Developer. It worked very well:

Basically, you may add any number of columns to the function and it will create a valid JSON object out of it. Pretty neat, right? Now it is just a matter of how to add it to ODI, which thankfully is very easy. First, I created a model in the File technology:

It contains only one column, with 4000 characters. Then I created a mapping and mapped all the required columns to the JSON_OBJECT function, like this:

That’s it. Pretty simple. When we run the mapping, we have an outbound file like this:

Now you may send the outbound file to any application that needs to consume JSON objects. One thing that you may need to consider is that you will probably get into trouble if your Json objects ends up being larger then 4000 characters each. I didn’t test it, but either Oracle or ODI will probably complain about it.

I hope you have liked it. This is not a “traditional” way to create Json objects, but for sure it is the easiest one! See you next time!

Oracle SQL for EPM Tips and Tricks S01EP14

Posted in ODI, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags , , , on October 15, 2020 by RZGiampaoli

Hey guys how are you? Continuing our SQL series (S01EP13), today I’ll share a very hand little query that I use very often for check data duplication. In fact, this would be an upgrade version of ODI’s pk check.

An upgrade version because in ODI, if you enable PK check, if he finds duplication, he eliminate both data. This code I’ll show you, you would choose if you want to keep the last created duplication or the oldest one, but only one will be eliminated.

I have a test table with this values:

If I want to check for duplicate PK, I can just run this query here:

The Idea here is, we have 2 queries. The first one will check if the ROWID it has is bigger or smaller (you choice) than the MIN or MAX ROWID (depending of your previous choice) than the second sub query by any joins you want to check.

In this case, we wanted to check only if the PK column had duplicated values, but we could check any other column by just replace it in the join. In fact, we could have any amount of columns in the join and that would check if there’s any duplications in all columns you inserted there.

Then you can select the first using > and MIN or the last by using < and MAX as well you can select what column you want to check in the where clause.

One important thing to mention is that this query is meant to work as a delete because it’ll keep what was not in the select. What I mean is, if you have more than one duplication, it’ll bring, in this case, all the rows that has the ROWID > then the one selected in the first query:

Then if I have multiple duplications, the query will return everything that needs to be deleted and the only one remaining was the first one inserted (3, Chuck, Giampaoli).

I hope you enjoy this little trick and see you soon.

Oracle Always Free cloud offering (Part 1)

Posted in ACE, Cloud, Oracle, Oracle Database, Tips and Tricks with tags , , , on May 6, 2020 by RZGiampaoli

Hey guys how are you?

I decide to do some posts about Oracle Always free offering, how it works, how you setup things, a few things we can do with that and maybe more. I think is fair for us to start by what’s it and what you need to do to get one.

Basically Always Free is a services for anyone that wants to try the world’s first self-driving database and Oracle Cloud Infrastructure for an unlimited time. The ideas is let people explore the full functionality of Oracle Autonomous Database and Oracle Cloud Infrastructure, including Compute VMs, Block and Object Storage, and Load Balancer, all of the essentials for developers to build complete applications on Oracle Cloud. 

Oracle’s Free Tier program has two components:

  • Always Free services, which provide access to Oracle Cloud services for an unlimited time
  • Free Trial, which provides $300 in credits for 30 days to try additional services and larger shapes

The new Always Free program includes the essentials users need to build and test applications in the cloud: Oracle Autonomous Database, Compute VMs, Block Volumes, Object and Archive Storage, and Load Balancer. Specifications include:

  • 2 Autonomous Databases (Autonomous Data Warehouse or Autonomous Transaction Processing), each with 1 OCPU and 20 GB storage
  • 2 Compute VMs, each with 1/8 OCPU and 1 GB memory
  • 2 Block Volumes, 100 GB total, with up to 5 free backups
  • 10 GB Object Storage, 10 GB Archive Storage, and 50,000/month API requests
  • 1 Load Balancer, 10 Mbps bandwidth
  • 10 TB/month Outbound Data Transfer
  • 500 million ingestion Datapoints and 1 billion Datapoints for Monitoring Service
  • 1 million Notification delivery options per month and 1000 emails per month

Well, if you ask me this is far better than install an Oracle XE in your machine and configure everything there for you to learn or to create some small app. In fact, if you want to learn, it’ll far better if you start learning in an cloud environment since everyday we have more and more companies migrating to cloud.

Ok, but what do you need to do to get one? In fact is very easy, you just need to access this link and click in the Start for Free button. After that you have to fill a short form where you need to inform:

  • Your email and user information like address and cellphone
  • You need to validate your cellphone through message (Oracle will send a code to your cell)
  • You need to choose the region you’ll going to have you OCI (Oracle Cloud Infrastructure)
    • This needs to be as close as possible as your real region to decrease latency and improve network performance
    • Some regions are not available for always free (it’s written next to the region name if is available or not)
  • And you need to add a credit card to your account
    • You’ll not be charged but you may see 1 Dollar/Euro/… getting charged but it’ll be return
    • Also, Revolut card don’t work, you need a proper credit card.

And that’s it, Oracle will create your account (in fact takes around 15 minutes until you receive a email with further instructions [Bare in mind that because the COVID-19, it’s taking several days to create a new account]). After you receive your email, you can login in your dashboard and start to create your network, Disk, Database, VM’s and more.

We’ll see how to configure a database in my next post. I hope you enjoy this and see you soon.

DEVEPM in Kscope20!

Posted in Essbase, Hacking, Kscope 20, ODI 12c, ODI Architecture, Tips and Tricks with tags , , , , on March 2, 2020 by RZGiampaoli

We are delighted to tell everybody that we’ll be at KScope 20 in Boston.

We’ll be presenting the session Essbase Statistics DW: How automatically administrate Essbase using ODI on room 304, Level 3 => Wed, Jul 01, 2020 (02:15 PM – 03:15 PM). In this session we’ll talk about how to keep Essbase configuration up to date using ODI. We also will review

To have a fast Essbase cube, we must keep vigilance and follow its growth and its data movements, so we can distribute caches and adjust the database parameters accordingly. But this is a very difficult task to achieve since Essbase statistics are not temporal and only tell you the cube statistics are in that specific time frame.
This session will present how ODI can be used to create a historical DW containing Essbase cube’s information and how to identify trends and patterns, giving us the ability to programmatically tune our Essbase databases automatically.

We hope to see you all there.

Thank you and see you soon!

Oracle SQL for EPM Tips and Tricks S01EP10

Posted in ACE, Hacking, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags , , , , , , on February 26, 2020 by RZGiampaoli

Hey guys how are you?

Today a quick tip that I think is very useful. From time to time the business ask us to validate if a table has data or not before we load it. It’s fare, specially if you use a truncate and insert approach.

The problem is, sometimes, the table/view they are asking for has millions of rows, and there’s no other safe way to validate if a table has data or not than querying it.

I just fixed a case where an interface had a validation that basically counts 3 different tables that together had 40 million rows per period. This validations were taking around 1000 sec to happens.

The data load that happens before that took 1200 sec. Then, basically the validation process were taking as much time as the load process.

After some changes, the query now is validating the 3 tables in 0.3 seconds. Way better than before. Basically I just used 3 things:

The hint /*+ FIRST_ROWS(1) */ that makes oracle prepare the best plan to query just one row (in my case since I used 1 as parameter.

The filter ROWNUM = 1 to make sure oracle just return 1 row, if we don’t use that, the hint can make everything very slow because oracle will be planning for just one row, but without filtering it’ll bring more (using the best plan possible for 1 row).

And UNION ALL instead of UNION, because there’s a huge difference between them. when you use UNION, oracle matches the sets of data to make sure you have unique rows after that. UNION ALL in other case, just bring everything each set return without any extra process to validate anything. UNION ALL is always faster than UNION.

In the end I have an query like this:

As you can see, the query is very simple and for this example I just had the name of the table there, then we know the table is not empty for that period. We can do other approach like summing then all together and validate if the results is = 3 for example or any other logic we need can be implemented on top of this query.

I hope this is helpful for you guys and see you in the next post.

Oracle SQL for EPM Tips and Tricks S01EP09

Posted in ACE, Hyperion Planning, Oracle, Oracle Database, Performance, Tips and Tricks with tags , , , , , , on February 19, 2020 by RZGiampaoli

Hey Guys how are you?

Today it’ll be a quick tip for you guys that like/need to query the Planning repository.

The Planning repository stores both the Plan Type and the Consolidation in a very particular way, in fact this is true for a lot of other things like security, form properties etc… but I’ll focus these 2, that are the more often used and the solution is the same for all of them anyways.

If we take a look in the HSP_PLAN_TYPE table we’ll have something similar to this (depending in how many plan types you have in your app).

As we can see Planning stores in this table all the plan types that were created when we setup the application. In my case I have 4 plan types and we can have up to 5 BSO plan types in a Planning app. Now, if we join the HSP_OBJECT and the HSP_MEMBER filtering the OBJECT_TYPE = 2 we can take a look in all the dimensions we set in the repository.

The USED_IN columns is the column that says to planning in witch plan type that member will exists. The interesting thing here is that, you don’t see the PLAN_TYPE ID that you supposed to right? And that is because a member can exists in more than 1 plan type right, and if we use the PLAN_TYPE ID straight, we would need one row for each plan type right?

Instead, we have just one row but we also have the ability to tell Planning where that member should exists, and we can do that by summing the PLAN_TYPE ID’s together. In the example above, the Account dimension exists in all 4 plan types (1+2+4+8 = 15). Now the Products dimension exists only in one plan type (1), and by the number you can say that is the Pnl Plan type.

you seen the idea here is to check if a PLAN_TYPE ID exists inside that number we have here in the USED_IN column. Another example is the Employee dimension that has the USED_IN set as 8. The only number that will fit in here is the 8 itself (1+2+4 = 7, 1+8 = 9…) meaning the Plan type is WrkForce.

I think the most used way for us to figure out if a number exists inside another number is to use MOD.

  • CASE WHEN MOD(USED_IN,2)>=1 THEN 1 ELSE 0 END PT_1
  • CASE WHEN MOD(USED_IN,4)>=2 THEN 1 ELSE 0 END PT_2
  • CASE WHEN MOD(USED_IN,8)>=4 THEN 1 ELSE 0 END PT_3
  • CASE WHEN MOD(USED_IN,16)>=8 THEN 1 ELSE 0 END PT_4
  • CASE WHEN MOD(USED_IN,32)>=16 THEN 1 ELSE 0 END PT_5

The Oracle MOD(N,M) is used to return the remainder of a dividend divided by a divisor where:

NDividend.
MDivisor.
Pictorial Presentation of MOD() function

Then in our case, we need to test if the USED_IN number contains the PLAN_TYPE ID on it, and for that we need to MOD it by rolling sum of the plan types + 1. To make it easier I’ll put that in numbers:

  • N = USED_IN = 31 (max number possible)
  • M = PLAN_TYPE ID = 1 (Pnl) what I want to test) + 1 = 2
  • MODE (31, 2) = 1
  • 31/2 = 15 Reminder = 1
  • MOD = 1

What that is telling us is that if the MOD is = 1, the Plan type 1 exists in that number. I run a simulation just to show us when the Plan Type 1 does not exists in the USED_IN:

As we can see, the Plan Type 1 only exists in the odd possible results (as expected) what means in any other possible combination of the other 4 plan types he doesn’t exists (2, 4, 2+4=6, 8, 8+2=10, 8+4=12, 8+2+4=14, 16, 16+2=18, 16+4=20, 16+2+4=22, 16+8=24, 16+8+2=26, 16+8+4=28, 16+8+2+4=30).

The same is true for the other Plan types, you can try then out using the MOD. Now, this work well but there’s a way easier and clean way to do exactly the same thing using the function BITAND.

The BITAND function treats its inputs and its output as vectors of bits, the output is the bitwise AND of the inputs. Basically it performs below steps.

  • Converts the inputs into binary.
  • Performs a standard bitwise AND operation on these two strings.
  • Converts the binary result back into decimal and returns the value.

Ok, it looks more complicated now, but the good news is that to use is simpler than it sounds like. The main difference between this function and MOD is that MOD returns a boolean, BITAND return the value you asked if it’s true. Expanding my previous test using BITAND:

As you can see, with BITAND returning the number you asked for instead of 0 or 1 make it possible for us to Join the HSP_PLAN_TYPE with HSP_MEMBER using the USED_IN and the PLAN_TYPE in the BITAND Function as a Join:

As you can see, this is a far better way to split the members by Plan Type. And now we can see that the Dimension Products only exists in the Plan Type Pnl and that Entity exists in 4 different plan types. We don’t need to worry about any mathematics formula to create all our MODs, we just need to Join our Plan Type table with the BITAND of USED_IN by PLAN_TYPE.

The Consolidation is another place where you can use the exactly same thing. Instead of using something like this:

  • DECODE(MOD(CONSOL_OP,8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON1
  • DECODE(MOD(TRUNC(CONSOL_OP/8),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON2
  • DECODE(MOD(TRUNC(CONSOL_OP/64),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON3
  • DECODE(MOD(TRUNC(CONSOL_OP/512),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON4
  • DECODE(MOD(TRUNC(CONSOL_OP/4096),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON5

You can create a parameter table with the values for each kind of consolidation and use BITAND with that info:

I hope you this will be useful for you guys and see you soon.

Fragmented and Aggregated tables in OBIEE using ODI Part 5/5: Setting the OBIEE Repository

Posted in Hacking, OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Tips and Tricks with tags , , , , on February 13, 2020 by RZGiampaoli

Hey guys, how are you?

Finally, we have arrived in the final chapter of the series Fragmented and Aggregated tables in OBIEE and today we are talking about how to Setting the OBIEE Repository.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

This post does not intend to be a step by step how to create an OBIEE repository for beginner or anything like that. My intend is to show the main points that we need to do to make our infrastructure to work in OBIEE. Also, I’m working in OBIEE 12c but this will work in the same way in OBIEE 11 too.

Let’s start then from the beginning. After we import all the tables to our repository the first thing, we need to do is to create the joins between the Dimensions and the Fact tables.

Right now, we have an important point to discuss about constraints. We can have the tables create with Primary Keys and Foreign Keys if you want, as well as not null and any other constraints you wish. The thing is, these things normally impact negatively in the data load times and since we are using ODI, we can have ODI to handle this kind of thing during the data load.

Instead of have a PK or an FK we can have a Flow control in ODI checking the metadata before load it. I always prefer this approach for the simple fact that ODI will generate an E$ table with all fallouts for me automatically, and this is very helpful for debugging.

In my case, I left the table without any constraints or Keys, then the first thing I need to do is to join all our star schema together. Since we have 18 table, all table needs to be joined to all Dimensions in the same way except the Period dimensions.

The Period Dimensions will tell OBIEE what is the set of tables he needs to query. If a user does an analysis in a quarter level, with our design, OBIEE must query only the Quarterly aggregated tables. That’s why we have 3 period dimensions, one for each level of aggregation.

For the DIM_PERIOD (the detailed dimension) we’ll going to join it with all detail Fact tables. As you can see, we joined with 3 “D” tables (BS, Income, PL2) and with the other 3 “E” table (same as before).

For the DIM_PERIOD_MONTH we’ll going to join it with all Monthly Fact tables. As you can see, we joined with 3 “D” tables (in the “M” level) and with the other 3 “E” table (also in the “M” level).

And for the DIM_PERIOD_QUARTER we’ll going to join it with all Monthly Fact tables. As you can see, we joined with 3 “D” tables (in the “Q” level) and with the other 3 “E” table (also in the “Q” level).

This is the first step to make OBIEE work with Aggregated tables. The second and last step we need to do is in the Business layer.

After we finish to join everything (if you have all FK’s in place, you’ll not need to do the joins, OBIEE will load then for you) we can start to do our final settings in the Business layer. In this layer is where we’ll going to tell OBIEE how to behave in front of the aggregate tables and the fragmented tables as well.

First, let’s address the Period Dimension. We’ll drag and drop the more detailed dimension first (DIM_PERIOD) and then we’ll going to drag and drop the other 2 period dimensions on top of the first one. This will create 3 sources in that logical dimension.

If you click in each source, you’ll see that OBIEE will automatically map the columns (By Column Name, then all columns must have the same name [case sensitive]).

As you can see, OBIEE maps the columns available in each dimension, making the Fiscal Quarter column for example, have 3 different sources, one for the DIM_PERIOD_QUARTER, another for the DIM_PERIOD_MONTH and one last one for the DIM_PERIOD.

The next thing we need to do is create a dimension for the DIM_PERIOD logical table. This is the last step needed for OBIEE decide which table it’ll query depending of the analysis created. As I said before, if the user does an analysis at quarter level, OBIEE will know by the DIM_PERIOD dimension and the Table sources that the smaller table to query is the DIM_PERIOD_QUARTER, because it’ll be in the beginning of the Drill path.

OBIEE knows for the design of the drill that the Years level has less members than the Quarter level and so on. That’s how OBIEE defines the aggregate table he’ll query.

The last thing we need to do is in the fact table, and it’ll be done at same time we and in the same place we set the fragmentation content. For the Fact tables we’ll do the same thing as the Period. We’ll drag any Fact table first and then we’ll going to drag all the other 17 tables on top of it like this:

As you can see, we have all sources under the same logical table and in the same way of the DIM_PERIOD, OBIEE will map all columns to the right source. In my case you can see that the Details Sources has more columns than the Aggregated Source (as expected).

At this point is important to point out that OBIEE will always going to try to get the most aggregated table possible but, if an user does an analysis at quarter level but ask for a column that only exists in the Detail table, OBIEE will be obliged to query the detail level and ask the database to aggregate the data for us (making the query slower).

Now, we have only one more thing to do for our architecture to work. We need to define which fragmented table OBIEE will access depending of the Source System and the Account hierarchy name. To do that, we’ll have to add a very simple parameter, that can be very complex if we don’t design well, to the Sources in the fact table.

Inside each Source we have a tab called “Content” and in that table we can specify some very important things:

First, we can/need to specify the Logical level that will be used for each dimension in relation to the fact table. What I mean for that is, for example, for the detail table, every dimension will be using the Detail level of the Dimensions (leaf level) as we can see in the image above. For the Monthly level Fact table, instead of the leaf level, we’ll be using the monthly level of the Period Dimension. That’s the last piece of configuration for the aggregated tables. With this setting OBIEE will know that for that Level of Dimension, he should be using the fact that have the logical level set as Month.

The second important thing we need to set in this tab is the fragmentation filter, and by that we have a field called Fragmentation Content. In this section we’ll going to use a Dimension or more to filter the content. What OBIEE does in this case is, depending of what is selected in the analysis, it’ll select one or more table to query.

For example, in our case we want to, when the Account HIER_NAME is equal to “BS” we want OBIEE to use only the BS tables, if is “INCOME” the use the INCOME tables and lastly if is “PL2” he needs to use the PL2 tables.

It’s nice to know that you don’t need to have the column you want to use in the fact tables, for example, the HIER_NAME column is the highest level of the Account Hierarchy and we don’t have any information regarding this in the fact table. OBIEE just read the Filter and select the right table.

Another very important point about the fragmentation content is that, in cases that you have more than one option, you need to do all possible combinations for that to work properly. For example, if we are doing fragmentation with 2 dimensions, like we are doing, and the dimension A has the values A, B and C and dimension B has values 1, 2 and 3, if the user can select more than 1 value you need to do something like this:

(Dimension A = A and Dimension B = 1) or (Dimension A = A and Dimension B = 2) Or…..

You need to have all possible combinations because in this setting if you say something like Dimension A in (A, B, C) this will only be valid if the user select all 3 values in the dashboard. If he selects just A and B, this filter will not be used.

Then in our case, for simplicity, I had to create an UDA for the Source System otherwise I would have to create all possible combinations between Hier_Name and Source System. Then In my DIM_SOURCE_SYSTEM I have something Like this:

As you can see, the UDA split my Source Systems in the same way I split the data in the table. In the E tables I have just EMC data and in the D tables I have DELL, DTC and STAT data. This allows me to do a simple filter in the Fragmentation Content filter making our lives way easier.

The third important thing is that, in our case, since we can have in an analysis 2 or more sources at same time, for example, the user can select the Source System Dell and EMC, we need to flag the option “This source should be combined with others at this same level”.

This will make OBIEE ALWAYS create an UNION ALL between at least one D table and one E table, even if the user select just EMC for example, we’ll have the UNION ALL between the same level (Month for example) with the filter Source System = ‘EMC’, making the result set return just EMC data.

If we don’t flag this option, OBIEE will never have 2 fragmented table at same time, and that’s not what we want here.

Then basically we have 3 configurations to do in our 18 sources. Looks a lot but is very simple in the end. I create a color code to try make it easier for us to see all the configurations in our source. Yellow is the configuration regarding the Source System, Green is related with the Account Hier_Name and Red is regarding the level of the aggregated data.

As you can see, we have our 3 configurations combined in our 18 sources.

  • Period Aggregation:
    • For detail Fact table we assign the Leaf level of periods;
    • For Month Fact table we assign the Month level of periods;
    • For Quarter Fact table we assign the Quarter level of periods;
  • Account Fragmentation:
    • For BS Fact table we filter HIER_NAME = ‘BS’;
    • For INCOME Fact table we filter HIER_NAME = ‘INCOME’;
    • For PL2 Fact table we filter HIER_NAME = ‘PL2’;
  • For Source System Fragmentation:
    • For EMC Fact tables (E tables) we filter UDA = ‘E’;
    • For Dell, DTC and STAT Fact tables (D tables) we filter UDA = ‘D’;

And that’s all we need to do to config OBIEE for this architecture. It’s looks overwhelming but in fact is very simple and very fast to do it, and the performance gains are absurd. With this approach I can query 15 quarter of data in the quarter level in 5 seconds. Billions of data in 5 seconds, it’s a lot.

One thing that I would like to mentioning is that normally in the Business Layer is where I rename all the columns for a more business friendly. In this case I decide to do a little test and I left all the names in the same way it’s in the Physical Layer and decide to create Aliases in the Presentation layer. I did that for 2 very simple reasons, one is that it’s easier to just drag and drop staff from the Physical Layer to the Business Layer if everything has the same name. If things don’t match, he duplicates columns, you need to drag and drop column over column, one by one and it’s a lot of work. Second because I wanted to test if this approach is better than my old one or not.

I don’t have any opinion about that yet and in fact, I could had renaming everything and if I need to expand to 36 table for example, I could rename back the columns, do all the mappings and rename back again, then not sure what’s the best approach on that.

It was way more work to rename stuff in the Presentation Layer because the Rename Wizard doesn’t create aliases, then I had to manually rename column by column then I still not sure about this approach.

And this is the end of our Fragmented and Aggregated tables in OBIEE using ODI. I hope this is helpful and see you in my next post.

Fragmented and Aggregated tables in OBIEE using ODI Part 4/5: Populating the Aggregated tables

Posted in Hacking, OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Tips and Tricks, Uncategorized with tags , , , , on February 12, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE using ODI and today we are talking about how to Populating the Aggregated tables using ODI.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

Today we are in the final step before we can work in our OBIEE repository to put all these 18 fact tables together. The data load for our Aggregated tables.

The aggregation in fact is a very simple process, we just need to remove all detailed columns we have in the detail fact table and leave just the ID’s columns. After that we just need to reduce the level of the ID of the column we want to aggregate and sum all the data at that new level.

In our case we going to use the PERIOD_ID to do that, because period is the most common choice when we talk about aggregated table and serve well in most of the cases.

And that’s why I design the PERIOD_ID as YYYYQMMDD, because is very simple and easy to manipulate this number to go up or down a period, as well to do range or even transform it back to date. It’s way easier than create a surrogate key or whatever, even if you want to work with SCD.

As you probably already guest by now, we’ll use the command on source and command on target again to do the aggregations, this way we can have only one code to spread the data through out aggregate fact tables.

In the command on source for the monthly level table, we just need a query in the source that return the name of the detailed table plus the name of the monthly table. Since I designed all tables with a specific name pattern, we can easily manipulate the table name to get the month table from the detail table like this:

We don’t need anything fancy here because in the last post we create a proc to call six time the same scenario passing different parameter to it.

One of these parameter is the name of the fact table it needs to be loaded and this information is store in the variable #JAT_TABLE_NAME (already replaced by the value in the picture) and what I have done there is just split the table name using REGEXP to get the forth and the fifth occurrences of ‘_’ and concatenate everything back adding a _M_ in the middle of it, creating my Monthly level fact table name.

With data we have the detail table name that the scenario needs to load and also the monthly level fact table name that we need to use for that loop. We just need to create a query to aggregate the data, what’s very straight forward.

As the query will change depending of the design of the table, this one is just intended for explain what needs to be done since the idea is very simple. First we replace the name of the table that  we wish to insert data and the table that we wish to get the data from for our to variables: #JAT_TABLE_NAME that we are sending when we call this scenario and #JAT_M_TABLE_NAME that we just create in the Command on Source tab.

With this, for each loop, the scenario will get the data from the right source table and will insert in the right aggregated table. Now we need to aggregate the data, also a very simple matter. All we need to do is to join the detailed fact table with the period dimension and, because this is a range partition, we need to get the first date of that month and the last date of that month, that’s why we have that MIN and MAX filtered by the Year and Month.

With the right data filtered, we just need to aggregate the data and use the FISCAL_MONTH_ID instead of the PERIOD_ID, this way the data will be aggregated by month. And we are done here.

By the way, we could instead of using the between to get the right range of the partition filtered the partition itself using explicit partition filtering:

This will make oracle to go straight to that partition. The only thing is that you need to pass the partition name, that in our case is very straight forward, specially because we are creating and managing the partitions ourselves. We could have oracle create the partitions automatically using INTERVAL (for another post) but if we do that oracle will also create the names of partitions like SYS###### and that will make everything harder to filter by partitions. Anyway, this is just another option we can have.

For the quarter level it’s even easier because we don’t need to worry about range partitions anymore. We just need to have our Command on Source return the Monthly level table name and our Quarterly level table name:

As you can see, the query is the same, the only difference is that we insert a ‘_Q_’ in the middle of our string. And for the insert in the Target Command tab we just need, as before, replace the tables using the right variable in each case, join with the DIM_PERIOD_MONTH to have the right level of data, filter the Monthly level table using any method and then use the QUARTER_ID to sum the Monthly level data to the Quarterly level data and that’s it.

That’s all we need to do to populate all aggregated table. And we finally have all data populated in our tables and now we can start to create the OBIEE repository. It’s wort to mentioning that if you resume everything that I said until this point, we basically had to:

  • Create our tables;
  • Create one procedure to:
    • Create the partitions;
    • Truncate the partitions before we load data;
    • Call and loop the scenario that will load data;
    • Drop the old partitions
  • Create another procedure to:
    • Load detail data;
    • Load Monthly level data;
    • Load Quarterly level data;

What this means is that with just two procedure we can maintain and populate a very complex environment in a very clean way with very little points of failures.

This is a very elegant approach and as I always said, if is too complex and/or difficult, you were doing something wrong. It is just a meter of think about the design first, all that needs to be done, and split it in a logical and reusable way. If you do that, everything will be simple to be archived using ODI.

I hope this series has been useful so far and I see you in the last chapter.

Fragmented and Aggregated tables in OBIEE using ODI Part 2/5: Managing the partitions of Fragmented/Aggregated tables

Posted in OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags , , , on February 10, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE and today we are talking about how to Populating the Fragmented tables using ODI.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

In my previous post we design 18 tables and its partitions and sub-partitions. Now what we need to do is to make ODI manage these partitions for us. As I said before, the Sub partitions will be static and if we need to create one more sub-partitions, we’ll be doing that manually. That’s always the best choice because manage sub-partitions add a layer of complicity that is not needed.

With that said, if you need to manage the sub-partitions, you can use the exactly same approach I’ll show here just adapting to the needs of the sub-partition’s creation and sub-partition template update.

Let’s start analyzing what needs to be done. For each table we need to issue this command to the database:

For List partitions:

               ALTER TABLE table_name ADD PARTITION partition_name VALUES (value[, value]…);

For Range Partitions:

               ALTER TABLE table_name ADD PARTITION partition_name VALUES LESS THAN (value[, value]…);

Sounds hard but in fact is very simple and it’ll going to work in any SQL you have. To do that this we will use a procedure with our ALTER TABLE statement in the Command on target tab and a SELECT in the command on source to pass the information we need to create the partitions like the table names, the partition name and the partition values.

For each row that the SELECT on source returns, the statement in the Command on target will be executed, and we can pass the values from the SELECT in the Command on Source to change the behavior of the statement in the Command on target.

This is something very important that we need to understand about ODI. ODI It’s not an ETL tool but a code generator tool that orchestrate Database and a lot of other tools if you will.

What I mean by that is, until ODI send the code for the database to execute, everything that ODI does is to generate code. If we use the ODI API it’ll replace it before send to the database as well to replaces the variable with their values, meaning a variable like this #SQL on the Command on Target tab could contain an entire dynamically generate INSERT that ODI would replace before send it to the Database.

And this is why ODI is so powerful, because we can manipulate string and make everything dynamic.

Then our query in the source should return something like this:

The query will change depending of your design of the application then I don’t think sharing my query will going to help. One tip I can give you is always compare the name of the partition with the content of the column PARTITION_NAME on ALL_TAB_PARTITIONS view to see if the partition you want to create already exists to prevent an error in ODI, in fact you have 2 choices, either do this or flag the step as Ignore Error to end the step with a warning. I prefer to compare and end the step without warnings, then I compare.

Another thing to notice is that for FY20 Feb, my PARTITION_VALUE (MAX_PERIOD) is the 1 day after the last day of the month. It’s hard to see here because the calendar used by the client is all different but if you have the normal calendar and the mask you have for you PERIOD_ID is YYYYMMDD, for 2020/Feb you partition ID needs to be 20200230 (Feb ends 02/29/2020).

That’s because this is a RANGE partition, and for RANGE partition we define the partition value as VALUES LESS THAN, what means that everything under 20200230 will be FEB unless we have a smaller partition (JAN) below it. But for example, if you smallest partition is 2018/Jan, everything below that period will be inserted in the 2018/Jan partition, then you need be careful with when you’ll start to create partitions.

Another important thing to mention is that you cannot insert a partition smaller than an existing partition, just bigger. That means, if you start with 2018/Jan, it’s impossible to create a 2017 partition, the only way is to drop the table and create the table starting by 2017. A RANGE partition needs to be sequential and always growing.

Having these values returning from the SELECT on the Command on source tab, the only thing that is left is to use these variables in the statement in the command on Target tab like this:

ALTER TABLE #TABLE_NM ADD PARTITION PARTITION_NM VALUES LESS THAN (#MAX_PERIOD);

Pay attention on the values of the partitions. In this case the values are numeric, that is why I’m not using quotes. If the values were String, we need to enclosure the variable with quotes.

And that’s it, this is all we need to do to manage partitions using ODI. Because the SELECT on the source returns All tables with their partitions name and values and we pass to the TARGET statement this information, for each row in the source the target will be looped, creating the partitions you need for each table.

We are done for the fragmented tables. Now we need to manage the partitions for the Aggregated table. The approach is the exactly the same, the only thing that will change is the content of the query from the Command on Source tab and that the PARTITION now will be a LIST Partition instead of a RANGE partition.

We could have done just one query to return all values including if that was a RANGE or a LIST partition, but for the sake of simplicity is always better to split the scenarios to not over complicate something simple.

For the month level we need the query in the Source to return:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM ADD PARTITION PARTITION_NM VALUES (#MAX_PERIOD);

That’s all that needs to be done. For the quarter level the results are similar:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM ADD PARTITION PARTITION_NM VALUES (#MAX_PERIOD);

As you can see, it’s very simple to manipulate and manage the database with ODI, we just need to create a query with the information we want in the source, and replace the target statement with that information, that’s all we need to do.

Another important thing that we need in our procedure, in case you want to go with the truncate/insert approach, is to truncate the partitions before load. This is also managed by ODI and also works in the exactly same way as the CREATE PARTITIONS but a little bit simpler since we don’t need to have the partitions values:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION PARTITION_NM;

For month level:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION PARTITION_NM;

And for the Quarter Level:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION PARTITION_NM;

Also, could be done with just one query but again, simplicity works always better than complexity.

This is all we need to start loading data. One last thing we could do, if is you case, is to create a purge process to drop old partitions. If your business requires 5 years of data, it’s a good idea to drop older partitions. The approach is exactly the same

In fact, the results of the Command on source are exactly the same as the ones we need for the TRUNCATE PARTITION steps, the only difference is that you need to have in place a logic to find the old partitions. Other than that, the only thing that changes are the Statement on the Target that should be:

ALTER TABLE #TABLE_NM DROP PARTITION PARTITION_NM;

And that’s it, we just create a procedure to manage the partitions for all our 18 tables including create, truncate and drop old partitions.

In the next part of this series we’ll going to talk about populating the Fragmented tables.

I hope you find this helpful and see you soon.