Performance | DEV EPM

Archive for the Performance Category

Oracle SQL for EPM Tips and Tricks S01EP14

Posted in ODI, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags Oracle ACE, oracle Database, SQL, Tips and Tricks on October 15, 2020 by RZGiampaoli

Hey guys how are you? Continuing our SQL series (S01EP13), today I’ll share a very hand little query that I use very often for check data duplication. In fact, this would be an upgrade version of ODI’s pk check.

An upgrade version because in ODI, if you enable PK check, if he finds duplication, he eliminate both data. This code I’ll show you, you would choose if you want to keep the last created duplication or the oldest one, but only one will be eliminated.

I have a test table with this values:

If I want to check for duplicate PK, I can just run this query here:

The Idea here is, we have 2 queries. The first one will check if the ROWID it has is bigger or smaller (you choice) than the MIN or MAX ROWID (depending of your previous choice) than the second sub query by any joins you want to check.

In this case, we wanted to check only if the PK column had duplicated values, but we could check any other column by just replace it in the join. In fact, we could have any amount of columns in the join and that would check if there’s any duplications in all columns you inserted there.

Then you can select the first using > and MIN or the last by using < and MAX as well you can select what column you want to check in the where clause.

One important thing to mention is that this query is meant to work as a delete because it’ll keep what was not in the select. What I mean is, if you have more than one duplication, it’ll bring, in this case, all the rows that has the ROWID > then the one selected in the first query:

Then if I have multiple duplications, the query will return everything that needs to be deleted and the only one remaining was the first one inserted (3, Chuck, Giampaoli).

I hope you enjoy this little trick and see you soon.

Leave a comment »

Oracle SQL for EPM Tips and Tricks S01EP10

Posted in ACE, Hacking, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags Hacking, Oracle, Oracle ACE, oracle Database, performance, PL/SQL, SQL on February 26, 2020 by RZGiampaoli

Hey guys how are you?

Today a quick tip that I think is very useful. From time to time the business ask us to validate if a table has data or not before we load it. It’s fare, specially if you use a truncate and insert approach.

The problem is, sometimes, the table/view they are asking for has millions of rows, and there’s no other safe way to validate if a table has data or not than querying it.

I just fixed a case where an interface had a validation that basically counts 3 different tables that together had 40 million rows per period. This validations were taking around 1000 sec to happens.

The data load that happens before that took 1200 sec. Then, basically the validation process were taking as much time as the load process.

After some changes, the query now is validating the 3 tables in 0.3 seconds. Way better than before. Basically I just used 3 things:

The hint /*+ FIRST_ROWS(1) */ that makes oracle prepare the best plan to query just one row (in my case since I used 1 as parameter.

The filter ROWNUM = 1 to make sure oracle just return 1 row, if we don’t use that, the hint can make everything very slow because oracle will be planning for just one row, but without filtering it’ll bring more (using the best plan possible for 1 row).

And UNION ALL instead of UNION, because there’s a huge difference between them. when you use UNION, oracle matches the sets of data to make sure you have unique rows after that. UNION ALL in other case, just bring everything each set return without any extra process to validate anything. UNION ALL is always faster than UNION.

In the end I have an query like this:

As you can see, the query is very simple and for this example I just had the name of the table there, then we know the table is not empty for that period. We can do other approach like summing then all together and validate if the results is = 3 for example or any other logic we need can be implemented on top of this query.

I hope this is helpful for you guys and see you in the next post.

1 Comment »

Oracle SQL for EPM Tips and Tricks S01EP09

Posted in ACE, Hyperion Planning, Oracle, Oracle Database, Performance, Tips and Tricks with tags Data Warehouse, Hyperion Planning, Oracle, oracle Database, PL/SQL, Planning, Planning Repository on February 19, 2020 by RZGiampaoli

Hey Guys how are you?

Today it’ll be a quick tip for you guys that like/need to query the Planning repository.

The Planning repository stores both the Plan Type and the Consolidation in a very particular way, in fact this is true for a lot of other things like security, form properties etc… but I’ll focus these 2, that are the more often used and the solution is the same for all of them anyways.

If we take a look in the HSP_PLAN_TYPE table we’ll have something similar to this (depending in how many plan types you have in your app).

As we can see Planning stores in this table all the plan types that were created when we setup the application. In my case I have 4 plan types and we can have up to 5 BSO plan types in a Planning app. Now, if we join the HSP_OBJECT and the HSP_MEMBER filtering the OBJECT_TYPE = 2 we can take a look in all the dimensions we set in the repository.

The USED_IN columns is the column that says to planning in witch plan type that member will exists. The interesting thing here is that, you don’t see the PLAN_TYPE ID that you supposed to right? And that is because a member can exists in more than 1 plan type right, and if we use the PLAN_TYPE ID straight, we would need one row for each plan type right?

Instead, we have just one row but we also have the ability to tell Planning where that member should exists, and we can do that by summing the PLAN_TYPE ID’s together. In the example above, the Account dimension exists in all 4 plan types (1+2+4+8 = 15). Now the Products dimension exists only in one plan type (1), and by the number you can say that is the Pnl Plan type.

you seen the idea here is to check if a PLAN_TYPE ID exists inside that number we have here in the USED_IN column. Another example is the Employee dimension that has the USED_IN set as 8. The only number that will fit in here is the 8 itself (1+2+4 = 7, 1+8 = 9…) meaning the Plan type is WrkForce.

I think the most used way for us to figure out if a number exists inside another number is to use MOD.

CASE WHEN MOD(USED_IN,2)>=1 THEN 1 ELSE 0 END PT_1
CASE WHEN MOD(USED_IN,4)>=2 THEN 1 ELSE 0 END PT_2
CASE WHEN MOD(USED_IN,8)>=4 THEN 1 ELSE 0 END PT_3
CASE WHEN MOD(USED_IN,16)>=8 THEN 1 ELSE 0 END PT_4
CASE WHEN MOD(USED_IN,32)>=16 THEN 1 ELSE 0 END PT_5

The Oracle MOD(N,M) is used to return the remainder of a dividend divided by a divisor where:

N	Dividend.
M	Divisor.

**Pictorial Presentation of MOD() function**

Then in our case, we need to test if the USED_IN number contains the PLAN_TYPE ID on it, and for that we need to MOD it by rolling sum of the plan types + 1. To make it easier I’ll put that in numbers:

N = USED_IN = 31 (max number possible)
M = PLAN_TYPE ID = 1 (Pnl) what I want to test) + 1 = 2
MODE (31, 2) = 1
31/2 = 15 Reminder = 1
MOD = 1

What that is telling us is that if the MOD is = 1, the Plan type 1 exists in that number. I run a simulation just to show us when the Plan Type 1 does not exists in the USED_IN:

As we can see, the Plan Type 1 only exists in the odd possible results (as expected) what means in any other possible combination of the other 4 plan types he doesn’t exists (2, 4, 2+4=6, 8, 8+2=10, 8+4=12, 8+2+4=14, 16, 16+2=18, 16+4=20, 16+2+4=22, 16+8=24, 16+8+2=26, 16+8+4=28, 16+8+2+4=30).

The same is true for the other Plan types, you can try then out using the MOD. Now, this work well but there’s a way easier and clean way to do exactly the same thing using the function BITAND.

The BITAND function treats its inputs and its output as vectors of bits, the output is the bitwise AND of the inputs. Basically it performs below steps.

Converts the inputs into binary.
Performs a standard bitwise AND operation on these two strings.
Converts the binary result back into decimal and returns the value.

Ok, it looks more complicated now, but the good news is that to use is simpler than it sounds like. The main difference between this function and MOD is that MOD returns a boolean, BITAND return the value you asked if it’s true. Expanding my previous test using BITAND:

As you can see, with BITAND returning the number you asked for instead of 0 or 1 make it possible for us to Join the HSP_PLAN_TYPE with HSP_MEMBER using the USED_IN and the PLAN_TYPE in the BITAND Function as a Join:

As you can see, this is a far better way to split the members by Plan Type. And now we can see that the Dimension Products only exists in the Plan Type Pnl and that Entity exists in 4 different plan types. We don’t need to worry about any mathematics formula to create all our MODs, we just need to Join our Plan Type table with the BITAND of USED_IN by PLAN_TYPE.

The Consolidation is another place where you can use the exactly same thing. Instead of using something like this:

DECODE(MOD(CONSOL_OP,8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON1
DECODE(MOD(TRUNC(CONSOL_OP/8),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON2
DECODE(MOD(TRUNC(CONSOL_OP/64),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON3
DECODE(MOD(TRUNC(CONSOL_OP/512),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON4
DECODE(MOD(TRUNC(CONSOL_OP/4096),8),’0′,’+’,’1′,’-‘,’2′,’*’,’3′,’/’,’4′,’%’,’5′,’~’,’6′,’Never’,CONSOL_OP) CON5

You can create a parameter table with the values for each kind of consolidation and use BITAND with that info:

I hope you this will be useful for you guys and see you soon.

Leave a comment »

Fragmented and Aggregated tables in OBIEE using ODI Part 2/5: Managing the partitions of Fragmented/Aggregated tables

Posted in OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags OBIEE, ODI, ODI Architecture, Tips and Tricks on February 10, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE and today we are talking about how to Populating the Fragmented tables using ODI.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

In my previous post we design 18 tables and its partitions and sub-partitions. Now what we need to do is to make ODI manage these partitions for us. As I said before, the Sub partitions will be static and if we need to create one more sub-partitions, we’ll be doing that manually. That’s always the best choice because manage sub-partitions add a layer of complicity that is not needed.

With that said, if you need to manage the sub-partitions, you can use the exactly same approach I’ll show here just adapting to the needs of the sub-partition’s creation and sub-partition template update.

Let’s start analyzing what needs to be done. For each table we need to issue this command to the database:

For List partitions:

ALTER TABLE table_name ADD PARTITION partition_name VALUES (value[, value]…);

For Range Partitions:

ALTER TABLE table_name ADD PARTITION partition_name VALUES LESS THAN (value[, value]…);

Sounds hard but in fact is very simple and it’ll going to work in any SQL you have. To do that this we will use a procedure with our ALTER TABLE statement in the Command on target tab and a SELECT in the command on source to pass the information we need to create the partitions like the table names, the partition name and the partition values.

For each row that the SELECT on source returns, the statement in the Command on target will be executed, and we can pass the values from the SELECT in the Command on Source to change the behavior of the statement in the Command on target.

This is something very important that we need to understand about ODI. ODI It’s not an ETL tool but a code generator tool that orchestrate Database and a lot of other tools if you will.

What I mean by that is, until ODI send the code for the database to execute, everything that ODI does is to generate code. If we use the ODI API it’ll replace it before send to the database as well to replaces the variable with their values, meaning a variable like this #SQL on the Command on Target tab could contain an entire dynamically generate INSERT that ODI would replace before send it to the Database.

And this is why ODI is so powerful, because we can manipulate string and make everything dynamic.

Then our query in the source should return something like this:

The query will change depending of your design of the application then I don’t think sharing my query will going to help. One tip I can give you is always compare the name of the partition with the content of the column PARTITION_NAME on ALL_TAB_PARTITIONS view to see if the partition you want to create already exists to prevent an error in ODI, in fact you have 2 choices, either do this or flag the step as Ignore Error to end the step with a warning. I prefer to compare and end the step without warnings, then I compare.

Another thing to notice is that for FY20 Feb, my PARTITION_VALUE (MAX_PERIOD) is the 1 day after the last day of the month. It’s hard to see here because the calendar used by the client is all different but if you have the normal calendar and the mask you have for you PERIOD_ID is YYYYMMDD, for 2020/Feb you partition ID needs to be 20200230 (Feb ends 02/29/2020).

That’s because this is a RANGE partition, and for RANGE partition we define the partition value as VALUES LESS THAN, what means that everything under 20200230 will be FEB unless we have a smaller partition (JAN) below it. But for example, if you smallest partition is 2018/Jan, everything below that period will be inserted in the 2018/Jan partition, then you need be careful with when you’ll start to create partitions.

Another important thing to mention is that you cannot insert a partition smaller than an existing partition, just bigger. That means, if you start with 2018/Jan, it’s impossible to create a 2017 partition, the only way is to drop the table and create the table starting by 2017. A RANGE partition needs to be sequential and always growing.

Having these values returning from the SELECT on the Command on source tab, the only thing that is left is to use these variables in the statement in the command on Target tab like this:

ALTER TABLE #TABLE_NM ADD PARTITION PARTITION_NM VALUES LESS THAN (#MAX_PERIOD);

Pay attention on the values of the partitions. In this case the values are numeric, that is why I’m not using quotes. If the values were String, we need to enclosure the variable with quotes.

And that’s it, this is all we need to do to manage partitions using ODI. Because the SELECT on the source returns All tables with their partitions name and values and we pass to the TARGET statement this information, for each row in the source the target will be looped, creating the partitions you need for each table.

We are done for the fragmented tables. Now we need to manage the partitions for the Aggregated table. The approach is the exactly the same, the only thing that will change is the content of the query from the Command on Source tab and that the PARTITION now will be a LIST Partition instead of a RANGE partition.

We could have done just one query to return all values including if that was a RANGE or a LIST partition, but for the sake of simplicity is always better to split the scenarios to not over complicate something simple.

For the month level we need the query in the Source to return: