Archive for Connect By

ORACLE SQL for EPM tips and tricks S01EP04!

Posted in ACE, Connect By, Oracle, Oracle Database, Performance, REGEXP, SQL, Tips and Tricks on April 9, 2019 by RZGiampaoli

Continuing the Oracle SQL for EPM series, today we’ll extend the “Connect by” post a little to see some neat things we can do when we mix regular expressions with Connect By.

First of all, let’s start by explaining the basics of this approach. We can use Connect By and Level to generate rows in Oracle.
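As a minimal sketch of the row-generation idea (the alias `n` is just an illustrative name), a query against DUAL can produce as many rows as we ask for:

```sql
-- DUAL has a single row. CONNECT BY LEVEL <= 10 keeps connecting that row
-- to itself until LEVEL reaches 10, so we get ten rows numbered 1 to 10.
SELECT LEVEL AS n
FROM   dual
CONNECT BY LEVEL <= 10;
```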

This is very useful when you need to generate 2 rows for each row in a query: you can do a Connect By Level <= 2 and use it as a query without a join condition, duplicating everything in the source query. Then you can use a Decode or Case to say: Case when level = 1 then x when level = 2 then y…
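A small sketch of that duplication trick, assuming a made-up source query `src` with hypothetical `debit` and `credit` columns that we want to turn into two rows per account:

```sql
WITH src AS (
  -- Hypothetical sample data standing in for the real source query.
  SELECT 'Cash' AS account, 100 AS debit, 40 AS credit FROM dual UNION ALL
  SELECT 'Bank' AS account, 250 AS debit, 75 AS credit FROM dual
),
dup AS (
  -- Two generated rows: lvl = 1 and lvl = 2.
  SELECT LEVEL AS lvl FROM dual CONNECT BY LEVEL <= 2
)
SELECT s.account,
       CASE d.lvl WHEN 1 THEN 'DEBIT' WHEN 2 THEN 'CREDIT' END AS measure,
       CASE d.lvl WHEN 1 THEN s.debit WHEN 2 THEN s.credit END AS amount
FROM   src s
CROSS JOIN dup d          -- no join condition: every source row is duplicated
ORDER BY s.account, d.lvl;
```

Each source row comes back twice, and the Case picks a different column for each copy.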

But for us this is just the basic concept that we need to understand for our trick. Let’s get back to our metadata table, the one with the parent-child relationship and with all metadata from all apps and all dims that we have. Let’s say we need to create a hierarchical table (one column per generation) based on a parent-child table. Some people do one union for each level in the source table (too much work), others filter all over the place. I don’t; I just use Connect By with regular expressions. Let’s take a look.

As we can see, we have 17 levels, or generations, for the account dimension. Quite big. First of all, let’s use SYS_CONNECT_BY_PATH to flatten the entire hierarchy into one row and remove the parent-child relationship.
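The flattening step could look roughly like the sketch below. The inline `metadata` data and the column names `parent` and `member` are assumptions made for illustration; the real table is much bigger and also carries the application and hierarchy columns.

```sql
WITH metadata AS (
  -- Hypothetical parent-child sample; the root has no parent.
  SELECT CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'BS',     'Assets'      FROM dual UNION ALL
  SELECT 'Assets', 'Cash'        FROM dual UNION ALL
  SELECT 'Assets', 'Receivables' FROM dual UNION ALL
  SELECT 'BS',     'Liabilities' FROM dual
)
SELECT LEVEL                            AS gen,
       member,
       SYS_CONNECT_BY_PATH(member, '|') AS path   -- e.g. |BS|Assets|Cash
FROM   metadata
START WITH parent IS NULL               -- begin at the root member
CONNECT BY PRIOR member = parent;       -- walk from each parent to its children
```

The separator passed to SYS_CONNECT_BY_PATH must not appear inside any member name, otherwise Oracle raises an error.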

Now, as we can see, if we just do a Connect By Prior with SYS_CONNECT_BY_PATH, the query will generate 1 row for each member at every level of the hierarchy. We don’t want that, unless you want the ability to input data at the upper levels of the hierarchy, which is usually not the case. What we want is the entire path for each leaf member, so we need to filter just the leaves in our query. In my case my table already has a column that says if a member has children or not, but I’ll proceed as if I didn’t have it.

If we don’t have the leaf information we can generate it by using CONNECT_BY_ISLEAF, which returns 0 if the member is a parent or 1 if it is a leaf. Again, we are filtering this outside the Connect By Prior (in the Where clause), and now we have just the leaves and the entire hierarchy above each of them. The next step is to split this path back into columns.
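A sketch of the leaf filter, again on hypothetical inline data (`metadata`, `parent` and `member` are illustrative names, not the real table):

```sql
WITH metadata AS (
  SELECT CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'BS',     'Assets'      FROM dual UNION ALL
  SELECT 'Assets', 'Cash'        FROM dual UNION ALL
  SELECT 'Assets', 'Receivables' FROM dual UNION ALL
  SELECT 'BS',     'Liabilities' FROM dual
)
SELECT member,
       SYS_CONNECT_BY_PATH(member, '|') AS path
FROM   metadata
WHERE  CONNECT_BY_ISLEAF = 1            -- applied after the hierarchy is built
START WITH parent IS NULL
CONNECT BY PRIOR member = parent;
```

With this sample only Cash, Receivables and Liabilities come back, each with its full path from the root.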

REGEXP_SUBSTR is the perfect tool to split the string we have into new columns. As you can see, we can use the following expression: REGEXP_SUBSTR(Column, ‘[^|]+’, 1, 1). Basically, the character class in [] says what to match: [^|]+ matches a run of one or more characters that are not a |, which is exactly one member of the path. The third argument (1) is the position where the search starts and the fourth argument (1) is the occurrence to return, so 1, 1 returns the first member. Then I have another REGEXP_SUBSTR where I changed the 1, 1 to 1, 2 to get the second member, and I keep doing this until I get to the 17th generation (as we saw in the beginning).

Basically, if you have 17 generations you need 17 columns in the table and 17 REGEXP_SUBSTR calls, one for each column; you just need to change the 1, 1 to 1, 2, then 1, 3… until 1, 17.
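Putting the pieces together, a reduced sketch with three generations instead of 17 might look like this. The inline data and the names `metadata`, `leaves`, `gen1` to `gen3` are assumptions for illustration only.

```sql
WITH metadata AS (
  SELECT CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'BS',     'Assets'      FROM dual UNION ALL
  SELECT 'Assets', 'Cash'        FROM dual UNION ALL
  SELECT 'Assets', 'Receivables' FROM dual UNION ALL
  SELECT 'BS',     'Liabilities' FROM dual
),
leaves AS (
  -- One row per leaf, carrying the whole path from the root.
  SELECT member,
         SYS_CONNECT_BY_PATH(member, '|') AS path
  FROM   metadata
  WHERE  CONNECT_BY_ISLEAF = 1
  START WITH parent IS NULL
  CONNECT BY PRIOR member = parent
)
SELECT member,
       REGEXP_SUBSTR(path, '[^|]+', 1, 1) AS gen1,  -- 1st run of non-| characters
       REGEXP_SUBSTR(path, '[^|]+', 1, 2) AS gen2,  -- 2nd occurrence
       REGEXP_SUBSTR(path, '[^|]+', 1, 3) AS gen3   -- 3rd occurrence
FROM   leaves;
```

Branches that are shorter than the deepest one simply get NULL in the generations they do not reach, because REGEXP_SUBSTR returns NULL when the requested occurrence does not exist.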

Here’s where ODI comes in handy, because if you use ODI you just need to do a select in the source to figure out how many generations you will have, then generate the 17 REGEXP_SUBSTR calls dynamically and pass them to the target. It’s very easy since the code itself is always the same and the only thing that changes is the second number of the pair (the occurrence), and it’s sequential.
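To illustrate how mechanical that generation is, here is a sketch in plain SQL (this is not ODI code, and the `path` and `genN` names are assumptions) that builds the text of the column list. The 17 is hard-coded here; in practice it would come from a query that returns the deepest level of the hierarchy.

```sql
-- Build the text of the 17 REGEXP_SUBSTR columns, one per line.
SELECT LISTAGG(
         'REGEXP_SUBSTR(path, ''[^|]+'', 1, ' || n || ') AS gen' || n,
         ',' || CHR(10)
       ) WITHIN GROUP (ORDER BY n) AS column_list
FROM  (SELECT LEVEL AS n FROM dual CONNECT BY LEVEL <= 17);
```

The result is a single string that can be pasted, or injected by whatever tool you use, into the select list of the final query.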

I hope this can be as useful for you as it is for me. See you soon.

ORACLE SQL for EPM tips and tricks S01EP02!

Posted in ACE, Connect By, EPM, Oracle Database, Performance, Query, SQL, Tips and Tricks, WITH Clause on March 21, 2019 by RZGiampaoli

Hey guys, how are you? Let’s continue the SQL for EPM series. Today I’ll continue to talk about With, with a small bonus of Connect By :). Let’s start.

A lot of people use Connect By on a daily basis but, as far as I have seen, most of them don’t know how to use it properly. I’ve already lost count of the number of people complaining about performance issues with Connect By.

The thing is, Connect By works a little differently from everything else in Oracle. We can say that Connect By has 2 stages, and we’ll see why I’m saying that with this example. Let’s get back to our metadata table and do a Connect By to extract the Balance Sheet hierarchy from the Juno application.

As we can see, inside this table we have more than one application and more than one hierarchy for each application. That’s ok, we just need to filter it in our SQL, right?

If we filter the APP_NAME and the HIER_NAME we’ll get all accounts for that application, and this will generate 12,622 rows. By the way, this table has all metadata from all our applications and we always filter by APP_NAME and HIER_NAME to select what we want (the table is also partitioned and sub-partitioned by these 2 columns). It’s important to know that without filtering anything this table has 2,260,372 rows.

Ok, now, if we want to get just the BS hierarchy we just need to do the Connect By, right?
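That first attempt, with the filters in the Where clause, can be sketched as follows. APP_NAME and HIER_NAME come from the table described above; the inline sample data, the `parent` and `member` columns, the second application MARS and the hierarchy name ACCOUNT are assumptions made so the example can run on its own.

```sql
WITH metadata AS (
  -- Hypothetical stand-in for the real metadata table: two applications
  -- that share the same member names.
  SELECT 'JUNO' AS app_name, 'ACCOUNT' AS hier_name,
         CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Assets' FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'Assets', 'Cash'   FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', NULL,     'BS'     FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', 'BS',     'Assets' FROM dual
)
SELECT LEVEL AS gen, parent, member
FROM   metadata
WHERE  app_name  = 'JUNO'               -- evaluated only after the hierarchy
AND    hier_name = 'ACCOUNT'            -- has been built over ALL rows
START WITH member = 'BS'
CONNECT BY PRIOR member = parent;
```

Even on this tiny sample the JUNO members come back more than once, because the walk also starts from, and passes through, the rows of the other application.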

That works… perfect… or not? Well, in fact, this is the wrong way to use Connect By because of what I said before: the 2 stages.

As you can see, this query took 25 seconds just to return the first 50 rows. In an integration this will take way more time; in fact, if you join this table to a data table to do a SUM at the BS level, it will take ages to return.

The reason is that for a Connect By, Oracle first does everything that comes after the words Connect By and Start With and then, and only then, applies what is in the Where condition. That means it first did the Connect By on all 2,260,372 rows (with the same members repeated over and over) and only after all that processing did it filter what we wanted, the APP_NAME and the HIER_NAME. So the right way to use it is to put those filters inside the Start With and Connect By clauses.
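One plausible shape of the corrected query, on the same kind of hypothetical sample data (only APP_NAME and HIER_NAME are real column names; everything else is assumed for illustration):

```sql
WITH metadata AS (
  SELECT 'JUNO' AS app_name, 'ACCOUNT' AS hier_name,
         CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Assets' FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'Assets', 'Cash'   FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', NULL,     'BS'     FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', 'BS',     'Assets' FROM dual
)
SELECT LEVEL AS gen, parent, member
FROM   metadata
START WITH member    = 'BS'
       AND app_name  = 'JUNO'           -- pick only the root we care about
       AND hier_name = 'ACCOUNT'
CONNECT BY PRIOR member = parent
       AND app_name  = 'JUNO'           -- and only follow children that belong
       AND hier_name = 'ACCOUNT';       -- to the same application and hierarchy
```

Here the filters take part in the walk itself, so rows from other applications are never connected in the first place.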

Now it looks way better: 0.375 seconds to do exactly the same thing as before, and the only thing I did was move our filters to the right place. Now Oracle is filtering and doing the Connect By at the same time.

Now, if you use SYS_CONNECT_BY_PATH and want to get just the leaves (to have the complete path of the hierarchy), you can filter the leaves in the Where clause (and that filter needs to be there, otherwise the Connect By will not have the entire hierarchy to walk).
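A sketch of that combination, with the application filters inside the hierarchy clauses and the leaf filter in the Where clause (sample data and the `parent`, `member`, `root_member` and `path` names are assumptions):

```sql
WITH metadata AS (
  SELECT 'JUNO' AS app_name, 'ACCOUNT' AS hier_name,
         CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Assets'      FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'Assets', 'Cash'        FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Liabilities' FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', NULL,     'BS'          FROM dual
)
SELECT CONNECT_BY_ROOT member           AS root_member,  -- BS for every row
       SYS_CONNECT_BY_PATH(member, '|') AS path,
       member
FROM   metadata
WHERE  CONNECT_BY_ISLEAF = 1            -- keep only the leaves, after the walk
START WITH member    = 'BS'
       AND app_name  = 'JUNO'
       AND hier_name = 'ACCOUNT'
CONNECT BY PRIOR member = parent
       AND app_name  = 'JUNO'
       AND hier_name = 'ACCOUNT';
```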

Now you see that the Connect By filtered what needed to be filtered during the Connect By execution and afterwards the Where clause kept just the leaves (using CONNECT_BY_ISLEAF, which returns whether a member is a leaf or not).

Also, I used CONNECT_BY_ROOT to return the root member used in this query (BS) and SYS_CONNECT_BY_PATH to generate the entire path of each member (very useful to transform parent/child tables into generation tables using this technique and a regexp [we’ll see this in another post]).

Ok, now that the “bonus” is written, let’s talk about With, which was the main subject here. Even with this Connect By written the right way, with the filters in the right place, we can still improve the performance using With.

That’s right: the idea is to prepare our subset of data using With before we ask Oracle to do the Connect By, and leave the Connect By as simple as possible. Let’s take a look.
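The With version could be sketched like this. The first block, `metadata`, is only a hypothetical stand-in for the real table; `bs_subset` is an illustrative name for the pre-filtered data, and the `parent` and `member` columns are assumed.

```sql
WITH metadata AS (
  SELECT 'JUNO' AS app_name, 'ACCOUNT' AS hier_name,
         CAST(NULL AS VARCHAR2(30)) AS parent, 'BS' AS member FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Assets'      FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'Assets', 'Cash'        FROM dual UNION ALL
  SELECT 'JUNO', 'ACCOUNT', 'BS',     'Liabilities' FROM dual UNION ALL
  SELECT 'MARS', 'ACCOUNT', NULL,     'BS'          FROM dual
),
bs_subset AS (
  -- Step 1: cut the data down to one application and one hierarchy.
  SELECT parent, member
  FROM   metadata
  WHERE  app_name  = 'JUNO'
  AND    hier_name = 'ACCOUNT'
)
-- Step 2: the Connect By only sees the small subset, so it stays trivial.
SELECT CONNECT_BY_ROOT member           AS root_member,
       SYS_CONNECT_BY_PATH(member, '|') AS path,
       member
FROM   bs_subset
WHERE  CONNECT_BY_ISLEAF = 1
START WITH member = 'BS'
CONNECT BY PRIOR member = parent;
```

The design choice is simply to separate the two jobs: the With block does the filtering, and the hierarchical query does nothing but walk parent to child.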

This is by far the best way to use a Connect By clause. Instead of With you can use a sub-query, but I think this way is easier and more organised as well. Also, I know the time difference between the previous example and this one doesn’t look too big, but when you join this with data and start to SUM everything, you’ll see a huge difference between this method and the previous one.

Also, sometimes Oracle gets lost with the previous method, making everything slower, but with the With method that never happens, so I advise you to start using it.

I hope you guys enjoyed this little tip. See you next time.