Archive for May, 2019

ORACLE SQL for EPM tips and tricks S01EP06!

Posted in Oracle, Tips and Tricks with tags , on May 6, 2019 by Rodrigo Radtke de Souza

Hi all,

Today’s post is about two cool Oracle analytics functions that are powerful and awesome, but few people use them, which is LEAD and LAG. LEAD function lets you query more than one row in a table at a time without having to join the table to itself. It returns values from the next row in the table. LAG does the same thing but returning the previous row. It may sound weird when you just read its descriptions, so let’s get some real examples.

Imagine that we have the following data:

1

I need to do a query that shows the percentage that DATA column increased over the periods in a single row. For example, in PERIOD 2 I need to show one row with the previous and current period values and how much it increased over the period. I see in a lot of places people just querying the same table twice, joining by its key columns (in this case ACCOUNT and PERIOD) and then doing the Percentage calculation. However, we don’t need to go over all this trouble, since it is very easy to accomplish the same result using LAG function as showed below:

SELECT

ACCOUNT,


PERIOD,


LAG (DATA,1) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) as PREVIOUS_DATA,


DATA as PERIOD_DATA,


ROUND(DATA/LAG (DATA,1) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) *100,2) PERCENTAGE


FROM T$_LEAD_LAG

2

LAG and LEAD syntax are basically the same:

LAG/LEAD ( expression [, offset [, default] ] )

OVER ( [ query_partition_clause ] order_by_clause )

In our example, I’m querying the table only once and I’m “LAGing” for 1 previous row, partitioned by ACCOUNT and ordering by PERIOD. So, for each distinct account value, Oracle will order the rows by period and we will access its values as a normal column. We may do this as many times as we want, for example if we want a two-month comparison:

SELECT

ACCOUNT,


PERIOD,


DATA as PERIOD_DATA,


LAG (DATA,1) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) as PREVIOUS_PERIOD_DATA,


LAG (DATA,2) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) as PREVIOUS_TWO_PERIODS_DATA,


ROUND((DATA - LAG (DATA,1) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) )/ LAG (DATA,1) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) *100,2) PERCENTAGE_PREVIOUS_PERIOD,


ROUND((DATA - LAG (DATA,2) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) )/ LAG (DATA,2) OVER (PARTITION BY ACCOUNT ORDER BY PERIOD) *100,2) PERCENTAGE_PREVIOUS_TWO_PERIODS


FROM T$_LEAD_LAG

3

Another example using LEAD can be used to check data accuracy between “linked” rows, often seen in tables that contains SCD (Slowly Changing Dimension) behavior. Let’s get the below example:

4

In a SCD table, we have the effective start and end date for each one of the records that belongs to the same key. These dates creates a “link” between the records, where one effective date starts where another effective date ends. The above picture is an example where all records looks good, having each effective date ending and starting correctly. Now see example below:

5

We can see that there is a one-day gap between 10/08 and 11/08, which may cause a lot of trouble if the application tries to see which record was effective right between those two days (it would return null).

In order to search for those kinds of gaps between the records, we may write a simple and elegant LEAD statement that will search for all records that has a “gap” between them. The statement would look like this:

WITH ALL_ AS (

SELECT RECORD_KEY


, TO_DATE(EFF_START_DATE,'mm/dd/yyyy hh24:mi:ss') EFF_START_DATE


, TO_DATE(EFF_END_DATE,'mm/dd/yyyy hh24:mi:ss') EFF_END_DATE


, CURRENT_FLAG


, LEAD (TO_DATE(EFF_START_DATE,'mm/dd/yyyy hh24:mi:ss'),1) OVER (PARTITION BY RECORD_KEY ORDER BY TO_DATE(EFF_START_DATE,'mm/dd/yyyy hh24:mi:ss')) AS NEXT_START


FROM T$_POST
)

SELECT RECORD_KEY


, TO_CHAR(EFF_START_DATE,'mm/dd/yyyy hh24:mi:ss') EFF_START_DATE


, TO_CHAR(EFF_END_DATE,'mm/dd/yyyy hh24:mi:ss') EFF_END_DATE


, TO_CHAR(NEXT_START,'mm/dd/yyyy hh24:mi:ss') NEXT_START


FROM ALL_ WHERE (NEXT_START - EFF_END_DATE)*24*60*60 > 0

6

The result will show which record has a “gap” between its effective end date and the next effective start date. In this case I had to create the SQL using a WITH clause, because we cannot use “window” functions directly into the where clause. If we try to do it, we will get an ORA-30483 error:

7

Thanks all! I hope you have liked it! See you soon!

Advertisement

ORACLE SQL for EPM tips and tricks S01EP05!

Posted in ACE, Connect By, DEVEPM, Dimensions, EPM, ETL, Oracle, Oracle 11.2.0, Oracle 11.2.0.4, Oracle Database, Query, REGEXP, SQL, Tips and Tricks with tags , , , , on May 3, 2019 by RZGiampaoli

Continuing the Oracle SQL for EPM series today we’ll see another usage of Connect by. I’m talking a lot about connect by because is a very useful function and we should use it a lot in the analytic space.

Let’s imagine that we are working with a planning application and we want to figure out in a query what is the region of each cost center and to what top level that cost centers belong. In this app we have the Entity dimension with Support Geography as a attribute.

The support geography has the region on the generation 2 of the hierarchy. How can we do that. Well, connect by to the help.

First of all we need to get the entire physical geography hierarchy. To do that we’ll do a query like this:

This is a simple query that joins the HSP_ATTRIBUTE_MEMBER table (this table contains all the attributes from all attribute dimensions) that’s why we have that sub-query to select just the Support Geography members (1).

Then we join this table with the HSP_OBJECT that contains all metadata from everything in Planning: Forms, Dimensions, Flows, everything. We do that to get the name of the members. And finally we do a connect by to rebuild the hierarchy (2).

With the hierarchy re-built we can use the Connect by root to figure out to witch top level each member belongs. (3)

With this query we know what’s the leaf member of the Support Geography dimensions (ID) and to what Region that member belong. Now we need to do the same for the Entity dimension.

With this query we are filtering the OBJECT_TYPE=33, that means Entity (1) (If you want to know what are the possible object types you can query the HSP_OBJECT_TYPE table to check that out), and then we do a Connect By to re-built the hierarchy and then use some Connect by Root to get the parents as well the Sys connect By Path to get the Path of the hierarchy.

Now we just need to join everything under the same query to get all the Cost Centers and to witch top level each one belongs and to witch region they are part of as well.

Another thing that I like a lot about the WITH clause is that is very easy to create nested queries. In this case I created 2 different queries, each one with a WITH and a select based in that WITH. Now everything I need to do is put everything under the same with by:

We can see that now I just got the Select that was under the WITH clause and just created a nested WITH by just creating the step 1 and 2 and now I have instead of 2 WITH queries I have just 1 WITH with 4 Clauses under it being 2 of then a nested one based in the previous one (Connect by from the filtered query).

All we need to do is putting everything together by joining the ENT_HIER and the SG_HIER using the HSP_MEMBER_TO_ATTRIBUTE table. This table basically is the map between the ENTITY dimension and the Attribute dimensions, in this case Support Geography.

The results is the Cost Center, the Path of that member in the hierarchy and with this we can use the REGEXP to extract any level of the hierarchy, and finally the region that cost center belong.

You may thing why we need to do that. well, this is a query I used to join with the HSP_GROUP table to get the groups and the members from each region and then create the security dynamically for each user. That means, I have a Planning application with Entities that has cost centers from different regions and I’m using the Attribute dimension to generate the security.

If a User has access only to AMER data, it’ll only see the AMER cost centers. This is just one example of what we can achieve using Connect by.

I hope you guys enjoy. Next time we’ll talk about another very very useful function that I really love it.

See you soon guys.