Archive for the ODI 12c Category

Oracle SQL for EPM Tips and Tricks S01EP12

Posted in ACE, Data Extract, Hacking, Hyperion Planning, ODI, ODI 10g, ODI 11g, ODI 12c, Oracle, Oracle 11.2.0, Oracle 11.2.0.4, Oracle Database, SQL with tags , , , , , on March 31, 2020 by RZGiampaoli

Hey guys how are you keeping? I hope everybody is healthy and keep this way in this difficult times.

And to make our life less complicate, here’s another tip. Let’s talk about how to concatenate stuff in Oracle.

Imagine a simple case, we want to query the Planning repository to get the list of UDA’s a member have. We can easily do that by query the HSP_OBJECT, HSP_MEMBER_TO_UDA and HSP_UDA tables.

I’m filtering just 3 products to make it easier for us to see. The results shows that each project has a different number of UDA’s, and we never know how many it’ll be, then the easiest way to concatenate them is to use the command LISTAGG (or WM_CONCAT if you are in a DB version prior to 11.1).

The command is very simple LISTAGG(Column, Separator) WITHIN GROUP (ORDER BY column). As we can see the command allow us to select the separator we want (can be comma or any string really) as well to order the results by another column). Let’s take a look in the example above.

As you can see, it easily create a list split my comma (as specified) for me, and the nice thing about it is that I don’t need to do any string treatment if return null or if I have just one string on it and things like that.

This is an extremely good Function and we heavily use it in ODI to generate dynamic code because its simplicity, for example, we can generate a SQL statement on the fly using the command on source and command on target:

With this results we can easily pass this info to the command on target to generate a dynamic query where ODI will replace the columns we got in the target as well the table name and will also loop for each row we have in the source. This is very handy.

And for the ones that are not in the ORACLE 11.2 and ahead, we can still do that using WM_CONCAT. Is not as powerful as LISTAGG, but works pretty well. Let’s try the first example again:

I cannot show you the results since WM_CONCAT was decommissioned in the 12c (my version), but it’ll work like this. We don’t have the option to choose the separator and to make the string unique and to order by it we need to add DISTINCT in the command WM_CONCAT(DISTINCT column).

I hope that is useful and have a great day.

Oracle SQL for EPM Tips and Tricks S01EP11

Posted in ACE, Data Warehouse, Hacking, ODI, ODI 10g, ODI 11g, ODI 12c, Oracle, Oracle Database with tags , , , on March 25, 2020 by RZGiampaoli

Hey guys how are you?

Today I’ll post something that is very simple but very useful specially when working with ODI.

When we work with partitioned table we know that if we filter that table by the partitioned column Oracle will use that partition as source of data. But what if we are doing an Insert, Update or Merge?

There’s another way to explicit refer to a partition and make sure Oracle will be working inside that one and is by defining it in the From clause.

For example if I want to query the Partition “DELL_BALANCES_FY20_FEB” I can query:

As we can see, after the table name I specified the PARTITION (DELL_BALANCES_FY20_FEB) and put inside the parentheses the partition name (don’t specify as string) and that makes oracle distinct all the rows in that partition, and my Distinct of the PARTITION_KEY shows only one results as expected. (this command needs to come before the table alias).

If we are doing an Insert, Update or Merge the idea is the same:

This way we can, specially in the MERGE, make sure Oracle will be working in the right partition in the target table.

And it’s specially useful with ODI because we always know the partition we want to query or insert data when we use ODI, then we can always bind Oracle to a specific partition and make sure he’ll stay there.

I hope this is help full and see you soon.

DEVEPM in Kscope20!

Posted in Essbase, Hacking, Kscope 20, ODI 12c, ODI Architecture, Tips and Tricks with tags , , , , on March 2, 2020 by RZGiampaoli

We are delighted to tell everybody that we’ll be at KScope 20 in Boston.

We’ll be presenting the session Essbase Statistics DW: How automatically administrate Essbase using ODI on room 304, Level 3 => Wed, Jul 01, 2020 (02:15 PM – 03:15 PM). In this session we’ll talk about how to keep Essbase configuration up to date using ODI. We also will review

To have a fast Essbase cube, we must keep vigilance and follow its growth and its data movements, so we can distribute caches and adjust the database parameters accordingly. But this is a very difficult task to achieve since Essbase statistics are not temporal and only tell you the cube statistics are in that specific time frame.
This session will present how ODI can be used to create a historical DW containing Essbase cube’s information and how to identify trends and patterns, giving us the ability to programmatically tune our Essbase databases automatically.

We hope to see you all there.

Thank you and see you soon!

ODI Hidden Gems – SNP tables: Query to get executed code example

Posted in ODI, ODI 12c, ODI Architecture with tags , , on February 26, 2020 by radk00

Hi all, today’s gem is something extremely useful that I’ve being using in every single project that I work on. Today’s gem is about SNP tables, which are the database tables that are used by ODI to store its metadata. When we install ODI, the installer asks us where we want to create our Master and Work repositories. Each repository contains a set of tables with different kind of information. From Oracle documentation:

  • Master Repository: This is a data structure containing information on the topology of the company’s IT resources, on security and on version management of projects and data models. This repository is stored on a relational database accessible in client/server mode from the different Oracle Data Integrator modules.
  • Work Repository: This is a data structure containing information about data models, projects, and their use. This repository is stored on a relational database accessible in client/server mode from the different Oracle Data Integrator modules.

You may think of them being the place where contains all information about the code that was developed in ODI, all the jobs that were executed, all the source and target tables and so on. These tables may give us answers to questions like: how many mapping objects does project X have? Which are the target mappings for a specific job? How many jobs are executing in a daily basis, how long does each of them take and how much data do they manipulate (insert/delete/update)? All those questions will eventually come to you after some time, and querying the SNP tables will provide you all the answers on what is going on in your ODI projects.

Below is one example of a query that returns a lot of information regarding all the ODI executions that happened in an ODI repository in a give time frame. It gives you the name of the scenarios, versions, when it began and ended, the session status, the order that they happened and (maybe the most important) which code was executed. The last info, together with how much time it took to execute, may be very useful to analyze which are the steps that are taking longer in your environment and then do something about them.

I wont go over each table and what they mean, but you may take a look on “Doc ID 1903225.1 : Oracle Data Integrator 11g and 12c Repository Description” in Oracle support for a full list of tables and their description. In the beginning, the number of tables and attributes may look intimidating, but once you start to use them you will see that the data architechture is fairly simple and you may retrieve a lot of good information out of them.

Without further due, here is the SQL. This one was created over ODI 12.2.1. Please notice that each ODI version may have changes in the repositories tables, which may lead you to modify those queries accordingly.

SELECT
SS.SESS_NO,
SS.SCEN_NAME,
SS.SCEN_VERSION,
SS.SESS_NAME,
SS.PARENT_SESS_NO,
SS.SESS_BEG,
SS.SESS_END,
SS.SESS_STATUS,
DECODE(SS.SESS_STATUS,'D','Done','E','Error','M','Warning','Q','Queued','R','Running','W','Waiting',SS.SESS_STATUS) AS SESS_STATUS_DESC,
SSL.NNO,
SSTL.NB_RUN,
SST.TASK_TYPE,
DECODE(SST.TASK_TYPE,'C','Loading','J','Mapping','S','Procedure','V','Variable',SST.TASK_TYPE) AS TASK_TYPE_DESC,
SST.EXE_CHANNEL,
DECODE(SST.EXE_CHANNEL,'B','Oracle Data Integrator Scripting','C','Oracle Data Integrator Connector','J','JDBC','O','Operating System'
,'Q','Queue','S','Oracle Data Integrator Command','T','Topic','U','XML Topic',SST.EXE_CHANNEL) AS EXE_CHANNEL_DESC,
SSTL.SCEN_TASK_NO,
SST.PAR_SCEN_TASK_NO,
SST.TASK_NAME1,
SST.TASK_NAME2,
SST.TASK_NAME3,
SSTL.TASK_DUR,
SSTL.NB_ROW,
SSTL.NB_INS,
SSTL.NB_UPD,
SSTL.NB_DEL,
SSTL.NB_ERR,
SSS.LSCHEMA_NAME
|| '.'
|| SSS.RES_NAME AS TARGET_TABLE,
CASE
WHEN SST.COL_TECH_INT_NAME IS NOT NULL
AND SST.COL_LSCHEMA_NAME IS NOT NULL THEN SST.COL_TECH_INT_NAME
|| '.'
|| SST.COL_LSCHEMA_NAME
ELSE NULL
END AS TARGET_SCHEMA,
SSTL.DEF_TXT AS TARGET_COMMAND,
CASE
WHEN SST.DEF_TECH_INT_NAME IS NOT NULL
AND SST.DEF_LSCHEMA_NAME IS NOT NULL THEN SST.DEF_TECH_INT_NAME
|| '.'
|| SST.DEF_LSCHEMA_NAME
ELSE NULL
END AS SOURCE_SCHEMA,
SSTL.COL_TXT AS SOURCE_COMMAND
FROM
SNP_SESSION SS
INNER JOIN SNP_STEP_LOG SSL ON SS.SESS_NO = SSL.SESS_NO
INNER JOIN SNP_SESS_TASK_LOG SSTL ON SS.SESS_NO = SSTL.SESS_NO
INNER JOIN SNP_SB_TASK SST ON SSTL.SB_NO = SST.SB_NO
AND SSTL.SCEN_TASK_NO = SST.SCEN_TASK_NO
AND SSL.NNO = SSTL.NNO
AND SSTL.NNO = SST.NNO
AND SSL.NB_RUN = SSTL.NB_RUN
LEFT JOIN SNP_SB_STEP SSS ON SST.SB_NO = SSS.SB_NO
AND SST.NNO = SSS.NNO
WHERE
SS.SESS_BEG >= TRUNC(SYSDATE) - 1
ORDER BY
SESS_NO,
NNO,
SCEN_TASK_NO

See ya!

Fragmented and Aggregated tables in OBIEE using ODI Part 1/5: Creating the tables

Posted in OBIEE, ODI, ODI 12c, Oracle, Oracle 11.2.0, Oracle Database, Performance, SQL, Tips and Tricks with tags , , , , on February 7, 2020 by RZGiampaoli

Hey guys, how are you?

I normally don’t talk about OBIEE, despite the fact I work a lot with it, because there’s a lot of people talking about it. But recently I worked in a project and I decide to use fragmented tables and aggregated tables at same time and I found that we don’t have much specific information regarding this matter, and of course, zero information about using ODI to enable this project.

Then since this will be kind of big, I decide to split this in 5 parts:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

To start the series Fragmented and Aggregated tables in OBIEE, today we are talking about how to Creating the tables for our project.

In this project I had data from 4 different source system, and each source system had different amount of data. Dell data had 20 million rows per month, EMC data 30 million, DTC data 5 million and STAT data another 5 million.

The business wanted to have 15 quarters of data stored, then if we do a simple calculation we’ll going to have: 20.000.000 + 30.000.000 + 5.000.000 + 5.000.000 * 45 = 2,700,000,000 rows.

That’s a lot of data even for a partitioned table. Then I decide to split this tables in some way.

Analyzing the data, I saw that this tables contain 3 different account hierarchies and the users will only analyze one per time: BS, Income and PL2. I decide to split the tables by it. Then just with that now I had 3 fragmented tables. JAT_JE_DETAIL_BS, JAT_JE_DETAIL_INCOME and JAT_JE_DETAIL_PL2.

That’s nice, just with that I split that 2.7 trillion rows in 3. The problem now is that I have 4 different source system right, and even if I partition ate the table by source system, EMC is as big as the other 3 sources put together, and that would make that source slower than the other to retrieve.

With that I decided to split the tables by 2 again, one for EMC and another for the other sources. That makes the data to be distribute evenly. 20.000.000 + 5.000.000 + 5.000.000 in the D table and 30.000.000 in the E table (D and E is how I called them).

With that now we have 6 partitioned tables, right? 3 for EMC data (one for each Account Hierarchy) and another 3 tables for Dell, DTC and STAT data.

Now that we have a nice and even data distribution, we need to decide how we’ll going to create the partition and sub partition for these tables.

The partitions I always like to use the PERIOD_ID because it makes sense in most of cases (unless we are creating a Forecast app, in this case the users will want to see by scenario and not by one period).

Then for these tables JAT_JE_DETAIL_D_BS_FACT, JAT_JE_DETAIL_D_INCOME_FACT, JAT_JE_DETAIL_D_PL2_FACT we are going to partition by period. But what kind of partition we will going to use? Well, for this case, this tables contains Journal information at day level and the users want to drill from the month to the Journal rows of that mouth. What that means is that we’ll always going to work inside a month, then the best way to go is using a RANGE partition.

Range partitions, how the name already says, uses a Range of values to create the partition area. In our case it’ll be one month. The range can be built by a DATE column or a Number column, and the second one is our case since is easier to manipulate the periods using a number than a data.

(The period for this project is not a regular month, for example, the month of February FY20 start in Jan/30 and goes until Feb/2).

For the sub partitions is an easy call, we’ll going to use the Source System since we have 3 different one in this tables right (Dell, DTC and STAT), and the partition type will be LIST since we have a simple list of values. Also, for sub partitions is always easier to have values that don’t increase because the complexity of adding new sub partition values to a table.

What I mean is that to add a new partition you just need to use one ALTER TABLE ADD PARTITION statement and add any number of partitions you need. But for a sub partition is different. Every time you create a sub partition Oracle create a Sub Partition template. That template is used every time you create a Partition.

In our case for example, our Sub Partition Template would be DELL, DTC and STAT. What that means is that when you create another partition, this partition will be created with these 3 sub partitions automatically. Then if you want to add another value to a sub partition you need to do 2 things instead of just one (in partition case).

You need to change the sub partition template definition to add the new sub partition and you need add your new Sub partition in every single already existing Partitions (is not update automatically). After that all new partitions will contains all your sub partitions.

If you don’t update the template, every time you create a partition, you’ll not have the new value you want.

Enough of theory, let’s proceed with the other set of tables, JAT_JE_DETAIL_E_BS_FACT, JAT_JE_DETAIL_E_INCOME_FACT, JAT_JE_DETAIL_E_PL2_FACT. For these tables we’ll continue to have the partitions as a RANGE of the PERIOD_ID but the sub partition doesn’t make sense for us to use source system because we have only one source system.

And now is the time to explain another thing about partitions. For Oracle to use the partition you need to filter the column that is used to create the partition. For example, if we are creating a partition using the column PERIOD_ID, we must filter the data by PERIOD_ID and the ideal would be always have a filter in the Partition and Sub Partition column, because if you don’t, it’ll do a full table scan anyway. Then for a data retrieval point of view, the choice for the partitions are always made based in the Filter requirement, that means, we need to choose columns that we can force the user to choose in OBIEE. For an ETL point of view, the best would be to have partitions and sub partitions in columns that we can use to truncate before load.

For example, in our first set of tables I can easily and fast truncate a period and a source system before I load it back to the tables. That’s the beauty of partitions, I can drop, truncate and create on demand.

That said, you can imagine that some times what is good for the report is not good for the DW, but this is not our case today.

With this new piece of information in mind, the users are force in the OBIEE prompts to choose a Gen 2 Account, then this will be our sub partition for our second set of tables also with the partition type as LIST.

Before we proceed, let’s just see how everything is set then:

Table NamePartitionSub Partition
JAT_JE_DETAIL_D_BS_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_INCOME_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_PL2_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_E_BS_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_INCOME_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_PL2_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)

Perfect, now the only thing that is missing here is the aggregate tables.

Aggregated table are tables that uses a dimension Drill to pre aggregate the vales that it contains. It’s an easy way to improve performance during a drill operation.

The most common aggregated table is for the Period dimension, and this is the one we’ll going to do here as well. Then in our case, as mentioned before, the 6 tables we have now contains day level data right. I want to have the data aggregated by Month and by Quarter.

What that means is that we’ll need to have one more table for each one of these 6 tables for Monthly level data and one more table for each one of these 6 tables for Quarterly level data. These tables will going to have the same Partitions and Sub Partitions definitions with only one difference, now instead of using a Range Partition, we’ll going to use a List Partition because we don’t have a range anymore, we’ll going to aggregate that range of days in one month.

Then let’s see how the tables needs to be set:

Table NamePartitionSub Partition
JAT_JE_DETAIL_D_BS_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_INCOME_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_PL2_FACTPERIOD_ID (Range)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_M_BS_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_M_INCOME_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_M_PL2_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_Q_BS_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_Q_INCOME_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_D_Q_PL2_FACTPERIOD_ID (List)SOURCE_SYSTEM_ID (List)
JAT_JE_DETAIL_E_BS_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_INCOME_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_PL2_FACTPERIOD_ID (Range)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_M_BS_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_M_INCOME_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_M_PL2_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_Q_BS_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_Q_INCOME_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)
JAT_JE_DETAIL_E_Q_PL2_FACTPERIOD_ID (List)ACCOUNT_GEN2_ID (List)

Great, now we have 18 tables, 6 fragmented tables for daily level data, 6 fragmented tables for monthly aggregated level data (M) and 6 fragmented tables for quarterly aggregated level data (Q).

The last thing about tables is that we need a matching Period dimension (for this case) to match the data in each aggregated level. What that means is that, in this case, since we have 3 different level of aggregations, we need to have 3 Period dimensions.

OBIEE don’t allow us to join the same table using columns that are not an ID column. Then what we need is one PERIOD dimension (DIM_PERIOD) that contains the drill from Year to day (Period ID will be YYYYQMMDD number), another PERIOD Month Dimension (DIM_PERIOD_MONTH) that contains Drill from Year to Month (Period ID will be YYYYQMM number) and one last PERIOD Quarter table (DIM_PEIOD_QUARTER) that contains Drill from Year to Quarter (Period ID will be YYYYQ number).

Of course, the Aggregated FACT tables also will going to have a matching ID (and we need that to aggregate the data right).

Last thing to mentioning is that the Detail Fact table (the one in the Day level) it contains a lot of Journal details like Journal number, Description, Header ID and others. For the detail this is ok, but when we design the aggregated tables, only ID are allowed there for maximum aggregation. If the table contains Descriptions, nothing will be going to be aggregated right?

Don’t worry that OBIEE can handle this kind of situation very well.

That’s for now. I hope you guys enjoy it and found helpful.

See you soon.

ODI Hidden Gems – Target Load Order

Posted in Gems, ODI 12c, Tips and Tricks with tags , , on October 14, 2019 by radk00

Hi all!

Today’s gem is related to a very nice surprise that all ODI developers had when they were migrating from ODI 11 to ODI 12, which is the ability to load multiple target tables with the same ODI mapping object. You may have a very large mapping and “split” (with Split object) the result set into multiple targets, or you may just copy the same result to different target tables, mapping different fields in each of the targets. This post is not about what you can do when loading multiple targets (which is kind a lot due to this feature in ODI 12c), but how to control the order which those target tables are loaded.

Let’s start with the following example:

1

If you execute this mapping, this is what it will look like in Operator:

2

You can see that it loaded the tables in an order that was decided by ODI, probably in the same order that you dragged and dropped the models into the mapping. Now imagine that you want to have control over the order and need to load Table B before table A. The classical example where you would use that is when you retrieve a source dataset and you want to derive both the dimension and fact information out of it. In this case, you would want to load the dimension table first and the fact table second.

Luckily, ODI allow us to change the order, but its not too intuitive.  In the Logical tab, you need to click in any area that does not contain an object (any white area will do). This will display the “Target Load Order” option:

8

Click on the “gear” icon in the far right and a menu will popup:

4

Now you may configure it to have any order that you want, let’s say table B, C and then A. If you save and run the mapping now, this is what you get:

5

6

That’s it folks! See ya!

ODI Hidden Gems – Begin/End Mapping Command

Posted in ODI 12c, Tips and Tricks with tags , on October 7, 2019 by radk00

Hi all,

Today’s short post is about a simple, but very powerful feature that often is overlooked: Begin/End Mapping Command. These options are in the Physical tab and, as their name suggests, they may issue any kind of command before a mapping begins and/or after it finishes.

1

Pay close attention to the detail that they may execute ANY command from ANY technology that ODI may handle and that’s why it is so powerful. You may run anything from Oracle DML statements, a piece of Java code, trigger OS commands and so on. This gives you a lot of flexibility.

A very common example that we may use those are to “track” some mapping in a separate log table. Although you have ODI Operator that contains all the log information on it, sometimes we may get a requirement to track all the executions of a particular mapping, so people know for sure when it ran and that the logs will not be purged by accident from the Operator by someone. Let’s see how we may accomplish logging the start and end times of a execution.

Let’s start with “Begin”. First you select which technology and logical schema that command refers to. In this case, we will insert the name of the mapping, the time that it started, and which was the session number that it was assigned to in ODI.

2

Let’s do the same with “End”:

3

Let’s run the mapping. When we go to Operator, we may see that two new tasks were created, one before and another one after the main mapping:

4

We may double click it to see the code that was executed:

5

If we query the LOG_INFO table, we will see two entries, one for begin and another one for end:

6

This was a very short example as you may do way more than that. You may send emails to alert that a critical mapping has completed, you may zip and move a file after it was just loaded by the mapping, you may run an OS bat file that will prepare your enviroment before a data load and so on. These two options are a great alternative for us to get all these “small” codes inside the ODI mapping object itself and rely less on small ODI procedures.

See ya!