Archive for ODI 12c

ODI Hidden Gems – Target Load Order

Posted in Gems, ODI 12c, Tips and Tricks on October 14, 2019 by radk00

Hi all!

Today’s gem is related to a very nice surprise that all ODI developers had when they migrated from ODI 11 to ODI 12: the ability to load multiple target tables with the same ODI mapping object. You may have a very large mapping and “split” (with the Split object) the result set into multiple targets, or you may just copy the same result to different target tables, mapping different fields in each of the targets. This post is not about what you can do when loading multiple targets (which is quite a lot, thanks to this feature in ODI 12c), but about how to control the order in which those target tables are loaded.

Let’s start with the following example:

1

If you execute this mapping, this is what it will look like in Operator:

2

You can see that it loaded the tables in an order that was decided by ODI, probably the same order in which you dragged and dropped the models into the mapping. Now imagine that you want to control the order and need to load table B before table A. The classic example is when you retrieve a source dataset and want to derive both the dimension and the fact information from it. In this case, you would want to load the dimension table first and the fact table second.
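To see why the order can matter, here is a minimal sketch of a dimension/fact pair, assuming the fact table has a foreign key to the dimension. The names DIM_PRODUCT and FACT_SALES are hypothetical and are not the tables used in the mapping above.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE DIM_PRODUCT (
    PRODUCT_ID   NUMBER        PRIMARY KEY,
    PRODUCT_NAME VARCHAR2(100)
);

CREATE TABLE FACT_SALES (
    PRODUCT_ID NUMBER NOT NULL,
    SALE_DATE  DATE   NOT NULL,
    AMOUNT     NUMBER,
    CONSTRAINT FK_SALES_PRODUCT
        FOREIGN KEY (PRODUCT_ID) REFERENCES DIM_PRODUCT (PRODUCT_ID)
);

-- Loading the dimension first lets the fact row find its parent key.
INSERT INTO DIM_PRODUCT (PRODUCT_ID, PRODUCT_NAME) VALUES (1, 'Widget');
INSERT INTO FACT_SALES (PRODUCT_ID, SALE_DATE, AMOUNT)
VALUES (1, DATE '2019-10-14', 25);

-- Running the two inserts in the opposite order would fail with
-- ORA-02291 (integrity constraint violated - parent key not found).
```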

Luckily, ODI allows us to change the order, but it’s not very intuitive. In the Logical tab, you need to click on any area that does not contain an object (any white area will do). This will display the “Target Load Order” option:

8

Click on the “gear” icon on the far right and a menu will pop up:

4

Now you may configure it to have any order that you want, let’s say table B, C and then A. If you save and run the mapping now, this is what you get:

5

6

That’s it folks! See ya!


ODI Hidden Gems – Begin/End Mapping Command

Posted in ODI 12c, Tips and Tricks on October 7, 2019 by radk00

Hi all,

Today’s short post is about a simple but very powerful feature that is often overlooked: Begin/End Mapping Command. These options are in the Physical tab and, as their names suggest, they may issue any kind of command before a mapping begins and/or after it finishes.

1

Pay close attention to the detail that they may execute ANY command from ANY technology that ODI may handle and that’s why it is so powerful. You may run anything from Oracle DML statements, a piece of Java code, trigger OS commands and so on. This gives you a lot of flexibility.

A very common use for them is to “track” a mapping in a separate log table. Although ODI Operator contains all the log information, sometimes we get a requirement to track all the executions of a particular mapping, so people know for sure when it ran and the logs cannot be purged from the Operator by accident. Let’s see how we may log the start and end times of an execution.

Let’s start with “Begin”. First you select the technology and logical schema that the command refers to. In this case, we will insert the name of the mapping, the time that it started, and the session number that ODI assigned to it.

2

Let’s do the same with “End”:

3
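As a rough sketch of what the two commands could contain (the screenshots are the real reference): the columns of LOG_INFO below are hypothetical, and I am assuming that the session name returned by odiRef.getSession("SESS_NAME") matches the mapping name, which is the usual case when the mapping is run directly. The two INSERT statements are shown as they would be typed into the Begin and End command fields.

```sql
-- Hypothetical log table; the post names LOG_INFO but not its columns.
CREATE TABLE LOG_INFO (
    MAPPING_NAME VARCHAR2(400),
    EVENT_TYPE   VARCHAR2(10),   -- 'BEGIN' or 'END'
    EVENT_TIME   DATE,
    SESSION_NO   NUMBER
);

-- Begin Mapping Command (technology: Oracle)
INSERT INTO LOG_INFO (MAPPING_NAME, EVENT_TYPE, EVENT_TIME, SESSION_NO)
VALUES ('<%=odiRef.getSession("SESS_NAME")%>', 'BEGIN', SYSDATE,
        <%=odiRef.getSession("SESS_NO")%>)

-- End Mapping Command (technology: Oracle)
INSERT INTO LOG_INFO (MAPPING_NAME, EVENT_TYPE, EVENT_TIME, SESSION_NO)
VALUES ('<%=odiRef.getSession("SESS_NAME")%>', 'END', SYSDATE,
        <%=odiRef.getSession("SESS_NO")%>)
```

ODI resolves the `<%= ... %>` tags before sending the statement to the database, so Oracle only ever sees literal values.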

Let’s run the mapping. When we go to Operator, we may see that two new tasks were created, one before and another one after the main mapping:

4

We may double click it to see the code that was executed:

5

If we query the LOG_INFO table, we will see two entries, one for begin and another one for end:

6
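Once both rows are there, a query like the one below can pair them up and show how long each run took. It is only a sketch and assumes LOG_INFO has the hypothetical columns MAPPING_NAME, EVENT_TYPE ('BEGIN'/'END'), EVENT_TIME and SESSION_NO.

```sql
-- One row per execution, with the elapsed time in seconds.
SELECT B.MAPPING_NAME,
       B.SESSION_NO,
       B.EVENT_TIME AS STARTED_AT,
       E.EVENT_TIME AS FINISHED_AT,
       ROUND((E.EVENT_TIME - B.EVENT_TIME) * 24 * 60 * 60) AS ELAPSED_SECONDS
FROM LOG_INFO B
LEFT JOIN LOG_INFO E
       ON E.SESSION_NO = B.SESSION_NO
      AND E.EVENT_TYPE = 'END'
WHERE B.EVENT_TYPE = 'BEGIN'
ORDER BY B.EVENT_TIME DESC;
```

The LEFT JOIN keeps sessions that have a BEGIN row but no END row, which is a cheap way to spot runs that never finished.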

This was a very short example, as you may do way more than that. You may send emails to alert that a critical mapping has completed, you may zip and move a file right after it was loaded by the mapping, you may run an OS bat file that prepares your environment before a data load, and so on. These two options are a great way to keep all these “small” pieces of code inside the ODI mapping object itself and rely less on small ODI procedures.

See ya!

ODI Hidden Gems – Temporary Indexes

Posted in ODI, ODI 12c, Tips and Tricks on September 3, 2019 by radk00

Hi all!

Today’s gem is indeed a very hidden one. ODI is known (unfortunately) to be “not intuitive” most of the time, and I think that’s because we have many options that are scattered across too many objects in the UI, which leads you to keep clicking on objects until you find what you need. To make things a little bit harder, you have the logical and physical tabs and each of them has its own representation of the objects, so some of the options will be in the logical tab and others in the physical tab. Let’s talk today about the “Temporary Indexes” option that exists in some objects in the Physical tab of a mapping.

Very often you will load data from places that may not have an “index” concept, like files for example. You may create a mapping that will load any number of big files, put them in a staging area and filter/join them together to do some ETL. You may reach situations where those joins are not performing well since you don’t have an “index” on the files. Here is where ODI may help you with some Temporary Indexes creation. Let’s see this example (it’s very basic, but you will get the idea):

1

You join both files and load to a target table. When we execute this mapping, this is what we get:

2

Basically, you are loading both files to C$ tables and then you are joining those C$ tables when loading to the target.

7

If you think that this join would benefit from an index creation, you may configure ODI to create temporary indexes on that join. To do that, you will need to go to physical tab, click the “join” object and go to “Properties” as below. After you know where the option is, it seems pretty easy and obvious, however if you are not familiar with ODI, you will need some practice to actually figure out that you need to go to physical tab and then click on the specific object to get some unique properties of that object:

3

Change Index Type to one of the possible values (Bitmap, Non-Unique, Unique) and run the mapping again. You will see the index creation after it loads the data to C$:

4
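Just to make the three index types concrete, this is roughly the kind of DDL each option corresponds to in Oracle. The work table C$_0FILE_A, the column CUSTOMER_ID and the index name are hypothetical; ODI generates its own names, and the exact statement in your Operator may differ.

```sql
-- Hypothetical names, for illustration only. Pick one of the three.

-- Index Type = Non-Unique
CREATE INDEX IDX_C0_FILE_A ON C$_0FILE_A (CUSTOMER_ID);

-- Index Type = Unique (fails if the join column has duplicate values)
-- CREATE UNIQUE INDEX IDX_C0_FILE_A ON C$_0FILE_A (CUSTOMER_ID);

-- Index Type = Bitmap (usually suited to low-cardinality columns)
-- CREATE BITMAP INDEX IDX_C0_FILE_A ON C$_0FILE_A (CUSTOMER_ID);
```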

This technique also works for filter objects. If you add a filter on any of the file columns and you think that it would benefit from an index, the steps to create it are the same as before:

5

When executing, we can see each index being created:

6

If you go to Oracle documentation, there is a note there regarding using the Temporary Index creation:

  • The creation of temporary indexes may be a time-consuming operation in the overall flow. Oracle recommends reviewing execution statistics and comparing the execution time saved by the indexes to the time spent creating them.

It’s a very valid point. Most of the time we don’t need to create any temporary index, but if you end up in one of those cases that you need to create them, now you know where to find this option.

Thanks!

 

Building dynamic ODI code using Oracle metadata dictionary

Posted in Data Warehouse, ODI, ODI 12c on July 27, 2018 by radk00

Hi all, today’s post is about how ODI can be used to generate any kind of SQL statement using the Oracle metadata dictionary. We always like to say that ODI is way more than just an ETL tool and that people need to start thinking about ODI as a full development platform, where you may create any kind of code that you wish. Today I’ll describe how we may create a simple (but dynamic) merge statement between two similar tables using an ODI procedure that reads from the ALL_CONSTRAINTS, ALL_CONS_COLUMNS and ALL_TAB_COLS Oracle dictionary views to figure out what to do.

This is the scenario that we will be working on: we have several stage tables that are truncated and loaded every day with daily records from a source system. We have another set of tables that store all the historical information, and the process uses the stage tables as sources, merging their data into the historical tables using their primary keys. This is very common in places that have a stage data layer storing daily data pulls and a “base” data layer storing the historical data. In the scenario described here, the source and target sets of tables have very similar structures, with the same column names, data types and so on.

Using the conventional ODI development process, we would need to create one mapping object for each source->target pair, so if we have 50 sources that need to be merged into 50 targets, we would need to create 50 ODI mappings. Since the source->target pairs are similar in this case, we may be smarter and create an ODI process that receives a table name as an input parameter (in this case the target table name) and creates a merge statement for those two tables dynamically, using the Oracle metadata dictionary.

For those who are not familiar with the Oracle metadata dictionary, it’s nothing more than a set of views in the Oracle database that contain information about its existing components: its tables, the columns they have, their data types and so on. This is a great resource that ODI may read from to build generic code. Let’s see what it looks like with a real example.

Imagine that you have two tables with the following structure:
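As an illustration, a pair of tables of this shape could be declared as follows. The table names follow the STG_/BASE_ convention used by the SQL later in this post, but the business columns (ID, NAME, AMOUNT) and their data types are hypothetical.

```sql
-- Stage table: truncated and reloaded every day (hypothetical columns).
CREATE TABLE SCHEMA_A.STG_TABLE_A (
    ID     NUMBER        NOT NULL,
    NAME   VARCHAR2(100),
    AMOUNT NUMBER
);

-- Base table: same columns plus the two control columns and a PK.
CREATE TABLE SCHEMA_A.BASE_TABLE_A (
    ID          NUMBER        NOT NULL,
    NAME        VARCHAR2(100),
    AMOUNT      NUMBER,
    INSERT_DTTM DATE,
    UPDATE_DTTM DATE,
    CONSTRAINT PK_BASE_TABLE_A PRIMARY KEY (ID)
);
```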

As you can see, our base table is almost the same as our stage table; the only difference is that it contains two additional columns named INSERT_DTTM and UPDATE_DTTM that will be used as “control columns” to identify when the data was inserted/updated in our base table. For ODI to figure out which columns are present in which table, we may query ALL_TAB_COLS in Oracle, filtering by table name, as below:

3
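A query along these lines returns the column list. The owner and table name are the ones used as parameters later in this post; the columns selected are just the ones I find most useful.

```sql
-- List the columns of the target table in their declared order.
SELECT COLUMN_NAME,
       DATA_TYPE,
       COLUMN_ID
FROM ALL_TAB_COLS
WHERE OWNER = 'SCHEMA_A'
AND TABLE_NAME = 'BASE_TABLE_A'
ORDER BY COLUMN_ID;
```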

This is showing us all the table columns that this table contains. Similarly, if we query ALL_CONSTRAINTS and ALL_CONS_COLUMNS, we may get all the table constraints (like Primary Key) with all its associated columns:

4
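The primary key lookup can be sketched like this, using the same owner and table name. CONSTRAINT_TYPE = 'P' restricts the result to the primary key, and POSITION gives the order of the columns inside a composite key.

```sql
-- Primary key columns of the target table.
SELECT AC.CONSTRAINT_NAME,
       ACC.COLUMN_NAME,
       ACC.POSITION
FROM ALL_CONSTRAINTS AC
JOIN ALL_CONS_COLUMNS ACC
  ON ACC.OWNER = AC.OWNER
 AND ACC.CONSTRAINT_NAME = AC.CONSTRAINT_NAME
WHERE AC.OWNER = 'SCHEMA_A'
AND AC.TABLE_NAME = 'BASE_TABLE_A'
AND AC.CONSTRAINT_TYPE = 'P'
ORDER BY ACC.POSITION;
```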

With those two sets of data, we may create a SQL that will build our dynamic merge statement. To make it easier, I’ll show you the final SQL statement now, that is divided in two pieces, and then I’ll explain each of them:

WITH TABLE_PARAMS AS
(
SELECT 
    'BASE_TABLE_A' AS TABLE_NAME,
    'SCHEMA_A' AS TABLE_OWNER
FROM DUAL
),
TABLE_PK AS
(
SELECT
    ACC.OWNER, 
    ACC.TABLE_NAME, 
    ACC.COLUMN_NAME
FROM  ALL_CONSTRAINTS AC, 
      ALL_CONS_COLUMNS ACC,
      TABLE_PARAMS 
WHERE 1=1
AND AC.OWNER = ACC.OWNER
AND AC.TABLE_NAME = ACC.TABLE_NAME
AND AC.OWNER = TABLE_PARAMS.TABLE_OWNER
AND AC.TABLE_NAME = TABLE_PARAMS.TABLE_NAME
AND AC.CONSTRAINT_NAME = ACC.CONSTRAINT_NAME
AND AC.CONSTRAINT_TYPE = 'P'
)
,
MAIN_TAB_COLS AS
(
SELECT 
    ATC.OWNER,
    ATC.TABLE_NAME,
    ATC.COLUMN_NAME
FROM ALL_TAB_COLS ATC,
     TABLE_PARAMS 
WHERE 1=1
AND ATC.TABLE_NAME = TABLE_PARAMS.TABLE_NAME
AND ATC.OWNER = TABLE_PARAMS.TABLE_OWNER
AND ATC.COLUMN_NAME NOT IN ('INSERT_DTTM','UPDATE_DTTM')
AND ATC.COLUMN_NAME NOT IN (SELECT COLUMN_NAME FROM TABLE_PK)
)
SELECT
    MTC.TABLE_NAME AS TARGET_TABLE,
    REPLACE(MTC.TABLE_NAME,'BASE_','STG_') AS SOURCE_TABLE,
    PK_ST_LIST,
    PK_S_LIST||','||(LISTAGG('S.'||MTC.COLUMN_NAME ,',') WITHIN GROUP (ORDER BY MTC.COLUMN_NAME)) || ',SYSDATE,SYSDATE' AS TABLE_S,
    PK_T_LIST||','||(LISTAGG('T.'||MTC.COLUMN_NAME ,',') WITHIN GROUP (ORDER BY MTC.COLUMN_NAME)) || ',T.INSERT_DTTM,T.UPDATE_DTTM' AS TABLE_T,
    LISTAGG ('T.'||MTC.COLUMN_NAME||'=S.'||MTC.COLUMN_NAME , ',')  WITHIN GROUP (ORDER BY MTC.COLUMN_NAME ) AS ST_COLS
FROM MAIN_TAB_COLS MTC, 
    (SELECT 
        TP.OWNER,
        TP.TABLE_NAME,
        LISTAGG ('T.'||TP.COLUMN_NAME||'=S.'||TP.COLUMN_NAME , ' AND ')  WITHIN GROUP (ORDER BY TP.COLUMN_NAME ) PK_ST_LIST,
        LISTAGG ('S.'||TP.COLUMN_NAME, ',')  WITHIN GROUP (ORDER BY TP.COLUMN_NAME ) PK_S_LIST,
        LISTAGG ('T.'||TP.COLUMN_NAME, ',')  WITHIN GROUP (ORDER BY TP.COLUMN_NAME ) PK_T_LIST
    FROM TABLE_PK TP
    GROUP BY 
        TP.OWNER,
        TP.TABLE_NAME
    ) TP
WHERE 1=1
AND MTC.OWNER = TP.OWNER
AND MTC.TABLE_NAME = TP.TABLE_NAME
GROUP BY
    MTC.OWNER,
    MTC.TABLE_NAME,
    PK_ST_LIST,
    PK_S_LIST,
    PK_T_LIST;

The first piece of the SQL contains a WITH clause with three sections:

  • TABLE_PARAMS: used to “receive” the table name parameter that will be sent by ODI;
  • TABLE_PK: used to figure out which is the table PK that will be used to do a “merge” in our target table;
  • MAIN_TAB_COLS: used to retrieve all columns that exist in the table, removing any control columns (in this case INSERT_DTTM and UPDATE_DTTM) and any column that is already part of the PK;

The second piece is the main one where we will use the three sub selects from the WITH section to build what we need. In this case, it will return the following columns:

  • TARGET_TABLE: name of the target table that will be merged;
  • SOURCE_TABLE: name of the source table that will be used as a source of the merge process;
  • PK_ST_LIST: PK columns list that will be used on merge process;
  • TABLE_S: column names from the source table;
  • TABLE_T: column names from target table;
  • ST_COLS: combination of source and target columns for update process inside the merge;

When we run the SQL for our tables in this example, this is the result:

5

Now we have all information that we need to create a dynamic merge statement for any set of similar tables, but how do we use it in ODI? This is very simple with one of the best features that ODI has (if you read our blog, you know that we just love it): command on source/target. Let’s create a procedure and add our SQL statement in the command on source tab:

6.png

In our command on target tab, we will add the following code there:

7
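In case the screenshot is hard to read, here is a sketch of what such a command on target can look like. It is a reconstruction under assumptions, not a copy of the original: it assumes the command-on-source columns are referenced with ODI’s #COLUMN_NAME syntax, and the resolved statement uses hypothetical columns (ID as the PK, plus NAME and AMOUNT).

```sql
-- Template as typed in the command on target tab (assumed form):
--
--   MERGE INTO #TARGET_TABLE T
--   USING #SOURCE_TABLE S
--   ON (#PK_ST_LIST)
--   WHEN MATCHED THEN UPDATE SET #ST_COLS, T.UPDATE_DTTM = SYSDATE
--   WHEN NOT MATCHED THEN INSERT (#TABLE_T) VALUES (#TABLE_S)
--
-- What it would resolve to for BASE_TABLE_A with hypothetical
-- columns ID (PK), AMOUNT and NAME:
MERGE INTO BASE_TABLE_A T
USING STG_TABLE_A S
ON (T.ID=S.ID)
WHEN MATCHED THEN
    UPDATE SET T.AMOUNT=S.AMOUNT,T.NAME=S.NAME, T.UPDATE_DTTM = SYSDATE
WHEN NOT MATCHED THEN
    INSERT (T.ID,T.AMOUNT,T.NAME,T.INSERT_DTTM,T.UPDATE_DTTM)
    VALUES (S.ID,S.AMOUNT,S.NAME,SYSDATE,SYSDATE);
```

Note how TABLE_S already ends with “,SYSDATE,SYSDATE” and TABLE_T with the two control columns, so the INSERT lists line up without any extra logic in the template.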

As you can see, this SQL contains a lot of variables. These variables are replaced at runtime with the values returned by the SQL in command on source. This way, we don’t need to worry about creating 50 mappings to do 50 merge processes. Instead, we have one procedure that receives a table name as a parameter and builds the necessary SQL accordingly. Let’s see what it looks like in an ODI package:

8

As you can see, it’s a very simple package that receives a table name as a parameter and then builds/runs a dynamic merge SQL. This package can be called by an external package that may run it N times with different table names (like doing 50 table merges with one single procedure). Of course, this was just one example of a simple merge task, but it shows you the main idea of having ODI build the code for you. You may add more tasks to your procedure to create temp tables, gather statistics and so on. There are almost no limits on what you may do using this kind of technique.
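One possible way to feed that outer loop is to read the list of target tables from the dictionary as well. This is only a sketch: it assumes every target table lives in SCHEMA_A and follows the BASE_ naming convention used above.

```sql
-- Candidate target tables for the dynamic merge, one row per run.
-- The ESCAPE clause makes the underscore a literal character.
SELECT TABLE_NAME
FROM ALL_TABLES
WHERE OWNER = 'SCHEMA_A'
AND TABLE_NAME LIKE 'BASE\_%' ESCAPE '\'
ORDER BY TABLE_NAME;
```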

I hope you enjoyed it! See ya!

ODI 12c Standalone Agent Install for an ODI 11g guy

Posted in InfraStructure, Install, ODI, ODI 11g, ODI 12c, ODI Architecture on July 17, 2017 by radk00

Hi everybody! Today’s post is about installing an ODI 12c standalone agent. This is not a “new” topic and the steps to perform it can also be found on the Oracle site. However, it caught me a little bit “off guard” when I was asked to install one, because the process changed considerably compared to ODI 11g (and yeah, we still work A LOT with ODI 11g, so installing an ODI 12c agent was “new” for us).

Prior to ODI 12c, the ODI agent was configured by simply editing a file called odiparams.bat (odiparams.sh on Linux), which contained all the necessary agent configuration parameters. It was a simple step, where you would enter the ODI master/work configuration, DB/ODI connection users and so on. After that, you would simply run the agent program and that was it, very short and easy to do. However, in ODI 12c this changed considerably, and now we need to go through two setup wizards: one for creating the prerequisite DB schema for “Common Infrastructure Services” and another to configure the ODI Standalone agent for us.

This change added some extra complexity to an architecture that was (talking exclusively about ODI Standalone Agent here) very simple to setup in the old days. Although Oracle provides wizards for us to minimize this effort, nothing was easier than simply configuring a parameter file and running a java program. But enough grumbling, let’s see how we may accomplish this task on ODI 12.

The first wizard that we need to run is the Repository Creation Utility (RCU), which is located at ORACLE_HOME/oracle_common/bin/rcu.bat. Before we run it, we must understand what RCU is and what it can do for us. As its name suggests, it is a utility that may be used to create any repository component required for Oracle Fusion Middleware products, including the ODI Master/Work repository.

In our project, we did not create the ODI Master/Work repository with RCU. Instead, we got two empty Oracle DB schemas and installed ODI directly there. The reason we did not use RCU in this situation is that RCU forces you to create one single Oracle DB schema that stores both the ODI Master and Work repositories, and this is not a good approach when dealing with large environments. We think that Oracle’s rationale was to simplify certain ODI installs by unifying everything in a single place, but again, this removes some of ODI’s architectural flexibility and complicates the use of more complex architectures in the future, like multiple Work repositories attached to one Master.

So, if we already have the ODI Master/Work repositories created, why do we still need RCU? This is because, from ODI 12c on, we need a third Oracle DB schema to store the “Common Infrastructure Services” tables that are required for the ODI Standalone agent, and the only way to create these tables is with RCU.

Now that we have set our expectations around RCU, let’s run it. The first screen is just a welcome screen explaining what RCU is about, so just click Next.

1

Now let’s select “Create Repository” and “System Load and Product Load”. Just notice that you will be asked for a DBA user in the next steps, since this DBA user will be used to create the necessary database objects (including the DB schema itself) in the new “Common Infrastructure Services” schema. Click Next.

2

Add the database and DBA information and click next.

3

ODI installer will check your information and if everything is ok, all tasks will be green. Select Ok to proceed.

4

The next screen is where we select which components we want RCU to install. Notice that RCU is able to create several schemas for different components, from ODI to WebLogic. Since we already have our Master and Work repositories created, we just need to select “AS Common Schemas”/“Common Infrastructure Services”. Note here that RCU will name this schema using what is entered in the “Create new prefix” option plus a “_STB” suffix. Click Next.

5

The installer will check the pre-requisites to install and if it is ok, a green check will appear. Click OK.

6

In the next screen you will define the password for the newly created DB schema. Add a password and click next.

7

Define the Default and Temp table spaces that will be used by the new schema and click Next.

8

If the table spaces do not exist, they will be created for you. Click Ok.

9

The installer will check once more if everything is okay and also create the necessary table spaces. Click Ok.

10

On the next page, we are going to have a Summary on what the installer will do. If everything looks correct, click Create to create the necessary DB objects.

11

Check the Completion Summary, click close and that’s it! You have successfully created the “Common Infrastructure Services” schema, which is a pre-requisite for the ODI Agent install.
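If you want to double check the result from the database side, a DBA can look the new schema up in DBA_USERS. This is just a sketch: DEV is a hypothetical prefix, so replace DEV_STB with your own prefix plus “_STB”.

```sql
-- Run as a DBA user. DEV_STB assumes the hypothetical prefix "DEV".
SELECT USERNAME,
       ACCOUNT_STATUS,
       DEFAULT_TABLESPACE,
       TEMPORARY_TABLESPACE,
       CREATED
FROM DBA_USERS
WHERE USERNAME = 'DEV_STB';
```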

12

The next step is to run the setup wizard that will configure the ODI Standalone agent for us. Run the Config program at ORACLE_HOME/oracle_common/common/bin/config.cmd. In the first screen, let’s create a new domain. This domain folder is where the ODI Agent batch programs will reside, such as Start/Stop agent. Select a meaningful folder and click next.

13

In the next screen you will select “Oracle Data Integrator – Standalone Agent – 12.2.1.2.6 [odi]” and click next. This step will also install some basic Standalone components required for the ODI Agent.

14

Select a valid JDK location and click next.

15

Since we did not create our Master and Work repositories using RCU, we won’t be able to use the “RCU Data” option for Auto Configuration here. It is not a big deal, since we may select “Manual Configuration” and click next.

16

Here we will need to input all the information related to two schemas: the ODI Master and the “Common Infrastructure Services”. The way this screen works is tricky and confusing, since there are options that may be typed for all schemas at once. The best way to do it without any mistakes is to select one of them, add all its information, then uncheck it, check the other one and add all the information again. Click next.

17

The installer will check the information that was added here and, if it is okay, two green marks will be shown in the Status column. Click next.

18

The next screen will be used to define our ODI Agent name. Create a meaningful name here, since this will be used by the ODI users to select on which ODI agent they will run their ETL processes. Click next.

19

Add the server address, the port and an ODI user/password that has “Supervisor” access. On preferred Data source option, leave it as odiMasterRepository and click next.

20

Although we are not going to use our ODI Standalone Agent in a Node Manager object, which would be controlled by WebLogic, we still need to select a type for it and create a new credential. Add any name and a password for it (don’t worry, you will not use it for the ODI Standalone Agent) and click next.

21

Review the install summary and if everything is ok, just click Create.

22

Check all the steps as they turn into green checks and once completed, click next.

23

That’s the end of the configurations! You have successfully completed the ODI Standalone agent configuration and it is ready to run.

24

In order to run the ODI agent, open a command prompt, navigate to your base domain folder and run the ODI Agent start program with the agent name as an input argument: agent.cmd -NAME=DEV_AGENT. Wait a little bit for it to load and, when its status gets to “started”, it is good to go.

25

Now that the ODI agent is up and running, we may go to ODI Topology/Agent and double click the ODI agent that we have created. Now we may click on the Test button and see what happens. If everything is correct, you will see an information window saying that the ODI agent test was successful!

26

Congratulations, now you have an ODI 12c Standalone Agent configured. As you can see, we now have a few extra steps to do compared to ODI 11g. I hope this post helps you to get prepared for this new kind of install.

Thanks, see ya!

 

ODI 12c new features: Dimension and Cubes! Part 4 (Loading using Surrogate Keys)

Posted in Dimensions, ETL, ODI 12c, ODI Architecture, ODI Mapping, Oracle, Tips and Tricks on December 16, 2016 by RZGiampaoli

Hi guys how are you?

Today we’ll continue the dimension and cubes series (Part 1, Part 2 and Part 3 here) and we’ll see how to load data using surrogate keys.

After all the settings done in the last post, the only thing left is to create the interfaces and map everything. For the surrogate keys, the interface and the mapping are exactly the same as for the no-surrogate version (as we saw in the previous posts) for both dimensions and facts, which is very nice.

times-surrogate-interface

The interesting part here is what ODI does behind the scenes. In the no-surrogate version, ODI created one mapping for each hierarchy and, in the end, merged everything together into a single table.

no-surrogate-time-operator

For the surrogate key version, ODI also generates one mapping for each hierarchy, but the main difference is that after each one it merges the result with the others. This happens because it needs to get the surrogate key for each level.

time-surrogate-operator

For each level, ODI automatically generates an insert into that level’s stage table, checking that the rows do not already exist in the target table. (It does that to decrease the amount of data for the merge step, since the merge would otherwise insert or update everything and take more time than necessary.)

After the stage table is loaded, the next step is to merge the stage table into the target table, and for that ODI just creates a “Merge”: when matched, it updates the descriptions or attributes, and when not matched, it inserts the new rows using the sequences for the SK.

In the next level of the hierarchy, ODI repeats the process but joins the Year with the Quarter. ODI will keep doing this for each level mapped until the last one, where instead of having a merge with both matched and not matched branches, it just does a merge with the matched branch (since it knows everything will already be there).
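To give an idea of the shape of that first-level merge, here is a very simplified sketch for the Year level. It is not the code that ODI generates: the stage table STG_TIME_YEAR_S and the sequence SEQ_TIME are hypothetical names, and the real statement is more elaborate.

```sql
-- Simplified Year-level merge (hypothetical stage table and sequence).
MERGE INTO S_TIME T
USING (SELECT DISTINCT ID_YEAR, YEAR FROM STG_TIME_YEAR_S) S
ON (T.ID_YEAR = S.ID_YEAR AND T.ID_QUARTER IS NULL)
WHEN MATCHED THEN
    -- Existing year: only refresh the description.
    UPDATE SET T.YEAR = S.YEAR
WHEN NOT MATCHED THEN
    -- New year: the sequence feeds both the PK and the level SK.
    -- Within one row, every NEXTVAL reference returns the same value.
    INSERT (TIME_PK, YEAR_SK, ID_YEAR, YEAR)
    VALUES (SEQ_TIME.NEXTVAL, SEQ_TIME.NEXTVAL, S.ID_YEAR, S.YEAR);
```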

The results will be this:

time-surrogate-table-results

It’s nice that ODI already creates the dimension with an aggregated fact in mind: we can see that it has some rows with just the year, others with the year and quarter, and the last ones with all the information.

One thing to notice is that the PK is the same as the Month SK. This is because ODI is ready to create SCD type 2 (we’ll do another post to show how it works).
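That property gives a simple way to pick only the lowest-level rows of the dimension. The filter below is the same one that appears in the fact load that ODI generates further down in this post; the column list is just a sample.

```sql
-- Month-level (leaf) rows of the time dimension.
SELECT TIME_PK,
       ID_MONTH,
       MONTH_SK
FROM EPM_HPT_ODI_RUN.S_TIME
WHERE TIME_PK = MONTH_SK
AND MONTH_SK IS NOT NULL;
```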

For the fact, the mapping will still be the same as the No-surrogate version and again the difference will be in the results.

fact-surrogate-interface

We can see that in the operator ODI does something really neat this time.

fact-surrogate-operator

MERGE INTO EPM_HPT_ODI_RUN.S_FACT FACT_SURROGATE1_FACT_SURROGATE USING
(SELECT TIME_SURROGATE_FACT_SURROGAT_1.MONTH_SK AS ID_TIME ,
PRODUCT_SURROGATE_FACT_SURRO_1.PRODUCT_SK AS ID_PRODUCTS ,
REGIONS_SURROGATE_FACT_SURRO_1.CITY_SK AS ID_REGIONS ,
SRC_ERP.SALES AS METRIC
FROM ((EPM_HPT_ODI_RUN.SRC_ERP SRC_ERP
LEFT OUTER JOIN
(SELECT TIME_SURROGATE_FACT_SURROGATE.ID_MONTH AS ID_MONTH ,
TIME_SURROGATE_FACT_SURROGATE.MONTH_SK AS MONTH_SK ,
TIME_SURROGATE_FACT_SURROGATE.TIME_PK AS TIME_PK
FROM EPM_HPT_ODI_RUN.S_TIME TIME_SURROGATE_FACT_SURROGATE
WHERE ((TIME_SURROGATE_FACT_SURROGATE.TIME_PK = TIME_SURROGATE_FACT_SURROGATE.MONTH_SK)
AND (TIME_SURROGATE_FACT_SURROGATE.MONTH_SK IS NOT NULL) )
) TIME_SURROGATE_FACT_SURROGAT_1
ON (SRC_ERP.ID_MONTH = TIME_SURROGATE_FACT_SURROGAT_1.ID_MONTH) )
LEFT OUTER JOIN
(SELECT PRODUCT_SURROGATE_FACT_SURROGA.ID_PRODUCT AS ID_PRODUCT ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK AS PRODUCT_SK ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK AS PRODUCTS_PK
FROM EPM_HPT_ODI_RUN.S_PRODUCTS PRODUCT_SURROGATE_FACT_SURROGA
WHERE ((PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK = PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK)
AND (PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK IS NOT NULL) )
) PRODUCT_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_PRODUCT = PRODUCT_SURROGATE_FACT_SURRO_1.ID_PRODUCT) )
LEFT OUTER JOIN
(SELECT REGIONS_SURROGATE_FACT_SURROGA.ID_CITY AS ID_CITY ,
REGIONS_SURROGATE_FACT_SURROGA.CITY_SK AS CITY_SK ,
REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK AS REGIONS_PK
FROM EPM_HPT_ODI_RUN.S_REGIONS REGIONS_SURROGATE_FACT_SURROGA
WHERE ((REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK = REGIONS_SURROGATE_FACT_SURROGA.CITY_SK)
AND (REGIONS_SURROGATE_FACT_SURROGA.CITY_SK IS NOT NULL) )
) REGIONS_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_CITY = REGIONS_SURROGATE_FACT_SURRO_1.ID_CITY)
) MERGE_SUBQUERY ON ( FACT_SURROGATE1_FACT_SURROGATE.ID_TIME = MERGE_SUBQUERY.ID_TIME AND FACT_SURROGATE1_FACT_SURROGATE.ID_PRODUCTS = MERGE_SUBQUERY.ID_PRODUCTS AND FACT_SURROGATE1_FACT_SURROGATE.ID_REGIONS = MERGE_SUBQUERY.ID_REGIONS )
WHEN NOT MATCHED THEN
INSERT
(
ID_TIME ,
ID_PRODUCTS ,
ID_REGIONS ,
METRIC
)
VALUES
(
MERGE_SUBQUERY.ID_TIME ,
MERGE_SUBQUERY.ID_PRODUCTS ,
MERGE_SUBQUERY.ID_REGIONS ,
MERGE_SUBQUERY.METRIC
)
WHEN MATCHED THEN
UPDATE SET METRIC = MERGE_SUBQUERY.METRIC

ODI automatically joins all our dimensions at level zero (since the higher levels of the dimensions are there for the aggregated fact) to get the surrogate key information and use it in the fact table. This is very nice because in large DWs we’ll have tons of dimensions, and mapping/joining everything is very time consuming. The final result is this:

fact-surrgoate-sql-results

A perfect DW created using surrogate keys; in other words, instead of having the dimensions’ PKs in the fact table, we have the SKs (that were generated by a sequence in the dimensions).

In summary, we think that if you are going to create simple dimensions and simple facts (without surrogate keys or SCD type 2), it’s still nice to use this new feature, since it’s a good way to document and standardize your DW. However, if we measure by development time, it’s not worth it, since it’s very time consuming for a simple DW.

Now, if you want to create a DW using surrogate keys or SCD type 2, we found this new feature extremely useful, both for documentation and standardization and because it is a lot faster than doing it manually.

Thanks and see you soon.

ODI 12c new features: Dimension and Cubes! Part 3 (Settings for Surrogate Keys)

Posted in Cubes, Dimensions, ETL, ODI, ODI 12c on November 24, 2016 by radk00

Hi all! First of all, sorry for the delay. We really wished to publish the rest of this series earlier, but we are overwhelmed by projects, which keep us very busy. So let’s not waste time and go directly to what matters. I really recommend that you read part 1 and part 2 (if you haven’t already), because we will assume some things here that were already done, so we don’t keep repeating ourselves.

Today’s post is about how to set up ODI dimension objects to work with Surrogate Keys. In the first post we said that there was a bug in ODI 12c that was preventing us from creating dimensions with SKs. We opened an SR with Oracle and it turned out that it was not a bug, but some missing configuration that was preventing us from creating the objects the right way. So, apologies to Oracle 🙂 I hope this post explains those little specific setups, so other people do not make the same mistakes that we did when we tried to create these dimensions.

First let’s begin with the DB script for this example. Our source tables will remain the same as the previous example (SRC_* tables). Our stage tables will be different and we will use the STG*S tables for this example. The final dimension/fact tables will be the S* tables found below.

surrogate-script

1

Also, please create the following Native Sequences that will be used to create our SK values:

1-1

1-2
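On the database side, the sequences behind those Native Sequences could be created like this. The names are hypothetical (one per dimension table used in this series), so use whatever names appear in the screenshots above.

```sql
-- Hypothetical sequence names, one per dimension table.
CREATE SEQUENCE SEQ_TIME     START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE SEQ_PRODUCTS START WITH 1 INCREMENT BY 1;
CREATE SEQUENCE SEQ_REGIONS  START WITH 1 INCREMENT BY 1;
```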

Let’s talk a little about the SK setup requirements. There are some key points that were not clear in Oracle’s documentation, and that’s why we were not able to complete it successfully. After talking to Oracle Support, we got the following key requirements to make the SK setup work:

  • Each level of the dimension must have its own Natural Key and Surrogate Key columns. The SK column MUST be different to the PK of the dimension (this is very important. This was the wrong setup that we were trying to do and it was failing). This allows ODI to manage SCD type 2 changes that occur across a hierarchy (while not applicable to a Time dimension it still needs to be setup that way);
  • The dimension MUST have a Primary key defined on it;
  • Each staging table for each level MUST include all the attributes of any level above it in the hierarchy (MONTH must have all attributes of QUARTER and YEAR). The easiest way to accomplish this is to just create the staging tables to have all the attributes of the dimension. (But you may create only the needed ones. The scripts in this post only contain the necessary attributes);

Let’s take the S_TIME table as an example. It contains the following columns:

2

S_TIME has three levels and for each level we are going to have:

  • One attribute for each member name (YEAR, QUARTER and MONTH);
  • One ID (that will be setup as Natural Keys) for each member level (ID_YEAR, ID_QUARTER and ID_MONTH);
  • One SK for each member level (YEAR_SK, QUARTER_SK and MONTH_SK);
  • And finally the table’s PK – TIME_PK;
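Putting the list above together, the table could be sketched as below. The column names come from the list; the data types and the constraint name are assumptions on my part, so check them against the script at the top of this post.

```sql
-- S_TIME sketch: data types and constraint name are assumed.
CREATE TABLE S_TIME (
    TIME_PK    NUMBER NOT NULL,   -- dimension primary key
    YEAR_SK    NUMBER,            -- surrogate key, Year level
    ID_YEAR    NUMBER,            -- natural key, Year level
    YEAR       VARCHAR2(50),      -- member name, Year level
    QUARTER_SK NUMBER,
    ID_QUARTER NUMBER,
    QUARTER    VARCHAR2(50),
    MONTH_SK   NUMBER,
    ID_MONTH   NUMBER,
    MONTH      VARCHAR2(50),
    CONSTRAINT PK_S_TIME PRIMARY KEY (TIME_PK)
);
```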

After you run this ODI component (in our fourth post), you will notice that some information gets replicated in the IDs and SKs. It may seem odd to you, but it is actually correct, since those objects are prepared to handle SCD2 data. So even if you don’t use it right now, you’ll need to set them up this way in your ODI dimensions (the good thing is that, if you decide later on to use SCD2, the setup will already be done for you).

Now let’s create the TIME_SURROGATE dimension as below:

3

For level Month, do the following setup:

4

Quarter:

5

Year:

6

On Hierarchies tab, do the following setup:

7

For the other two dimensions, the process is very similar, so I’ll not add screenshots here. For the Cube setting, it is exactly as we did for the cube in the first post:

8

9

And that’s it, we are ready to load those components using Mappings. Our fourth post will show you the differences when using SK models and the benefits that it may bring to you.

See you soon!