Archive for Dimensions

ODI 12c new features: Dimension and Cubes! Part 4 (Loading using Surrogate Keys)

Posted in Dimensions, ETL, ODI 12c, ODI Architecture, ODI Mapping, Oracle, Tips and Tricks with tags , , , , , on December 16, 2016 by RZGiampaoli

Hi guys how are you?

Today we’ll continue the dimension and cubes series (Part 1, Part2 and Part 3 here) and we’ll see how to load data using Surrogate keys.

After all the setting done in the last post, now the only thing left is to create the interfaces and map everything. For the Surrogate keys, the interface and the mapping are exactly the same as for no-surrogate version (as we can see in the previous posts) for both, dimensions and facts, what’s very nice.

times-surrogate-interfaceThe interesting here is what he does behind the scenes. In the no-surrogate version ODI created one mapping for each hierarchy and in the end it merged everything together inside a table.

no-surrogate-time-operatorFor the Surrogate key version, ODI also generates one mapping for each hierarchy but the main difference is that after each one he merges it witch the others. This happens because he needs to get the surrogate key for each level.

time-surrogate-operator

For each level ODI automatically generates an insert into that level stage table verifying if all the columns does not exists in the target table (He does that to decrease the amount of data for the merge step since merge would insert or update everything and would take more time than necessary).

After the stage table is loaded the next step is to merge the stage table to the target table, and for that ODI just create a “Merge”: when match he updates the descriptions or attributes and when doesn’t match it inserts the new rows with the sequences for the SK.

In the next level of the hierarchy ODI repeats the process but joining the Year with the Quarter. ODI will keep doing this for each level mapped until the last one, where instead of having a merge with matches and not matches, he just do a merge with Matches (since he know everything will already be there).

The results will be this:

time-surrogate-table-results

It’s nice that ODI already creates the dimension thinking in an aggregated fact since we can see that he has some rows just with the year, other with the year and quarters and the last one with all the information.

One thing to notice is that the PK is the same as the Month SK. This is because ODI is ready to create SCD type 2 (we’ll do another post to show how it works).

For the fact, the mapping will still be the same as the No-surrogate version and again the difference will be in the results.

fact-surrogate-interface

We can see that in the operator ODI does something really neat this time.

fact-surrogate-operator

MERGE INTO EPM_HPT_ODI_RUN.S_FACT FACT_SURROGATE1_FACT_SURROGATE USING
(SELECT TIME_SURROGATE_FACT_SURROGAT_1.MONTH_SK AS ID_TIME ,
PRODUCT_SURROGATE_FACT_SURRO_1.PRODUCT_SK AS ID_PRODUCTS ,
REGIONS_SURROGATE_FACT_SURRO_1.CITY_SK AS ID_REGIONS ,
SRC_ERP.SALES AS METRIC
FROM ((EPM_HPT_ODI_RUN.SRC_ERP SRC_ERP
LEFT OUTER JOIN
(SELECT TIME_SURROGATE_FACT_SURROGATE.ID_MONTH AS ID_MONTH ,
TIME_SURROGATE_FACT_SURROGATE.MONTH_SK AS MONTH_SK ,
TIME_SURROGATE_FACT_SURROGATE.TIME_PK AS TIME_PK
FROM EPM_HPT_ODI_RUN.S_TIME TIME_SURROGATE_FACT_SURROGATE
WHERE ((TIME_SURROGATE_FACT_SURROGATE.TIME_PK = TIME_SURROGATE_FACT_SURROGATE.MONTH_SK)
AND (TIME_SURROGATE_FACT_SURROGATE.MONTH_SK IS NOT NULL) )
) TIME_SURROGATE_FACT_SURROGAT_1
ON (SRC_ERP.ID_MONTH = TIME_SURROGATE_FACT_SURROGAT_1.ID_MONTH) )
LEFT OUTER JOIN
(SELECT PRODUCT_SURROGATE_FACT_SURROGA.ID_PRODUCT AS ID_PRODUCT ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK AS PRODUCT_SK ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK AS PRODUCTS_PK
FROM EPM_HPT_ODI_RUN.S_PRODUCTS PRODUCT_SURROGATE_FACT_SURROGA
WHERE ((PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK = PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK)
AND (PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK IS NOT NULL) )
) PRODUCT_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_PRODUCT = PRODUCT_SURROGATE_FACT_SURRO_1.ID_PRODUCT) )
LEFT OUTER JOIN
(SELECT REGIONS_SURROGATE_FACT_SURROGA.ID_CITY AS ID_CITY ,
REGIONS_SURROGATE_FACT_SURROGA.CITY_SK AS CITY_SK ,
REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK AS REGIONS_PK
FROM EPM_HPT_ODI_RUN.S_REGIONS REGIONS_SURROGATE_FACT_SURROGA
WHERE ((REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK = REGIONS_SURROGATE_FACT_SURROGA.CITY_SK)
AND (REGIONS_SURROGATE_FACT_SURROGA.CITY_SK IS NOT NULL) )
) REGIONS_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_CITY = REGIONS_SURROGATE_FACT_SURRO_1.ID_CITY)
) MERGE_SUBQUERY ON ( FACT_SURROGATE1_FACT_SURROGATE.ID_TIME = MERGE_SUBQUERY.ID_TIME AND FACT_SURROGATE1_FACT_SURROGATE.ID_PRODUCTS = MERGE_SUBQUERY.ID_PRODUCTS AND FACT_SURROGATE1_FACT_SURROGATE.ID_REGIONS = MERGE_SUBQUERY.ID_REGIONS )
WHEN NOT MATCHED THEN
INSERT
(
ID_TIME ,
ID_PRODUCTS ,
ID_REGIONS ,
METRIC
)
VALUES
(
MERGE_SUBQUERY.ID_TIME ,
MERGE_SUBQUERY.ID_PRODUCTS ,
MERGE_SUBQUERY.ID_REGIONS ,
MERGE_SUBQUERY.METRIC
)
WHEN MATCHED THEN
UPDATE SET METRIC = MERGE_SUBQUERY.METRIC

He automatically joins all our dimensions at level zero (since we have the dimensions in the higher levels for the aggregated fact) to get the surrogate key information and use it in the fact table. This is very nice because in large DWs we’ll have tons of dimensions, and map/join everything is very time consuming. The final results is this:

fact-surrgoate-sql-results

A perfect DW created using surrogate key, in other words, instead of having the dimensions PKs in the fact table we have the SKs (that ware generated by a sequence in the dimensions).

In resume, we think that if you going to create simple dimensions and simple facts (without surrogate key or SCD type 2) it’s still nice to use this new feature since it’s a nice way to document and standardize your DW, but if we measure by development time it’s not worthy since it’s very time consuming for simple DW.

Now, if you want to create a DW using surrogate keys or SCD type 2 we found this new feature extremely useful for both, documentation and standardizations and because is a lot faster than do manually.

Thanks and see you soon.

Advertisements

ODI 12c new features: Dimension and Cubes! Part 3 (Settings for Surrogate Keys)

Posted in Cubes, Dimensions, ETL, ODI, ODI 12c with tags , , on November 24, 2016 by radk00

Hi all! First of all, sorry for the delay. We really wished to have published the rest of this series earlier, but we are overwhelmed by projects, which keep us very busy. So let’s not waste time and go directly to what matters. I really recommend you to ready part 1 and part 2 (if you didn’t already) because we will assume some things here that were already done, so we don’t keep repeating ourselves.

Today’s post is how to setup ODI dimension objects to work with Surrogate Keys. In the first post we said that there was a bug in ODI 12c that was preventing us to create dimensions with SKs. We opened an SR with Oracle and it turned out that it was not a bug, but it was some missing configurations that were not enabling us to create the objects in the right way. So, apologies to Oracle 🙂 I hope this post may explain those little specific setups, so other people does not fall on the same mistakes that we did when we tried to create these dimensions.

First let’s begin with the DB script for this example. Our source tables will remain the same as the previous example (SRC_* tables). Our stage tables will be different and we will use the STG*S tables for this example. The final dimension/fact tables will be the S* tables found below.

surrogate-script

1

Also, please create the following Native Sequences that will be used to create our SK values:

1-1

1-2

Let’s talk a little about the SK setup requirements. There are some key points that were not clear in Oracle’s documentation and that’s why we were not able to complete it successfully. After talking to Oracle Support, we got the following key requirements to make SK setup to work:

  • Each level of the dimension must have its own Natural Key and Surrogate Key columns. The SK column MUST be different to the PK of the dimension (this is very important. This was the wrong setup that we were trying to do and it was failing). This allows ODI to manage SCD type 2 changes that occur across a hierarchy (while not applicable to a Time dimension it still needs to be setup that way);
  • The dimension MUST have a Primary key defined on it;
  • Each staging table for each level MUST include all the attributes of any level above it in the hierarchy (MONTH must have all attributes of QUARTER and YEAR). The easiest way to accomplish this is to just create the staging tables to have all the attributes of the dimension. (But you may create only the needed ones. The scripts in this post only contain the necessary attributes);

Let’s get as example S_TIME table. It contains the following columns:

2

S_TIME has three levels and for each level we are going to have:

  • One attribute for each member name (YEAR, QUARTER and MONTH);
  • One ID (that will be setup as Natural Keys) for each member level (ID_YEAR, ID_QUARTER and ID_MONTH);
  • One SK for each member level (YEAR_SK, QUARTER_SK and MONTH_SK);
  • And finally the tables PK – TIME_PK;

After you run this ODI component (in our fourth post), you will notice that some information gets replicated on IDs and SKs. It may seem odd for you, but it is actually correct, since those objects are prepared to handle SCD2 type of data, so even if you don’t use it right now, you’ll need to setup them this way on your ODI dimensions (the good thing is that, if you decide later on to use SCD2, then the setup will be already done for you).

Now let’s create the TIME_SURROGATE dimension as below:

3

For level Month, do the following setup:

4

Quarter:

5

Year:

6

On Hierarchies tab, do the following setup:

7

For the other two dimensions, the process is very similar, so I’ll not add screenshots here. For the Cube setting, it is exactly as we did for the cube in the first post:

8

9

And that’s it, we are ready to load those components using Mappings. Our fourth post will show you the differences when using SK models and the benefits that it may bring to you.

See you soon!

ODI 12c new features: Dimension and Cubes! Part 2 (Loading using Natural Keys)

Posted in Cubes, Dimensions, ETL, New Features, ODI 12c, ODI Architecture with tags , , , , , on September 14, 2016 by radk00

Hi all, let’s continue with our posts regarding “ODI 12c new features: Dimension and Cubes”. As stated in the previous post, we can have two ways to build our new objects: with natural keys or with surrogate keys. Today’s post will focus on loading the dimensions and fact tables that where created using natural keys (please see our previous post for all the settings required for those objects).

Let’s begin loading our TIME dimension (which was mapped to our TIME Oracle table). This dimension will have information from three different source tables: SRC_YEAR, SRC_QUARTER and SRC_MONTH. Each of them has information regarding each TIME hierarchy level, so all of them needs to be loaded in order to have a complete hierarchy in our final table.

The load process is very easy and intuitive: first create a new mapping and drag and drop the TIME dimension to it. Then, just add the three source tables, map to its correspondent level in the TIME dimension and that’s it. A very cool thing here is that ODI understands each level as a “separate” table/process, so you don’t need to join your source tables before actually loading it to the target dimension. In other words, ODI allows you to have any kind of complex ETL to each dimension level and each level will be treated as “separate” data loads that will be glued together by the hierarchy setting that you mapped in the TIME dimension object. Here is what it looks like:

blog1

blog2

blog3

blog4

When you execute the mapping we are going to see that the first “MAP_BEGIN” section will try to create and truncate our stage tables that were set in our dimension object. Here is an odd thing (as we also mentioned in the last post): We could not understand yet why ODI “forces” you to have the stage tables created prior to execution (so you can select them in the Dimension object), as it could very well create them for you (like it does for C$ and I$ tables). I know that Oracle may had a reason behind it, but as for now, the entire “stage tables” thing seems an unnecessary setup. Anyway, the important thing here is that ODI will truncate the stage tables before any new execution.

blog5

In the “MAP_MAIN” section is where it gets interesting. We can see here how ODI threats this new dimension object: each level has its own ETL, as we can see that it is loading YEAR, QUARTER and MONTH separately. First YEAR step will load its source to its stage table STG_YEAR, then QUARTER step will join the information from its source table plus STG_YEAR to its STG_QUARTER table. Finally, MONTH step, that is our leaf/grain level, will join its source table plus STG_QUARTER table (which is already joined with YEAR source) and merge it all together in our final table TIME. The result will look like below:

blog6

Since we are not using Surrogate keys here, our Dimension table will contain only the grain/leaf members with all Natural Keys and its attributes for all levels that exists in the dimension. So one row will contain all information regarding all levels that it belongs to. When we create the mappings for the other two dimensions (they’re very similar, so I’m not adding them here) and execute them, we will get the following results:

blog7

blog8

Let’s to go our Fact table load. This one is way too simple, since our source table already contains all the Natural Keys that will be the ones that will also exist in our FACT table (remember, we are not dealing with Surrogate Keys in this example). Here we just need to map each NK to its respective dimension column and also our Measure data and execute the mapping.

blog9

blog10

When we take a look in Operator, we are going to see a single merge command in our Fact table, where ODI will use all dimensions to search if that row already exists in our FACT table. If it exists, the measure column is update, otherwise it is inserted.

blog11

The final result is below: as expected, all Natural Keys from our dimensions were inserted in the Fact table, together with our measure.

blog12

Now you may be wondering, why should I use these new features if it seems a lot of work (settings) for a little gain? Well, using ODI for Natural Key’s only is really not worth it, since the only benefit here seems to be ODI loading the dimensions levels all at once, with different sources/ETL, in a single mapping object, which is a very cool feature, since it enables us to better organize our DW objects and have a clear view on our ETL logic. But again, this is too little for the amount of work that we need to do to get there. But don’t worry, it will get way better when we start to work with Surrogate Keys, since ODI will be able to abstract all the Surrogate Key management and you will start to feel that all the necessary settings will finally be worth the work.

That’s it for today folks! We will be releasing the Surrogate Key settings and load posts very soon, so stay tuned in our blog! See ya!

ODI 12c new features: Dimension and Cubes! Part 1 (Settings)…

Posted in ACE, Configuration, Cubes, Dimensions, ETL, New Features, ODI, ODI 12c, ODI Architecture, ODI Mapping, Tips and Tricks with tags , , , , , , , , on August 19, 2016 by RZGiampaoli

Today we’ll talk a little bit about the new feature introduced in ODI 12.2.1.1.0, Dimension and Cubes!

As everybody already know, Oracle is slowly merging OWB within ODI and in each release we can see a new feature from OWB arriving in ODI. This time were the Dimension and Cubes feature.

This feature helps you to create a DW based in a configuration that you do. Basically there is a new component in ODI that helps you to define the datastore to be mapped. Also, after you create all dimensions (that is the most time consuming part in the process), the cube or fact table creation and mapping is a lot easier than do it manually.

Right now there is just one type of dimension available (Star schema level based dimension), but in the future other kinds will be supported like snow flake and others.

Ok, let’s start. There’re two ways to build a star dimension in ODI: with natural key’s (where the natural key is stored in the FACT table) and with surrogate keys (where the surrogate key is stored in the FACT table). In this post we’ll cover how we create a DW using the natural key process since the surrogate key one is buggy (the interface fails on saving the surrogate key) and we have openned a SR with Oracle to get it fixed. As soon we have the fix we’ll cover that too here in the blog.

In the Designer tab we can now see that we have a new tab called Dimensions and Cubes.

1-Dimension and Cubes

Opening that tab you will find a blank area, you need to click the button in the “Dimension and Cubes” tab, and you can create a new DM or DW.

2-DW creation

By the way, here’s the first small bug. For some reason when you write the name you want, ODI does not fill automatically the code field (as it always do for all the other objects in ODI), then you need to manually insert a code there. Remember, no spaces and no special character.

After that we can expand it and see the Dimension and the Cube node.

3-DW creation

Right click on those and we can create a new Dimension or Cube. As everybody knows, the dimension comes first since we need them to maintain the data integrity of the cube.

4-Dimension Definition

Here you can give any name you want for the dimension. Also you have a Pattern Name (that has just one option by now) and in the side tabs we have all possible options for the Dimension, Levels and Hierarchies, that we’ll cover later.

There are two more option here: the Datastore, that is the target dimension datastore where all metadata will flow and the Surrogate key Sequence that you need to set in case you want to create a dim using surrogate key (We’ll cover this later since we have a bug here).

In our case we’ll have three dimensions and one cube. (Time, Products, Regions and Fact). Both the source and the targets tables were generated by me with dummy data, just for this post. If you want to replicate this example, the scripts are here:

No surrogate Script

Let’s create the Time dimension. Click in the “Levels” in the left side tabs and you will see a big screen in three big sessions: Levels, Levels Attributes and Parent Level References.

5-Level Canvas

Let’s begin with the level configuration. Clicking in the Plus Sign button will create a Level.

6-Level Creation

I always like to rename the Level to something more meaningful like “Year” but if you like you can keep as default. By the default the target datastore comes automatically mapped since you define it in the previous screen. The only thing left here is to define the “Staging Datasore”.

This is something that we didn’t understood why it was made in this way since ODI could create automatically based in the definitions we had in the previous step or even with the interface configuration.

Anyway, what we need to do is create the stage tables for each level, and for that we have a few approaches we can do here:

  1. We can create another table exactly in the same way of the target table (needs to be a new table because the way ODI integrates the data. We’ll cover that latter).
  2. We can create, in this case, 3 tables, one for Year (same way as the source table is), and one for Quarter (same way of the source plus all columns from the Year table) and one for Month (same way of the source plus Quarter and Year columns).
  3. And we can duplicate the sources or the target datastore and do the changes above (in the 2 approach).

With the Stage datastores created (manually or by reverse) we just need to click in the “…” button and choose it from the list. Now we just need to repeat the step 2 more times for the other levels:

7-Level Canvas mapps

After we associate the source datastores and the stage datastores it’s time to create the attributes and ID’s for each level. For this you just need to click in the Year level and click in the Plus Sign button below:

8-Level attibutes config

Here we need to create all the attributes for this level and the natural key for that level as well. (We have the option to create slowly change dimensions here, but this will be covered in a future post!)

For each attribute you need to Plus Sign and fill the name of the attribute, set the data type (yes it not get automatically….) and select the Stage attribute (click in the “…” button and select it).

After all Attributes and ID’s we need to click in the below Plus Sign to set the natural key of that level. Just select in the list available.

After that, we just need to repeat for all the other 2 levels that we’ll have in this dimension.

With this done, the last step for this tab is to create the relationship between one level and its parent level. For this, highlight each level again, in this case we’ll start from bottom up, then let’s start clicking in the Month level and click on Plus Sign button below. Here we just need to say that for the Month level his reference parent will be Quarter. To set this we just need to select the Quarter level from the drop box and select eh foreign key from the drop box as well. Do that again for the Quarter level and reference it to the Year level. We don’t need to create any reference for the Year since it has no parent.

9 Parent Level References

As you can see, after the level configuration, everything you need to do is click in buttons and select from drop box or from “…” Screen (other than rename the defaults values if you like).
For last but not least, we need to click in the tab Hierarchies on the left tabs to enable us create a new hierarchy.

This is something fun. We can create multiple hierarchies inside the target table as well as skip level and some other features that we’ll cover in another post. For now let’s stay with a single hierarchy.

10-hierarchy

Here we need just to create the hierarchy by clicking in the Plus Sign button, give a name for the hierarchy and then click in the plus button bellow and add all the levels for the hierarchy. The order doesn’t matter, the idea here is that you can have multiple hierarchies with different levels in each one. For example, we could have a hierarchy called Full_Time with Year->Quarter->Month and another Hierarchy called Small_Time with just Year->Month. ODI would know based in the configurations we did, how to handle the data. Nice.

Also we can set skip level for each level we defined.

We are done with the dimension settings. I know it’s a lot of settings and some of you could be thinking (as we thought, this is a lot more work than if I create manually), but believe me, after you get used, you can do it in a reasonable time and the cube part is worthy.

Now we just need to repeat the process for all the other 2 dimension and them we finally start the cube settings:

11-Cube

To start the same thing as the dimension, Right click in the Cubes node and new.

12-Cube definition

In this screen we need to give a name for the cube, select a pattern name (Same as Dimension, just one option here for now) and do a biding to the target datastore.
After that we just need to click in the Detail tab in the left menu and start to configure our fact table.

12-Cube config

As I said in the beginning, here’s where the use of this components pays off. To configure a cube we just need click in the Plus Sign button and add all dimension we have, in this case our three dimensions. Then we just need to select the level we want to join our Fact table with our dimensions and bind the keys from the fact and that dimension.

For the last but not the least we just need to create by Plus Sign the measures that the Fact table will have. Same as the attributes in the dimensions: Name of the measure, Datatype and the column that will receive the data.

And that’s it. We are all set to move to the Mappings. Since this is already a huge post, I’ll stop this one now and will start a new post just for the Mappings, since I want to analyze how ODI builds the queries and loads the data there.

Hope you guys enjoy this post and see you soon.