Archive for Tips and Tricks

ODI “Command on Source” buffered behavior

Posted in ODI, Tips and Tricks with tags , on March 28, 2019 by radk00

Hi all! This post was created based on a friend’s question to me a couple of days ago. He asked me the following:

  • I know that we may create a procedure with a SQL in “Command on Source” that would return N rows and trigger an OS command in the “Command on Target” tab for each of those rows, passing the results as a parameter. My process takes a while for each row, so I was wondering if I could insert more rows in the table that is being read on “Command on Source” while it is still executing, so it would pick the new rows as well in the same execution?

In other words, he wanted to trigger the ODI procedure once and keeps “feeding” the “Command on Source” table many times, so all his OS Commands would get executed in one procedure run. Instinctively I said no, because ODI needs to somehow “buffer” the “Command on Source” results (which may be a result of a SQL statement with different tables) and then start to run the “Command on Target” commands. He agreed with me and moved on. However, that keep in my head: what would happen if the source table somehow changes while the ODI procedure is running? What if more rows were inserted or if the table was truncated/deleted? I did some tests to make sure I gave him the right answer.

I created a procedure that contains only an ODI sleep command in the Target that will be executed for each row that comes from the source, like this:

1

Then in the Source I added the following:

2

I populated this table with 20 records and I executed the proc. As expected, it took 20 seconds to complete:

3

Then I did the following test: I executed the proc and right away I inserted 20 more rows in the target. As I thought, the procedure ended again in 20 seconds, not 40, which means that ODI really buffers the results before executing the Command on Target:

4

But this is a small number and maybe ODI can buffer all of it right away and maybe that’s why it worked. I looked in the Topology and the Array Fetch Size for this connection was set to 250:

5

I did another test with 1000 rows and decreasing the wait time from 1000 to 100 to see how it goes. Both executions (with 1000 rows and then with adding more 1000 rows in between) ended in 104 seconds (the four extra seconds may be a delay due to network, ODI usage and so on):

6

So, I changed my approach and tried something different. What if I add many rows (100,000), change the delay to ‘1’ and in the meanwhile I truncate the table? The result kind of surprised me:

7

Truncate is considered a DDL command, so that’s why the error is saying that the object does not exists anymore, although it’s still there in the DB (it does not exist in that session, as it was modified by a DDL command in another session). This test was not exactly what I wanted to test and yet it does surprises me because I was expecting that ODI would have already buffered all the results (the error happened after 51 seconds only) and it would not be sensitive to certain DDL commands. However, this may indicate that ODI does not buffer it all at once and when it tried to read the table again, it was already truncated. So, let’s do another test.

Next test I doubled the row amount and inserted 200,000 and kept the delay to ‘1’ and ran the procedure. While it was running, I tried to delete the rows instead of truncating the table. The delete ran fine, it took 18 seconds to delete (less than 51 seconds from truncate) and no errors happened this time. So, it seems that ODI buffered all the results (which were doubled) before the 18 seconds.

8

After the deleting and committing, the proc continued to run and finished around 200 seconds, as expected.

9

I did one more test to see if the time that it took to fail in the truncate test would increase if I increase the number of rows. I inserted again 200K rows (again, double amount from the previous truncate test), ran the process and truncated the table. It took the same 51 seconds. So, I believe that, although ODI can buffer all the results before 51 seconds, Oracle somehow tells the process that the table was changed by a DDL command and sends a “stop” signal to the connection from a specific amount of time. I don’t see any other explanation for this behavior.

I could run some other tests, especially to see how large is the ODI buffer size, but as for now I’m ok with the results. In resume we figure out that:

  • ODI does buffer the results in the “Command on Source/Target” and we cannot modify them once it starts;
  • Although the results are buffered, the ODI procedure may fail if DDL commands are issued to the source tables;
  • DML commands does not seem to affect the buffered results, as expected;

If any of you has done some tests like this, please share with us! This kind of things are never documented, so we need to keep testing to see how they work behind the scenes.

Thanks! See you soon!

Advertisements

ORACLE SQL for EPM tips and tricks S01EP03!

Posted in Query, SQL, Tips and Tricks with tags , , on March 26, 2019 by radk00

Hi all! Continuing the Oracle SQL for EPM series, today’s post is quite simple, but it may consume an extreme amount of time when we are requested to troubleshoot “why these numbers does not match” type of scenarios. Its related to UNION and UNION ALL operations. Let me describe what happened to me in one of those situations.

The client had a table with several columns that would calculate some metrics related to their

business. It was a “cumulative” type of table, where metrics were being aggregated by each previous period’s numbers. In a very resumed way, lets use the following example:

1

So, for Feb-19, the SUM would be 150 for Account 1 and 60 for Account 2. Next month, he would get the following:

2

His logic was summing the March period in Account 1 correctly (30) and summing it to previous 150. However, since Account 2 was not coming in March, his SQL was not reporting Account 2 in March. To make the calculations easier, he decided to add a “dummy” metric for all existing Accounts as 0, so his logic would calculate it correctly even it the record did not exist for that period. Something like that:

3

The process would still give his correct value of 30 in Account 1 for March and 0 for Account 2, which would then sum against the previous periods. It all worked fine, until someday someone complained that the numbers could not be right and some numbers were missing. When I checked the code, I quickly realized his mistake: he created his “dummy” metrics using a UNION in Oracle against his periodic metric and his “dummy” metric. But why it was giving the wrong numbers? Oracle explains:

  • UNION combines the results of two queries, which eliminates duplicate selected rows. The UNION operator returns only distinct rows that appear in either result.

Let’s picture the problem. His logic worked fine for Feb and Mar, but in Apr, something like this happened:

4

If you sum Apr period for Account 1, the number should 80, but he was getting only 60 as below:

5

This is due to UNION’s behavior: It will run an implicit distinct in the combined dataset, which in this case is eliminating good data. I went ahead and changed the UNION to UNION ALL, which Oracle states:

  • The UNION operator returns only distinct rows that appear in either result, while the UNION ALL operator returns all rows. The UNION ALL operator does not eliminate duplicate selected rows.

The result is the following:

6

Now it looks correct: 80 for Account 1 and 0 for Account 2.

That’s it folks! Simple things that may give us enormous headaches and wrong numbers, so please always check out when you see an UNION in the queries! It may be implicitly omitting some good data there.

See ya!

Guest Post – Automatic data type conversion for different technologies in ODI

Posted in Guest Post, ODI, Tips and Tricks with tags , , on January 14, 2019 by radk00

/* This is a guest post written by Eduardo Zancanella, one of our friends at DEVEPM. Enjoy and thanks Eduardo for the content!*/

Hello folks,

Hope you are having a great day today.

We would like to share a quick tricky when it comes to transport data between different technologies.

Let’s suppose we were requested to create an ETL process to migrate data from PostgreSQL to Oracle.

Our source has some columns set as TEXT, which does not exist in Oracle. To be able to perform the ETL we would need to translate it to CLOB or VARCHAR2 for instance. But how would ODI knows it to create the temporary tables accurately?

Easy peasy, go to your Physical Architecture, select PostgreSQL and check if the data type is in there. If not, create it following the steps below (if it already exists, go straight to the step 3!)

1) Right click on Data Types, New Datatype

figure01

2) Fill it up the information as below, special attention to what is highlighted:

figure02

3) Click on Converted To and set to which datatype you want it to match in your target, in this case we have chose VARCHAR2.

figure03

After getting all this setup done, let’s run through our example really quick.

Firstly, let’s reverse our source and check if the TEXT fields are in there, keep in mind that we are trying to simplify, so don’t expect to see a full picture of the tool.

figure04

Secondly, let’s create our target table, be aware that here you must use the datatype you chose on the step 3 above for any fields that will be converted. After reversing it you will see as below:

figure05

At this time, our mapping is created, LKM SQL to Oracle and IKM Oracle Incremental Update have been chosen. A quick check on how the CUSTOM_2716 field looks like:

figure06

The hint SOURCE is an extra tip, the transformation for this case has to happen before the data is inserted into the temporary tables, always keep that in mind.

Time to run!

As a first step, ODI will create the C$ and here is where the magic happens:

figure07

C$ was successfully created!

After, ODI will follow its flow and everything should be fine.

That is how we can automatically convert different datatypes among different technologies.

Thank you everyone.

Cheers!

Comparing ODI Scenario versions using SQL

Posted in 11.1.1.9.0, ODI 11g, Tips and Tricks with tags , , on October 26, 2017 by radk00

Hi all, it has being a while that we don’t post! Busy days you know…. Anyway, let’s see what we got today.

The situation that I’m about to describe often happens in large/old ODI projects. Imagine the following: you received a task to change an ODI component that was created one year ago by someone else that is not even in the company anymore. The code is running fine in PROD and the business want a small fix to it. You open the ODI package and it contains a lot of interfaces, procedures, variables, etc. You need to change the code in one single interface, which seems very simple. You change it, save it, generate a new scenario and move it to PROD. When it gets there, the job fails due to an error in another interface that you did not touch! You start to troubleshoot and figure out that someone else changed something in DEV, saved it, but did not move the code to PROD. Unfortunately this ”unwanted” code change was included by you when you generated your scenario and now the mess is already created. If you already passed through this situation, than this post may help you.

Code versioning and code migration processes in general are things that everybody knows that are necessary, but sometimes they are overlooked by the companies because people think they are too complicated or does not work very well. ODI is a big example of this, since its native versioning system is not very intuitive and most of the times does not work in the way that we want to. There are companies out there that even build their own code versioning system (outside of ODI) to manage ODI code versions. I dare to say that most of the companies don’t even have any kind of code versioning or formal code migration process for ODI at all, which causes some big headaches on similar situations as the one that I just described.

The technique that I’ll explain here is not about code versioning itself. I’ll describe something that we may use when we do not have any other way to guarantee that the scenario that we are generating was not changed by someone else during a time period. Just to let you know, all the following SQL was done in ODI 11.1.1.9 version.

Let’s begin with the basics. Everything that you create in ODI is stored in SNP tables in its WORK and MASTER repositories. For this post we will focus in two main tables from the WORK repository:

  • SNP_SCEN: contains the basic information about the scenarios that exists in that WORK repository (like name, version, creation date and so on);
  • SNP_SCEN_TASK: the “main” scenario table that contains all the steps/tasks that are performed by a scenario. You may query this table and see exactly which tasks (like SQL commands, variables, flows) that scenario will perform when you run it in Operator;

So now let’s get back to our problem. There is a scenario that is running fine in Production for one year now (let’s say it calls ODI_SCENARIO Version 1_00_00) and this scenario is also in Development. I’ll make a change in only one interface (I’ll add a simple SUBSTR in a column named PACK_SLIP) of this scenario in Development and create a new version of it (ODI_SCENARIO Version 2_00_00). How do I guarantee that my code change was the only thing that changed in this new scenario and that it does not contain any other code from other developers? The answer lies on the SNP_SCEN_TASK table.

If you go to ODI WORK repository in Production, you may run the following query to get all the steps that the scenario is currently executing on its 1_00_00 Version:

SELECT NNO,
SCEN_TASK_NO,
TASK_TYPE,
TASK_NAME1,
TASK_NAME2,
TASK_NAME3,
EXE_CHANNEL,
DEF_CONTEXT_CODE,
DEF_LSCHEMA_NAME,
DEF_CONNECT_ID,
DEF_IND_COMMIT,
DEF_ISOL_LEVEL,
DEF_PLAN_COMP,
COL_CONTEXT_CODE,
COL_LSCHEMA_NAME,
COL_CONNECT_ID,
COL_ISOL_LEVEL,
COL_IND_COMMIT,
COL_PLAN_COMP,
ORD_TRT,
IND_ERR,
LOG_LEV_DET,
IND_LOG_NB,
DEF_TECH_INT_NAME,
COL_TECH_INT_NAME,
IND_LOG_METHOD,
COL_TXT,
COL_IND_ENC,
COL_ENC_KEY,
DEF_TXT,
DEF_IND_ENC,
DEF_ENC_KEY,
IND_LOG_FINAL_CMD
FROM SNP_SCEN_TASK
WHERE SCEN_NO IN
(SELECT SCEN_NO
FROM SNP_SCEN
WHERE SCEN_NAME = 'ODI_SCENARIO'
AND SCEN_VERSION = '1_00_00')
ORDER BY SCEN_TASK_NO,NNO;

2017-10-25_17-17-51

There are a lot of important columns in this table that can give you a lot of valuable information. However, COL_TXT and DEF_TXT are generally the most important ones since they contain the code that is generated in the “Source and Target tabs” inside the procedures and interfaces. After you run this SQL in Production environment, you may export it to whatever you like. In this example here, I’ll export it as “Text” using Oracle SQL Developer as the following (Right click on any row and select “Export”):

2017-10-25_17-21-19

Save it somewhere in your computer:

2017-10-25_17-22-04

The result will be something like this:

2017-10-25_17-24-01

Now let’s run the SQL in the Development ODI WORK repository. The only thing that we will change now is our filter that will go from SCEN_VERSION = ‘1_00_00’ to SCEN_VERSION = ‘2_00_00’, which is the new scenario version that we just generated. Do the same steps as the Production SQL and you should end up with something like this:

2017-10-25_17-27-19

Now you need to compare both codes. I like to go simple and use Notepad ++ with “Compare” plugin. You may use any other tool for comparing txt files (Beyond Compare is awesome as well). In Notepad ++ you just need to open both files, click on the Production file and “Set the First Compare”, then click on de Development file and “Compare”.

2017-10-25_17-31-42

2017-10-25_17-32-54

You will have something similar to this when you compare:

2017-10-25_17-34-24

The “Compare NavBar” shows a lot of differences, way more than the one that I just did. However, we need to analyze it calmly to verify what do they really mean. You may navigate thought the changes using the “Next” button in the tool bar.

2017-10-25_17-43-00

There will be some blocks of code that contains “similar differences” due to the nature of ODI. For example, when you change one single thing in one column of one interface, it will be reflected in several steps within the Knowledge Module (in C$/I$/E$ creation for example). This is one example of it:

2017-10-25_17-49-46

This change is saying that we changed the order of PACK_SLIP column (which was the column that we added a SUBSTR command). Actually we didn’t change the order, but we changed its content. However, when ODI create its temporary tables (like C$, I$ and E$) we cannot control the order that they are going to be created as the code is generated automatically by ODI. So we don’t need to worry about this change, as it was somehow “expected”. When we click “Next”, we are going to have similar ones where the column just changed its order. Continuing further down, we will get to the place where our change occurred:

2017-10-25_17-58-43

Cool, this is the place that we changed our code and it looks good. Let’s keep going to see what else has changed. Now we will get something weird. A lot of “1” changes just appeared until the end of the file (explaining why we had a lot of changes in the comparison Navigation Bar):

2017-10-25_17-59-56

This “1” comes from IND_LOG_FINAL_CMD column, which identifies if the step should “Log Final Command” or not. This does not affect the code itself, but for the sake of my analyses I went to the KM to see if someone had changed this option:

2017-10-25_18-06-42

My suspicious was right and someone changed this option in one of the KMs, which got reflected in a lot of places in my ODI scenario. There was no more changes in my comparison, so I could conclude that:

  • PACK_SLIP changed the order in some temporary tables creation, which is ok;
  • I saw my PACK_SLIP mapping change (SUBSTR) in the Development code;
  • There was a change in the KM to “Log Final Command” in a specific KM step, which is also ok and does not affect the code itself;

No more differences were found between the scenarios, so I may safely deploy it to production. If someone else had changed something more critical, the compare method would have catch that and we could revert it back before moving to Production.

There are other ways for you to get and compare the codes, like if both scenarios are in the same DB you could just run two SQLs and compare them or you could export both XML scenario files and compare those, but this post here gives you a generic way that can be done in most of the cases and it is fairly easy to be used.

That’s it guys, I hope you have enjoyed!

Kscope 17 is approaching fast!!! And we’ll be there!

Posted in ACE, Data Warehouse, Essbase, Hyperion Essbase, Java, Kscope 17, ODI, ODI Architecture, Oracle, Performance, Tips and Tricks, Uncategorized with tags , , , , , , , , on June 8, 2017 by RZGiampaoli

Hi guys how are you? We are sorry for being away for so much time but this year we have a lot of exiting things going one, then let’s start with what we’ll be doing at Kscope 17!

This year we’ll present 2 sessions:

Essbase Statistics DW: How to Automatically Administrate Essbase Using ODI (Jun 28, 2017, Wednesday Session 12 , 9:45 am – 10:45 am)

In order to have a performatic Essbase cube, we must keep vigilance and follow up its growth and its data movements so we can distribute caches and adjust the database parameters accordingly. But this is a very difficult task to achieve, since Essbase statistics are not temporal and only tell you the cube statistics is in that specific time frame.

This session will present how ODI can be used to create a historical statistical DW containing Essbase cube’s information and how to identify trends and patterns, giving us the ability for programmatically tune our Essbase databases automatically.

And…

Data Warehouse 2.0: Master Techniques for EPM Guys (Powered by ODI)  (Jun 26, 2017, Monday Session 2 , 11:45 am – 12:45 pm)

EPM environments are generally supported by a Data Warehouse; however, we often see that those DWs are not optimized for the EPM tools. During the years, we have witnessed that modeling a DW thinking about the EPM tools may greatly increase the overall architecture performance.

The most common situation found in several projects is that the people who develop the data warehouse do not have a great knowledge about EPM tools and vice-versa. This may create a big gap between those two concepts which may severally impact performance.

This session will show a lot of techniques to model the right Data Warehouse for EPM tools. We will discuss how to improve performance using partitioned tables, create hierarchical queries with “Connect by Prior”, the correct way to use multi-period tables for block data load using Pivot/Unpivot and more. And if you want to go ever further, we will show you how to leverage all those techniques using ODI, which will create the perfect mix to perform any process between your DW and EPM environments.

These presentations you can expect a lot of technical content, some very good tips and some very good ideas to improve your EPM environment!

Also I’ll be graduating in this year leadership program and this year we’ll be all over the place with the K-Team, a special team created to make the newcomers fell more welcome and help them to get the most of the kscope.

Also Rodrigo will be at Tuesday Lunch and Learn for the EPM Data Integration track on Cibolo 2/3/4.

And of course we will be around having fun an gathering new ideas for the next year!!!

And the last but not least, this year we’ll have a friend of us making his first appearance at Kscope with the presentation OBIEE Going Global! Getting Ready for More Than +140k Users (Jun 26, 2017, Monday Session 4 , 3:15 pm – 4:15 pm).

A standard Oracle Business Intelligence (OBIEE) reporting application can hold more or less 1,200 users. This may be a reasonable number of users for the majority of the companies out there, but what happens when an IT leader like Dell decides to acquire another IT giant like EMC and all of their combined 140,000-plus users need to have access to an HR OBIEE instance? What does that setup looks like? What kind of architecture do we need to have to support those users in a fast and reliable way?
This session shows the complexity of Dell’s OBIEE environment, describing all processes and steps performed to create such environment, meeting the most varied needs from business demands and L2 support, always aiming to improve environment stability. This architecture relies on a range of different technologies to support that huge amount of end users such as LDAP & SSL, Kerberos, SSO, SSL, BigIP, Shared Folders using NAS, Weblogic running into a cluster within #4 application servers.
If the challenge was not hard enough already, all of this setup also needed to consider Dell’s legacy OBIEE upgrade from v11.1.1.6.9 to v11.1.1.7.160119, so we will explain what were the pain points, considerations and orchestration needed to do all of this in parallel.

Thank you guys and see you there!

kscope17logo-pngm

OTN Article: Building a 100% Cloud Solution with Oracle Data Integrator

Posted in ACE, ArchBeat, BICS, DBCS, DEVEPM, EPM, EPM Automate, InfraStructure, ODI, ODI 11g, ODI Architecture, Oracle, Oracle Database, OS Command, OTN, PBCS, Tips and Tricks with tags , , , , , , , , , , , , on January 23, 2017 by RZGiampaoli

Hi guys how are you? Today I want to share our new OTN article Building a 100% Cloud Solution with Oracle Data Integrator.
The article will cover how to integrate BICS, PBCS, DBCS and ODI and will explain step by step how to create a 100% cloud solution using ODI (everything on the cloud including ODI :)).

This is a perfect article for companies that are thinking to go cloud and have some doubts or even are thinking how you can integrate/use your actual infrastructure with the cloud services.

I hope you guys enjoy and see you soon.

ODI 12c new features: Dimension and Cubes! Part 4 (Loading using Surrogate Keys)

Posted in Dimensions, ETL, ODI 12c, ODI Architecture, ODI Mapping, Oracle, Tips and Tricks with tags , , , , , on December 16, 2016 by RZGiampaoli

Hi guys how are you?

Today we’ll continue the dimension and cubes series (Part 1, Part2 and Part 3 here) and we’ll see how to load data using Surrogate keys.

After all the setting done in the last post, now the only thing left is to create the interfaces and map everything. For the Surrogate keys, the interface and the mapping are exactly the same as for no-surrogate version (as we can see in the previous posts) for both, dimensions and facts, what’s very nice.

times-surrogate-interfaceThe interesting here is what he does behind the scenes. In the no-surrogate version ODI created one mapping for each hierarchy and in the end it merged everything together inside a table.

no-surrogate-time-operatorFor the Surrogate key version, ODI also generates one mapping for each hierarchy but the main difference is that after each one he merges it witch the others. This happens because he needs to get the surrogate key for each level.

time-surrogate-operator

For each level ODI automatically generates an insert into that level stage table verifying if all the columns does not exists in the target table (He does that to decrease the amount of data for the merge step since merge would insert or update everything and would take more time than necessary).

After the stage table is loaded the next step is to merge the stage table to the target table, and for that ODI just create a “Merge”: when match he updates the descriptions or attributes and when doesn’t match it inserts the new rows with the sequences for the SK.

In the next level of the hierarchy ODI repeats the process but joining the Year with the Quarter. ODI will keep doing this for each level mapped until the last one, where instead of having a merge with matches and not matches, he just do a merge with Matches (since he know everything will already be there).

The results will be this:

time-surrogate-table-results

It’s nice that ODI already creates the dimension thinking in an aggregated fact since we can see that he has some rows just with the year, other with the year and quarters and the last one with all the information.

One thing to notice is that the PK is the same as the Month SK. This is because ODI is ready to create SCD type 2 (we’ll do another post to show how it works).

For the fact, the mapping will still be the same as the No-surrogate version and again the difference will be in the results.

fact-surrogate-interface

We can see that in the operator ODI does something really neat this time.

fact-surrogate-operator

MERGE INTO EPM_HPT_ODI_RUN.S_FACT FACT_SURROGATE1_FACT_SURROGATE USING
(SELECT TIME_SURROGATE_FACT_SURROGAT_1.MONTH_SK AS ID_TIME ,
PRODUCT_SURROGATE_FACT_SURRO_1.PRODUCT_SK AS ID_PRODUCTS ,
REGIONS_SURROGATE_FACT_SURRO_1.CITY_SK AS ID_REGIONS ,
SRC_ERP.SALES AS METRIC
FROM ((EPM_HPT_ODI_RUN.SRC_ERP SRC_ERP
LEFT OUTER JOIN
(SELECT TIME_SURROGATE_FACT_SURROGATE.ID_MONTH AS ID_MONTH ,
TIME_SURROGATE_FACT_SURROGATE.MONTH_SK AS MONTH_SK ,
TIME_SURROGATE_FACT_SURROGATE.TIME_PK AS TIME_PK
FROM EPM_HPT_ODI_RUN.S_TIME TIME_SURROGATE_FACT_SURROGATE
WHERE ((TIME_SURROGATE_FACT_SURROGATE.TIME_PK = TIME_SURROGATE_FACT_SURROGATE.MONTH_SK)
AND (TIME_SURROGATE_FACT_SURROGATE.MONTH_SK IS NOT NULL) )
) TIME_SURROGATE_FACT_SURROGAT_1
ON (SRC_ERP.ID_MONTH = TIME_SURROGATE_FACT_SURROGAT_1.ID_MONTH) )
LEFT OUTER JOIN
(SELECT PRODUCT_SURROGATE_FACT_SURROGA.ID_PRODUCT AS ID_PRODUCT ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK AS PRODUCT_SK ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK AS PRODUCTS_PK
FROM EPM_HPT_ODI_RUN.S_PRODUCTS PRODUCT_SURROGATE_FACT_SURROGA
WHERE ((PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK = PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK)
AND (PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK IS NOT NULL) )
) PRODUCT_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_PRODUCT = PRODUCT_SURROGATE_FACT_SURRO_1.ID_PRODUCT) )
LEFT OUTER JOIN
(SELECT REGIONS_SURROGATE_FACT_SURROGA.ID_CITY AS ID_CITY ,
REGIONS_SURROGATE_FACT_SURROGA.CITY_SK AS CITY_SK ,
REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK AS REGIONS_PK
FROM EPM_HPT_ODI_RUN.S_REGIONS REGIONS_SURROGATE_FACT_SURROGA
WHERE ((REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK = REGIONS_SURROGATE_FACT_SURROGA.CITY_SK)
AND (REGIONS_SURROGATE_FACT_SURROGA.CITY_SK IS NOT NULL) )
) REGIONS_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_CITY = REGIONS_SURROGATE_FACT_SURRO_1.ID_CITY)
) MERGE_SUBQUERY ON ( FACT_SURROGATE1_FACT_SURROGATE.ID_TIME = MERGE_SUBQUERY.ID_TIME AND FACT_SURROGATE1_FACT_SURROGATE.ID_PRODUCTS = MERGE_SUBQUERY.ID_PRODUCTS AND FACT_SURROGATE1_FACT_SURROGATE.ID_REGIONS = MERGE_SUBQUERY.ID_REGIONS )
WHEN NOT MATCHED THEN
INSERT
(
ID_TIME ,
ID_PRODUCTS ,
ID_REGIONS ,
METRIC
)
VALUES
(
MERGE_SUBQUERY.ID_TIME ,
MERGE_SUBQUERY.ID_PRODUCTS ,
MERGE_SUBQUERY.ID_REGIONS ,
MERGE_SUBQUERY.METRIC
)
WHEN MATCHED THEN
UPDATE SET METRIC = MERGE_SUBQUERY.METRIC

He automatically joins all our dimensions at level zero (since we have the dimensions in the higher levels for the aggregated fact) to get the surrogate key information and use it in the fact table. This is very nice because in large DWs we’ll have tons of dimensions, and map/join everything is very time consuming. The final results is this:

fact-surrgoate-sql-results

A perfect DW created using surrogate key, in other words, instead of having the dimensions PKs in the fact table we have the SKs (that ware generated by a sequence in the dimensions).

In resume, we think that if you going to create simple dimensions and simple facts (without surrogate key or SCD type 2) it’s still nice to use this new feature since it’s a nice way to document and standardize your DW, but if we measure by development time it’s not worthy since it’s very time consuming for simple DW.

Now, if you want to create a DW using surrogate keys or SCD type 2 we found this new feature extremely useful for both, documentation and standardizations and because is a lot faster than do manually.

Thanks and see you soon.