Archive for the ETL Category

ORACLE SQL for EPM tips and tricks S01EP05!

Posted in ACE, Connect By, DEVEPM, Dimensions, EPM, ETL, Oracle, Oracle 11.2.0, Oracle 11.2.0.4, Oracle Database, Query, REGEXP, SQL, Tips and Tricks with tags , , , , on May 3, 2019 by RZGiampaoli

Continuing the Oracle SQL for EPM series today we’ll see another usage of Connect by. I’m talking a lot about connect by because is a very useful function and we should use it a lot in the analytic space.

Let’s imagine that we are working with a planning application and we want to figure out in a query what is the region of each cost center and to what top level that cost centers belong. In this app we have the Entity dimension with Support Geography as a attribute.

The support geography has the region on the generation 2 of the hierarchy. How can we do that. Well, connect by to the help.

First of all we need to get the entire physical geography hierarchy. To do that we’ll do a query like this:

This is a simple query that joins the HSP_ATTRIBUTE_MEMBER table (this table contains all the attributes from all attribute dimensions) that’s why we have that sub-query to select just the Support Geography members (1).

Then we join this table with the HSP_OBJECT that contains all metadata from everything in Planning: Forms, Dimensions, Flows, everything. We do that to get the name of the members. And finally we do a connect by to rebuild the hierarchy (2).

With the hierarchy re-built we can use the Connect by root to figure out to witch top level each member belongs. (3)

With this query we know what’s the leaf member of the Support Geography dimensions (ID) and to what Region that member belong. Now we need to do the same for the Entity dimension.

With this query we are filtering the OBJECT_TYPE=33, that means Entity (1) (If you want to know what are the possible object types you can query the HSP_OBJECT_TYPE table to check that out), and then we do a Connect By to re-built the hierarchy and then use some Connect by Root to get the parents as well the Sys connect By Path to get the Path of the hierarchy.

Now we just need to join everything under the same query to get all the Cost Centers and to witch top level each one belongs and to witch region they are part of as well.

Another thing that I like a lot about the WITH clause is that is very easy to create nested queries. In this case I created 2 different queries, each one with a WITH and a select based in that WITH. Now everything I need to do is put everything under the same with by:

We can see that now I just got the Select that was under the WITH clause and just created a nested WITH by just creating the step 1 and 2 and now I have instead of 2 WITH queries I have just 1 WITH with 4 Clauses under it being 2 of then a nested one based in the previous one (Connect by from the filtered query).

All we need to do is putting everything together by joining the ENT_HIER and the SG_HIER using the HSP_MEMBER_TO_ATTRIBUTE table. This table basically is the map between the ENTITY dimension and the Attribute dimensions, in this case Support Geography.

The results is the Cost Center, the Path of that member in the hierarchy and with this we can use the REGEXP to extract any level of the hierarchy, and finally the region that cost center belong.

You may thing why we need to do that. well, this is a query I used to join with the HSP_GROUP table to get the groups and the members from each region and then create the security dynamically for each user. That means, I have a Planning application with Entities that has cost centers from different regions and I’m using the Attribute dimension to generate the security.

If a User has access only to AMER data, it’ll only see the AMER cost centers. This is just one example of what we can achieve using Connect by.

I hope you guys enjoy. Next time we’ll talk about another very very useful function that I really love it.

See you soon guys.

Advertisements

ORACLE SQL for EPM tips and tricks S01EP01!

Posted in DEVEPM, ETL, Oracle, Oracle Database, Performance, SQL, Tips and Tricks, Uncategorized, WITH Clause with tags , , , , , , on January 21, 2019 by RZGiampaoli

Hey guys how are you? I decide to start a new series called ORACLE SQL for EPM tips and tricks. The idea here is to show the most useful SQL commands for EPM, how to improve performance, tips, tricks and everything that can be useful from a SQL point of view!

And to start well, I’ll show something very old but very useful that I don’t see too many people using these days. “WITH” clause.

I love to use “WITH” in my code. It helps organize the code, helps to optimize it and more important, to make it more efficient.

When you use “WITH” Oracle treats your query inside it as an inline view or resolved as a temporary table, making it easier and faster for Oracle to access that data if you need it again.

Simply putting, every time you needs to right a query that uses the same table over and over, it’ll probably be way more efficient if you use “WITH”.

The “WITH”clause works a little bit different from a regular SQL. We can say that we split the query in 2, one is the “WITH” declaration (That will behave like a table) and the other is the SQL that will Query the “WITH”.

WITH name_of_temp_table_here AS
(
    YOUR QUERY HERE
),
   name_of_temp_table_here2 AS
(
   SELECT *
   FROM name_of_temp_table_here, another_table...
)
SELECT *
FROM name_of_temp_table_here, name_of_temp_table_here2 

In the “WITH” you can have any kind of query you want. You can do joins, group by, you can also have more than one “WITH”, you can use the result of one “WITH” in the next “WITH”, you can do a lot of things.

But for now, lets take a look in a more real example. Let’s say that you have a table like I do, that contains all metadata from all yours applications:

Let’s say you want to get the Parent of a attribute that is associated with your Entity dimension. You probably will be doing something like this:

In the “FROM” we call the table 2 times and we join and filter everything we need. Since we don’t have attribute association in all levels we do a “Left Join” to make sure all Entities comes in the query. If we run a Explain Plan now we’ll get something like this:

As you can see, Oracle is querying the METADATA_EXTRACT table twice and each time it’s doing a FULL in one Partition (‘ENTITY’ and ‘PHYSICAL_GEOGRAPHY’ partitions).

Now, if we change the query (and we can do it in different ways, this is just one of them) to a “WITH” clause we ‘ll have something like this:

As you can see, we achieved the same results with the code a little bit different. Now I have all my filters in the “WITH” query and in the bottom I just call the “WITH” query 2 times and do what needs to be done.

If we run a Explain Plain now we will have:

As you can see, Oracle now is querying the METADATA_EXTRACT table just once and then his queries the SYS.SYS TEMP table twice. The only problem with this query and the way I did is that since we are creating a temporary table filtering 2 partitions and then later I’m filtering again, it’s basically doing 2 FULL scan in the same TEMP table, and even so, it’s a few seconds faster then the original query.

But this is just an example on how we can reduce the amount of times that Oracle needs to query a table. WITH is not a miracle clause or anything like that, is just another tool that we can use to optimize our code, and its performance needs to be evaluated in a case-by-case basis.

And even if the performance doesn’t change, I believe using “WITH” clause makes any query easier to ready, to test, to update and to right since you can divide your huge query in small bits and then join
everything in the bottom query.

“WITH” is a huge subject and we’ll be talking more about it in the next post, and this time we’ll be improving performance as well using “WITH” with “CONNECT BY”.

Playing with ODI and Groovy – Part 1 – Getting things ready

Posted in ETL, GROOVY, Java, ODI, ODI SDK with tags , , , on January 8, 2019 by radk00

Hi all, how are you doing? It has been a long quiet period here in the blog and the reason is always the same: too much work, projects, personal things and so on. To “force” myself in getting some time to write in the blog (while I still have the “new year” feeling), I’ll start this series of ODI and Groovy development. Not sure how many posts I’ll write, but it will be a step by step on how to create your own ODI utilities using Groovy scripts. We will start from looking on the necessary tools that we will need to use, and the final goal is to have an ODI utility to solve a specific ODI developers’ problem. Let’s start then.

So, what is the problem that we are trying to solve?

ODI developers knows that, besides all their project’s problems, they need to deal with boring/repetitive/error prone daily activities that are often underestimated by people and that may cause big issues over time in large ODI environments. One of those simple (yet boring) tasks is to keep ODI environments in sync regarding to ODI scenarios. How to make sure that all ODI scenarios in my DEV environment were already migrated to TEST or PROD? What should have been deployed but was not? How can I see a list of those differences and figure out who/when that was done?

Almost every time that I need to answer one of those questions, I go to ODI repository metadata tables and start to write down queries to get the necessary information and compare them between the different environments. Although it works, it is time consuming, its manual and I need to have access to read the ODI metadata tables, which is not possible in a lot of places due to security reasons.

So, thinking about all that, I decided to create my own ODI utility that can connect to different ODI repositories, compare what is different between them and deploy any missing scenario that I wish to deploy. Although the idea sounds simple, it’s a pretty useful tool that may save us a lot of time and it can be reused in any project that you work on. Also, it can serve as a base for you to create any ODI utility that you may want to, so you can make your job more productive and automate all the boring/manual tasks.

Ok, you have convinced me. So, what do I need to get it done?

As the title of this post mentions, you will need ODI SDK libraries (they come as part of ODI install), Groovy/Java and a tool to write down your code. I choose Eclipse IDE because I was more familiar with the tool from my past Java developer days, but you can use anything that you want. In fact, ODI already comes with a Groovy editor that you could use, however it is a very basic editor that won’t give you a lot of the cool stuff that all those modern Groovy/Java IDEs can provide to you, like code completion, automatic library imports and so on.

You mentioned Groovy/Java. Why?

That’s a good question, since some people get confused about those two languages. As I’ve written before, in the way past, I used to adventure myself in Java code as a developer, which got very limited to a few scripts now and then once I started to work with data integration. At first, I thought about creating the utilities all in pure Java (due to my background), but ODI already comes with Groovy support, so I decided to look at it. Although I knew what Groovy concept was, I had never worked with it, so I had to do some study before start dealing with it.

Groovy is (a resume from its site): a powerful, optionally typed and dynamic language, with static-typing and static compilation capabilities, for the Java platform aimed at improving developer productivity thanks to a concise, familiar and easy to learn syntax. It integrates smoothly with any Java program, and immediately delivers to your application powerful features, including scripting capabilities, Domain-Specific Language authoring, runtime and compile-time meta-programming and functional programming.

Some key points that we should take from this resume and that drove my decision to use Groovy was:

  • Simplicity and Dynamicity: you can write the same code in Groovy compared to Java with less line codes, so it gets faster for you to code and read. Since its simpler, its great for writing concise and maintainable automation tasks/scripts (which is our goal here).
  • Smooth Java integration: Seamlessly and transparently integrates and interoperates with Java and any third-party libraries, which means that it is very easy for Java developers to learn and use Groovy.

There are other benefits/drawbacks about using Groovy over pure Java, but since ODI supports Groovy and its simpler to code (specially to create small automation scripts), those seemed good compelling reasons for me to use it.

Installing Eclipse with Groovy support

I’m going to describe here the steps to have Eclipse working with Groovy, so if you are using another IDE or even coding directly in ODI Groovy editor, you may skip this part. Installing Eclipse (Eclipse IDE for Java Developers) is very straight forward. You just need to go to Eclipse site and install it. However, Eclipse is aimed for Java development, not Groovy, which needs additional steps to get it installed in Eclipse. So, I read this site and replicated step 3, installing the “Groovy Eclipse plug-in” manually. I also executed step 6 just to make sure that my Eclipse/Groovy install was correct.

post1

Connecting to ODI

Let’s create a script that simply connects to an existing ODI instance, just to validate our Eclipse/Groovy/ODI SDK installation. First, we need to import the necessary Jar files to our Groovy project in Eclipse. Right click and select “Build Path/Configure Build Path”:

post2

On Libraries, select “Add External Jars”:

post3

There are some Jars that you will need to import to make it work. Here is the list:

  • Go to “Path to your ODI install\odi\sdk\lib” and import all Jar files from that folder;
  • Go to “Path to your ODI install\oracle_common\modules\oracle.jdbc” and import ojdbc8.jar from there;
  • Go to “Path to your ODI install\oracle_common\modules” and import all javax* jar files. Those are only needed to clear some weird warning messages that appears when connecting to ODI repository using Eclipse;

Now one important step that needs to be done if you are using Eclipse and ODI SDK Jar files. Once you import the above list, click on “Groovy Libraries” and click in “Remove” as below:

post4

This “removal” will remove the Groovy Libraries that were added as part of Groovy plugin install that we did before.  This removal is needed because ODI SDK libraries already contains Groovy libraries and they may conflict if they are in different versions. Below is an example of what happens if you don’t do this removal step in Eclipse.

post6

The code to connect o an ODI instance is very simple as we can see below. It imports a few libraries, create some variables that will be used as the login information and gets authenticated in the Master/Work repository.


import java.util.logging.Logger;
import java.util.logging.Level;
import oracle.odi.core.OdiInstance;
import oracle.odi.core.config.MasterRepositoryDbInfo
import oracle.odi.core.config.OdiInstanceConfig
import oracle.odi.core.config.PoolingAttributes
import oracle.odi.core.config.WorkRepositoryDbInfo
import oracle.odi.core.security.Authentication

logger = Logger.getLogger("oracle.jdbc");
logger.setLevel(Level.SEVERE);

sourceUrl = "jdbc:oracle:thin:@YOUR_SERVER_INFO";
driver = "oracle.jdbc.OracleDriver";
sourceSchema = "DEV_ODI_REPO";
sourceSchemaPwd = "XXXXXXXX"
sourceWorkrep = "WORKREP";
sourceOdiUser = "XXXXXXXX";
sourceOdiUserPwd = "XXXXXXXX";
sourceMasterInfo = new MasterRepositoryDbInfo(sourceUrl, driver, sourceSchema, sourceSchemaPwd.toCharArray(), new PoolingAttributes());
sourceWorkInfo = new WorkRepositoryDbInfo(sourceWorkrep, new PoolingAttributes());

sourceOdiInstance = OdiInstance.createInstance(new OdiInstanceConfig(sourceMasterInfo, sourceWorkInfo));
sourceAuth = sourceOdiInstance.getSecurityManager().createAuthentication(sourceOdiUser, sourceOdiUserPwd.toCharArray());
sourceOdiInstance.getSecurityManager().setCurrentThreadAuthentication(sourceAuth);

println("Connected to ODI! Yay!")

When we execute the code (Run/Run As/Groovy Script), we can see that it connects successfully to our ODI instance. You may also decrease the ODI log level if you don’t wish so many details, but as for now, I’ll leave it as is.

post5

That’s it folks for our first post. Next one I’ll talk about how to get all ODI scenarios, load plans and folders and display it in a tree component, similarly on what we have in ODI Operator.

See ya!

DEVEPM on Oracle Magazine

Posted in ACE, Career, DEVEPM, ETL, ODI with tags , , on November 16, 2017 by radk00

Hi all!

Very quick post today! Rodrigo was interviewed by Oracle Magazine for the November/December issue. Check it out!

DEVEPM on Oracle Magazine

Thanks!

ODI KMs for HFM 11.1.2.4

Posted in 11.1.1.9.0, ACE, Configuration, DEVEPM, ETL, Hacking, HFM, Knowledge Models, ODI, ODI 11g, ODI Architecture, Uncategorized with tags , , , , , on March 3, 2017 by RZGiampaoli

Hi guys how are you? Today we are proud to announce that we are making available the ODI KMs for HFM 11.1.2.4.

—- EDITED on June/17 —-

We developed these KMs around 6 months ago, but we were waiting to release them together with an article that we wrote for Oracle.

Since OTN had some “Priority changes”, our article was postponed to later this year. As we had some people asking for these KMs we decide to release the KMs now and when the article is published we will let you guys know as well.

The article is live here! And if you guys are having errors with our KMs, please check our troubleshooting post here.

—- EDITED on June/17 —-

Prior to version 11.1.2.4, ODI could be easily used for HFM integration processes. ODI used its KMs with specific HFM drivers (HFMDriver.dll) provided by Oracle that were used to access and manipulate HFM applications. However, on HFM’s latest version, Oracle decided to remove its support for ODI, meaning that all HFM integrations would have to move from ODI to either manual iteration with HFM, usage of another integration tool (Like FDMEE) or create custom code using the new Java HFM API.

Since we didn’t want to re-write all our ODI environment and also none of the above options are robust enough, we decided to recreate the ODI KMs using Java HFM API. For these KMs to work we need to do two things: import them from ODI Java Net and do some setup in the ODI agent.

In the article we explain all options and how do we came up with this solution, but here we will not talk about it since we want you guys to read our article as well and we can’t use the content of the article here since we already signed an exclusivity agreement with Oracle.

The first part is easy and you just need to download the files from the link below

ODI KMS for HFM 11.1.2.4

The second one is more difficult. We need to make the new HFM Jars available to the ODI Agent and in order to do so we have two options:

Install the agent in the HFM machine OR copy the necessary jar files to the agent drivers folder (oracledi\agent\drivers).

If your architecture allows to have both HFM and ODI agent in the same server, then you may use this approach, which is very simple. The only thing to do is to change odiparams file (oracledi\agent\bin\odiparams.bat file in a standalone agent) and add the location of those three HFM jar files. Open odiparams.bat file and search for “ODI_ADDITIONAL_CLASSPATH”. On that setting, just set the location of the HFM jar files, as below (this is just an example. Please adjust the path accordingly to your environment):

set ODI_ADDITIONAL_CLASSPATH=%ODI_ADDITIONAL_CLASSPATH%;

“D:\Oracle\Middleware\EPMSystem11R1\common\jlib\11.1.2.0\epm_j2se.jar”;

“D:\Oracle\Middleware\EPMSystem11R1\common\jlib\11.1.2.0\epm_thrift.jar”;

“D:\Oracle\Middleware\EPMSystem11R1\common\jlib\11.1.2.0\epm_hfm_server.jar”

Save the file, restart the ODI agent and it is done

If you decide to go with the second option, we’ll provide a list of all the necessary jars (be prepared… it’s huge). In the article we explain how to identify all the necessary jar files in a systematic way but here this is not an option as explained before.

Search for all the Jars in the below list and copy all of them under oracledi\agent\drivers folder.

adm.jar
admaps.jar
admodbo.jar
ap.jar
ArtifactListing.jar
audit-client.jar
axiom-api-1.2.10.jar
axiom-impl-1.2.10.jar
axis-ant.jar
axis-jaxrpc-1.2.1.jar
axis.jar
axis2-adb-1.5.4.jar
axis2-kernel-1.5.4.jar
axis2-transport-http-1.5.4.jar
axis2-transport-local-1.5.4.jar
backport-util-concurrent.jar
broker-provider.jar
bsf.jar
castor-1.3.1-core.jar
castor-1.3.1.jar
com.bea.core.apache.commons.collections_3.2.0.jar
com.bea.core.apache.commons.net_1.0.0.0_1-4-1.jar
com.bea.core.apache.commons.pool_1.3.0.jar
com.bea.core.apache.log4j_1.2.13.jar
com.bea.core.apache.regexp_1.0.0.0_1-4.jar
com.bea.core.apache.xalan_2.7.0.jar
com.bea.core.apache.xml.serializer_2.7.0.jar
com.oracle.ws.orawsdl_1.4.0.0.jar
commons-cli-1.1.jar
commons-codec-1.4.jar
commons-compress-1.5.jar
commons-configuration-1.5.jar
commons-dbcp-1.4.0.jar
commons-discovery-0.4.jar
commons-el.jar
commons-fileupload-1.2.jar
commons-httpclient-3.1.jar
commons-io-1.4.jar
commons-lang-2.3.jar
commons-validator-1.3.1.jar
cpld.jar
css.jar
cssimportexport.jar
ctg.jar
ctg_custom.jar
dms.jar
epml.jar
epm_axis.jar
epm_hfm_web.jar
epm_j2se.jar
epm_jrf.jar
epm_lcm.jar
epm_misc.jar
epm_stellant.jar
epm_thrift.jar
essbaseplugin.jar
essbasestudioplugin.jar
ess_es_server.jar
ess_japi.jar
fm-actions.jar
fm-adm-driver.jar
fm-web-objectmodel.jar
fmcommon.jar
fmw_audit.jar
glassfish.jstl_1.2.0.1.jar
hssutil.jar
httpcore-4.0.jar
identitystore.jar
identityutils.jar
interop-sdk.jar
jacc-spi.jar
jakarta-commons.jar
javax.activation_1.1.jar
javax.mail_1.4.jar
javax.security.jacc_1.0.0.0_1-1.jar
jdom.jar
jmxspi.jar
jps-api.jar
jps-common.jar
jps-ee.jar
jps-internal.jar
jps-mbeans.jar
jps-unsupported-api.jar
jps-wls.jar
js.jar
json.jar
jsr173_1.0_api.jar
lcm-clu.jar
lcmclient.jar
LCMXMLBeans.jar
ldapbp.jar
ldapjclnt11.jar
libthrift-0.9.0.jar
log4j-1.2.14.jar
lucene-analyzers-1.9.1.jar
lucene-core-1.9.1.jar
lucene-spellchecker-1.9.1.jar
neethi-2.0.4.jar
ojdbc6dms.jar
ojdl.jar
opencsv-1.8.jar
oraclepki.jar
org.apache.commons.beanutils_1.8.3.jar
org.apache.commons.digester_1.8.jar
org.apache.commons.logging_1.1.1.jar
osdt_cert.jar
osdt_core.jar
osdt_xmlsec.jar
quartz.jar
registration_xmlBeans.jar
registry-api.jar
resolver.jar
saaj.jar
scheduler_ces.jar
servlet-api.jar
slf4j-api-1.5.8.jar
slf4j-log4j12-1.5.8.jar
sourceInfo.jar
stax-api-1.0.1.jar
wf_ces_utils.jar
wf_eng_agent.jar
wf_eng_api.jar
wf_eng_server.jar
wldb2.jar
wlpool.jar
wlsqlserver.jar
wsplugin.jar
xbean.jar
xmlparserv2.jar
xmlpublic.jar
xmlrpc-2.0.1.jar
XmlSchema-1.3.1.jar

Restart the ODI agent and it should be ready to execute any HFM Java code inside of ODI.

I know that this is a lot of jars and will take some time to find all of them but at least you’ll be able to upgrade you HFM and still use the same interfaces you have today in ODI to manage HFM (just remember to use the new data store objects reversed from the new RKM).

The KM usage is very similar to the old ones and we had the instructions in all its options so we’ll not explain then here (just in the article). The only important difference is on how to setup the “Cluster (Data Server)” information on Data Server (Physical Architecture). For the new HFM API, we need to inform two new settings: Oracle Home and Oracle Instance Paths. Those paths are related to the server where your HFM application is installed. These settings will be used internally in HFM API to figure out all HFM information related to that specific HFM instance.

Due to these two new settings and in order to continue to accommodate all connection information within a single place (ODI Topology), “Cluster (Data Server)” was overloaded to receive three settings instead of just one, separating them by colon. So now “Cluster (Data Server)” receives “dataServerName:oracleHomePath:oracleInstancePath” instead of just dataServerName.

data-server

Having those considerations in mind, it is just a matter to create a new Data Server and set the overloaded “Cluster (Data Server)” information and the user/password that ODI will use to access the HFM application. After that, we just need to create a Physical Schema with the name of the HFM application, a new Logical Schema and associate that to a context.

And that is it, you guys are ready to upgrade your HFM environment and still use your old ODI interface to maintain HFM. If you guys have any doubts/suggestions about the KMs please few free to contact us.

If you guys are having errors with our KMs, please check our troubleshooting post here.

I hope you guys enjoy these KMs. See you soon!

ODI 12c new features: Dimension and Cubes! Part 4 (Loading using Surrogate Keys)

Posted in Dimensions, ETL, ODI 12c, ODI Architecture, ODI Mapping, Oracle, Tips and Tricks with tags , , , , , on December 16, 2016 by RZGiampaoli

Hi guys how are you?

Today we’ll continue the dimension and cubes series (Part 1, Part2 and Part 3 here) and we’ll see how to load data using Surrogate keys.

After all the setting done in the last post, now the only thing left is to create the interfaces and map everything. For the Surrogate keys, the interface and the mapping are exactly the same as for no-surrogate version (as we can see in the previous posts) for both, dimensions and facts, what’s very nice.

times-surrogate-interfaceThe interesting here is what he does behind the scenes. In the no-surrogate version ODI created one mapping for each hierarchy and in the end it merged everything together inside a table.

no-surrogate-time-operatorFor the Surrogate key version, ODI also generates one mapping for each hierarchy but the main difference is that after each one he merges it witch the others. This happens because he needs to get the surrogate key for each level.

time-surrogate-operator

For each level ODI automatically generates an insert into that level stage table verifying if all the columns does not exists in the target table (He does that to decrease the amount of data for the merge step since merge would insert or update everything and would take more time than necessary).

After the stage table is loaded the next step is to merge the stage table to the target table, and for that ODI just create a “Merge”: when match he updates the descriptions or attributes and when doesn’t match it inserts the new rows with the sequences for the SK.

In the next level of the hierarchy ODI repeats the process but joining the Year with the Quarter. ODI will keep doing this for each level mapped until the last one, where instead of having a merge with matches and not matches, he just do a merge with Matches (since he know everything will already be there).

The results will be this:

time-surrogate-table-results

It’s nice that ODI already creates the dimension thinking in an aggregated fact since we can see that he has some rows just with the year, other with the year and quarters and the last one with all the information.

One thing to notice is that the PK is the same as the Month SK. This is because ODI is ready to create SCD type 2 (we’ll do another post to show how it works).

For the fact, the mapping will still be the same as the No-surrogate version and again the difference will be in the results.

fact-surrogate-interface

We can see that in the operator ODI does something really neat this time.

fact-surrogate-operator

MERGE INTO EPM_HPT_ODI_RUN.S_FACT FACT_SURROGATE1_FACT_SURROGATE USING
(SELECT TIME_SURROGATE_FACT_SURROGAT_1.MONTH_SK AS ID_TIME ,
PRODUCT_SURROGATE_FACT_SURRO_1.PRODUCT_SK AS ID_PRODUCTS ,
REGIONS_SURROGATE_FACT_SURRO_1.CITY_SK AS ID_REGIONS ,
SRC_ERP.SALES AS METRIC
FROM ((EPM_HPT_ODI_RUN.SRC_ERP SRC_ERP
LEFT OUTER JOIN
(SELECT TIME_SURROGATE_FACT_SURROGATE.ID_MONTH AS ID_MONTH ,
TIME_SURROGATE_FACT_SURROGATE.MONTH_SK AS MONTH_SK ,
TIME_SURROGATE_FACT_SURROGATE.TIME_PK AS TIME_PK
FROM EPM_HPT_ODI_RUN.S_TIME TIME_SURROGATE_FACT_SURROGATE
WHERE ((TIME_SURROGATE_FACT_SURROGATE.TIME_PK = TIME_SURROGATE_FACT_SURROGATE.MONTH_SK)
AND (TIME_SURROGATE_FACT_SURROGATE.MONTH_SK IS NOT NULL) )
) TIME_SURROGATE_FACT_SURROGAT_1
ON (SRC_ERP.ID_MONTH = TIME_SURROGATE_FACT_SURROGAT_1.ID_MONTH) )
LEFT OUTER JOIN
(SELECT PRODUCT_SURROGATE_FACT_SURROGA.ID_PRODUCT AS ID_PRODUCT ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK AS PRODUCT_SK ,
PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK AS PRODUCTS_PK
FROM EPM_HPT_ODI_RUN.S_PRODUCTS PRODUCT_SURROGATE_FACT_SURROGA
WHERE ((PRODUCT_SURROGATE_FACT_SURROGA.PRODUCTS_PK = PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK)
AND (PRODUCT_SURROGATE_FACT_SURROGA.PRODUCT_SK IS NOT NULL) )
) PRODUCT_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_PRODUCT = PRODUCT_SURROGATE_FACT_SURRO_1.ID_PRODUCT) )
LEFT OUTER JOIN
(SELECT REGIONS_SURROGATE_FACT_SURROGA.ID_CITY AS ID_CITY ,
REGIONS_SURROGATE_FACT_SURROGA.CITY_SK AS CITY_SK ,
REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK AS REGIONS_PK
FROM EPM_HPT_ODI_RUN.S_REGIONS REGIONS_SURROGATE_FACT_SURROGA
WHERE ((REGIONS_SURROGATE_FACT_SURROGA.REGIONS_PK = REGIONS_SURROGATE_FACT_SURROGA.CITY_SK)
AND (REGIONS_SURROGATE_FACT_SURROGA.CITY_SK IS NOT NULL) )
) REGIONS_SURROGATE_FACT_SURRO_1
ON (SRC_ERP.ID_CITY = REGIONS_SURROGATE_FACT_SURRO_1.ID_CITY)
) MERGE_SUBQUERY ON ( FACT_SURROGATE1_FACT_SURROGATE.ID_TIME = MERGE_SUBQUERY.ID_TIME AND FACT_SURROGATE1_FACT_SURROGATE.ID_PRODUCTS = MERGE_SUBQUERY.ID_PRODUCTS AND FACT_SURROGATE1_FACT_SURROGATE.ID_REGIONS = MERGE_SUBQUERY.ID_REGIONS )
WHEN NOT MATCHED THEN
INSERT
(
ID_TIME ,
ID_PRODUCTS ,
ID_REGIONS ,
METRIC
)
VALUES
(
MERGE_SUBQUERY.ID_TIME ,
MERGE_SUBQUERY.ID_PRODUCTS ,
MERGE_SUBQUERY.ID_REGIONS ,
MERGE_SUBQUERY.METRIC
)
WHEN MATCHED THEN
UPDATE SET METRIC = MERGE_SUBQUERY.METRIC

He automatically joins all our dimensions at level zero (since we have the dimensions in the higher levels for the aggregated fact) to get the surrogate key information and use it in the fact table. This is very nice because in large DWs we’ll have tons of dimensions, and map/join everything is very time consuming. The final results is this:

fact-surrgoate-sql-results

A perfect DW created using surrogate key, in other words, instead of having the dimensions PKs in the fact table we have the SKs (that ware generated by a sequence in the dimensions).

In resume, we think that if you going to create simple dimensions and simple facts (without surrogate key or SCD type 2) it’s still nice to use this new feature since it’s a nice way to document and standardize your DW, but if we measure by development time it’s not worthy since it’s very time consuming for simple DW.

Now, if you want to create a DW using surrogate keys or SCD type 2 we found this new feature extremely useful for both, documentation and standardizations and because is a lot faster than do manually.

Thanks and see you soon.

ODI 12c new features: Dimension and Cubes! Part 3 (Settings for Surrogate Keys)

Posted in Cubes, Dimensions, ETL, ODI, ODI 12c with tags , , on November 24, 2016 by radk00

Hi all! First of all, sorry for the delay. We really wished to have published the rest of this series earlier, but we are overwhelmed by projects, which keep us very busy. So let’s not waste time and go directly to what matters. I really recommend you to ready part 1 and part 2 (if you didn’t already) because we will assume some things here that were already done, so we don’t keep repeating ourselves.

Today’s post is how to setup ODI dimension objects to work with Surrogate Keys. In the first post we said that there was a bug in ODI 12c that was preventing us to create dimensions with SKs. We opened an SR with Oracle and it turned out that it was not a bug, but it was some missing configurations that were not enabling us to create the objects in the right way. So, apologies to Oracle 🙂 I hope this post may explain those little specific setups, so other people does not fall on the same mistakes that we did when we tried to create these dimensions.

First let’s begin with the DB script for this example. Our source tables will remain the same as the previous example (SRC_* tables). Our stage tables will be different and we will use the STG*S tables for this example. The final dimension/fact tables will be the S* tables found below.

surrogate-script

1

Also, please create the following Native Sequences that will be used to create our SK values:

1-1

1-2

Let’s talk a little about the SK setup requirements. There are some key points that were not clear in Oracle’s documentation and that’s why we were not able to complete it successfully. After talking to Oracle Support, we got the following key requirements to make SK setup to work:

  • Each level of the dimension must have its own Natural Key and Surrogate Key columns. The SK column MUST be different to the PK of the dimension (this is very important. This was the wrong setup that we were trying to do and it was failing). This allows ODI to manage SCD type 2 changes that occur across a hierarchy (while not applicable to a Time dimension it still needs to be setup that way);
  • The dimension MUST have a Primary key defined on it;
  • Each staging table for each level MUST include all the attributes of any level above it in the hierarchy (MONTH must have all attributes of QUARTER and YEAR). The easiest way to accomplish this is to just create the staging tables to have all the attributes of the dimension. (But you may create only the needed ones. The scripts in this post only contain the necessary attributes);

Let’s get as example S_TIME table. It contains the following columns:

2

S_TIME has three levels and for each level we are going to have:

  • One attribute for each member name (YEAR, QUARTER and MONTH);
  • One ID (that will be setup as Natural Keys) for each member level (ID_YEAR, ID_QUARTER and ID_MONTH);
  • One SK for each member level (YEAR_SK, QUARTER_SK and MONTH_SK);
  • And finally the tables PK – TIME_PK;

After you run this ODI component (in our fourth post), you will notice that some information gets replicated on IDs and SKs. It may seem odd for you, but it is actually correct, since those objects are prepared to handle SCD2 type of data, so even if you don’t use it right now, you’ll need to setup them this way on your ODI dimensions (the good thing is that, if you decide later on to use SCD2, then the setup will be already done for you).

Now let’s create the TIME_SURROGATE dimension as below:

3

For level Month, do the following setup:

4

Quarter:

5

Year:

6

On Hierarchies tab, do the following setup:

7

For the other two dimensions, the process is very similar, so I’ll not add screenshots here. For the Cube setting, it is exactly as we did for the cube in the first post:

8

9

And that’s it, we are ready to load those components using Mappings. Our fourth post will show you the differences when using SK models and the benefits that it may bring to you.

See you soon!