Archive for the Gems Category

ODI Hidden Gems – Unique temporary object names

Posted in Gems, ODI, Tips and Tricks with tags , , on March 11, 2021 by Rodrigo Radtke de Souza

Hi all, I was not going to write about this one because I thought that this “hidden” gem was already known to every single ODI 12 developer out there, but I still get questions on why sometimes some specific data loads fails when they run in parallel, and they work fine when they run in serial. Most of the times, those are related to how ODI handles the temporary objects that it creates to do ETL (like C$, I$, E$ tables).

Let us see one example, which is the default in ODI. I have created one very simple mapping that has one source and one target table.

This mapping is loading data from two databases that resides in different data servers, so it will need to create a C$ table to be able to transfer the data. If we look at ODI Operator, we will notice the following:

The C$ table that it created is named as C$_0SECTIONTYPE. By default, ODI will create this name based on the source component that the data was generated from, so in this case it was a table called SECTIONTYPE. The “0” in front of it is an incremental number that would increase if you had another source with the same name in the mapping. For example, if you had SECTIONTYPE mapped twice as source tables, one would be loaded as C$_0SECTIONTYPE and the other C$_1SECTIONTYPE. ODI does that so we do not have a clash between names within the same ODI mapping.

However, what would happen if you tried to run the same mapping or another mapping that also contains SECTIONTYPE as source at the same time? As you may imagine, one mapping would interfere in the other, since both C$ tables would be called C$_0SECTIONTYPE and both mappings would be trying to load/read/drop it at the same time, which would cause a failure (in a good scenario) or wrong data (in a bad scenario).

To avoid this kind of issue to happen, ODI 10/11 developers were very creative in the past and would create some Java variables and some tweaks in some ODI KMs to make the temp table names dynamic. However, ODI 12 introduces something way simpler to handle this kind of situation, however, its not default and it is kind of “hidden”. If you go back to the mapping, click on the Physical tab, and scroll all the way down. You will notice a check box that says “Use Unique Temporary Object Names” that is unchecked. If you check this one and run it again, you will see the difference.

ODI now created a table named C$_0SECTIONTYPEAIHLNQMAVPK7Q1FR66UM225DF3, which is totally unique, and it will never clash with another mapping running in parallel. But then, another question arises: should I go to each mapping to check this option, if I want all to be unique? Well, the answer is no. There is another “hidden” gem that you can use.

Go to ODI Topology and double click your ODI agent, then go to Properties.

Luckily, you may enable the use of unique temporary objects at an ODI agent level, so you do not need to go back to each mapping and changing them.

That is it for today. See ya!

ODI Hidden Gems – Static validations

Posted in Gems, ODI, Tips and Tricks with tags , , on December 20, 2019 by Rodrigo Radtke de Souza

Hi all!

This post is based in one question that I answered in the Data Integration community forum some time back. This feature is powerful, but it is also somehow “hidden”. The question was like this (I have edited it for the purpose of this post):

I have one mapping that will load one target table (TARGET) from two different sources (SOURCE_A and SOURCE_B). Target table TARGET contains ITEM_KEY, ATT1, ATT2, ATT3 and so on. One Source (SOURCE_A) contains ITEM_ID, ATT1, ATT2, ATT3 values and we are using a look up to a dimension table (T_ITEM) to get ITEM_KEY for the ITEM_ID. Second source (SOURCE_B) contains CC_ID, ATT1, ATT2, ATT3 and there is a reference table (T_CC_ITEM_REF) which contains the mapping between CC_ID and ITEM_ID that we use to look up to the dimension table (T_ITEM) to get ITEM_KEY for the ITEM_ID.

Validating the first source is straight forward as we have declared a constraint in CKM to log foreign key error records which got logged as ‘Join Error between TARGET and T_ITEM’ in E$. For the second source, there is a “middle” table in the mapping, so we must join SOURCE_B to T_CC_ITEM_REF (reference table) and to T_ITEM (dimension table). I can still create a constraint and log foreign key errors between TARGET and T_ITEM, but is there a way that I can also be more specific and capture any join error between T_CC_ITEM_REF (reference table) & T_ITEM (dimension table). E.g. if T_CC_ITEM_REF is missing any mapping record or if any mapping is mapped to an inexistent T_ITEM row?

This is very good example of “indirect join” validation, where you want to validate some source data that is used in your mappings, but these tables are not directly associated with your target table. In these cases you cannot directly validate them in a regular mapping, but ODI has a very nice feature called Static validation, which allows you to run any validation in any data model at any time, so you could catch all those errors before trying to load your target table.

If you go to ODI models and expand any of them, you will see that you may add “Constraints” to it. Generally, we add those constraints to the target table, so data gets validated during the load data flow, before hitting the actual target table. I said “generally” it is done in the target because we have situations (like the one that we are talking now) where we want to validate the source/mapping tables even before we try to load the targets.

Before getting directly on how to solve this issue, lets step back and briefly see some options that ODI has to offer regarding data validation. If you go to ODI models and expand any of them, you will see that you may add “Constraints” to it.

1

The first one is a “Key”, which resembles a “Primary Key”. You will add all columns that belongs to your table primary key and ODI will make sure to log all records that are not unique based on this key.

The second one is “Reference”, which resembles a “Foreign Key”. You will add the columns and tables names that belongs to a foreign key relationship and ODI will automatically log all records that have no reference between that relationship. If we go back to the above question, this works fine to validate our TARGET table against the T_ITEM (dimension) because there is a direct link between them. ODI will log all source rows that does not have a valid “join” to the T_ITEM table before loading it to the target (so the target has only “valid” records regarding that FK).

Third one is called “Condition” and it is the most flexible of all ODI constraints. Basically, it is a free form text where you may add any kind of SQL statement which you want to check. You just need to remember that you always want to write down a statement that will check for TRUE values, as for example, you want to test if a column value exists, “is in” another table or if the values are greater or lesser than a specific value. When you execute your mapping, you will see that ODI will “negate” your condition adding a NOT before it, so it will check for all the records that are not TRUE (in other words, FALSE) and logging them at the E$ table. Pretty neat stuff.

Going back to our example, let’s divide our problem in two pieces. First let’s validate if T_CC_ITEM_REF contains any kind of bad data, meaning that we will check if it contains any mapping row that references a non-existing T_ITEM row. This can be achieved by going to T_CC_ITEM_REF and create a New Reference.

2

There you point the “Parent Table” (in this case, T_ITEM):

3

In attributes, add the two columns that are used in the join condition:

4

Save it. Go to the parent Model of this datastore and check its Control tab. You will need to select which CKM will be used to run the Static validation (you cannot leave it blank).

5

Create a new package and “drag and drop” the model to the package. It will look like this:

6

This icon indicates that, once you run the package, ODI will run a Static validation on this datastore using the CKM that we set in its parent Model. Once we run it, this is what we get in Operator:

7

The warning sign on the scenario execution tell us that some error was logged in the E$ table. When we go to check it, we can see following:

8

It means that ITEM_ID 50 does not exist in T_ITEM, so any source record with CC_ID equals to 5 would be dropped due to an invalid mapping row.

Now, what if we want to check if all CC_IDs from the source has a valid mapping record? This is where “Condition” constraint comes in handy, since it is very flexible and allow us to virtually write any kind of SQL logic in it. Let’s go to SOURCE_B and create a Condition to it:

9

Add a name to it and select a type. An Oracle Data Integrator Condition is a condition that exists only in the model and does not exist in the database. A Database Condition is a condition that is defined in the database and has been reverse-engineered. In our case, let’s pick ODI Condition. Write the SQL statement that you want to be true. In this case, I want all source CC_ID columns to be not null and that also exists in the join between my mapping/dimension tables. Another cool thing about conditions is that you may write custom messages for it, so it gets clearer to the users what that validation means.

10

Similarly, to the previous example, drag and drop the model to the package and run it. This is what you will get:

11

12

The error is telling us that there is a source record (CC_ID 3) that does not have a mapping record on T_CC_ITEM_REF. You may add both Static validations on your package, before you load your target and have a complete picture on what data is missing in your data flow.

Hope you have liked the post. See ya!

ODI Hidden Gems – Target Load Order

Posted in Gems, ODI 12c, Tips and Tricks with tags , , on October 14, 2019 by Rodrigo Radtke de Souza

Hi all!

Today’s gem is related to a very nice surprise that all ODI developers had when they were migrating from ODI 11 to ODI 12, which is the ability to load multiple target tables with the same ODI mapping object. You may have a very large mapping and “split” (with Split object) the result set into multiple targets, or you may just copy the same result to different target tables, mapping different fields in each of the targets. This post is not about what you can do when loading multiple targets (which is kind a lot due to this feature in ODI 12c), but how to control the order which those target tables are loaded.

Let’s start with the following example:

1

If you execute this mapping, this is what it will look like in Operator:

2

You can see that it loaded the tables in an order that was decided by ODI, probably in the same order that you dragged and dropped the models into the mapping. Now imagine that you want to have control over the order and need to load Table B before table A. The classical example where you would use that is when you retrieve a source dataset and you want to derive both the dimension and fact information out of it. In this case, you would want to load the dimension table first and the fact table second.

Luckily, ODI allow us to change the order, but its not too intuitive.  In the Logical tab, you need to click in any area that does not contain an object (any white area will do). This will display the “Target Load Order” option:

8

Click on the “gear” icon in the far right and a menu will popup:

4

Now you may configure it to have any order that you want, let’s say table B, C and then A. If you save and run the mapping now, this is what you get:

5

6

That’s it folks! See ya!