Archive for Tips and Tricks

Building dynamic ODI code using Java variables – Table to File example

Posted in ODI on March 23, 2023 by Rodrigo Radtke de Souza

Hi all. Today’s post is an “enhancement” of a post that I did a long time ago about using Oracle’s metadata to build dynamic code. I say it is an enhancement because that technique uses the source/target command tabs, which have a 4000-character limitation. So, if you are trying to build metadata information that crosses this limit, it will fail.

You should be good using the source/target command technique in 90% of your data loads, but there are a few exceptions where we need to go beyond 4000 characters. One of these is when you are trying to build dynamic code to generate text file loads in ODI. The example that I’ll show here dumps the data from any Oracle table to a text file dynamically, using only one ODI procedure.

This is what we will build. I’ll go over each step in detail:

Basically, it is an ODI procedure that will receive a table name as a parameter and write that table’s data into a text file using pipe “|” as the delimiter. This is a very small but useful procedure that you may use when you need to create quick text files from tables/views without going through all the work of creating ODI mappings for it.

You will need two Logical Schemas, one for the source table and another one for the target file. In this example I used LOCAL_SOURCE, which points to an Oracle database, and OUTPUT_FILE, which points to a folder where the text file will be created.

The first step is to get the file metadata. As I said, this will be done in a different way now. Instead of using source/target tab commands, we will use Java BeanShell to retrieve this information. This is because ODI creates very lengthy metadata for each column, and the total amount of metadata often crosses the 4000-character limit. Just as an example, each column in a text file load looks similar to this:

"CRDWG_COLSNP$CRCOL_NAME="+metaDataSrc.getColumnName(i) + "SNP$CRTYPE_NAME=STRINGSNP$CRORDER="+i+"SNP$CRLENGTH=4000SNP$CRPRECISION=4000SNP$CRACTION_ON_ERROR=NULLSNP$"

Here is the full code for the “Get File Metadata” step:

<@
import java.sql.*;

// Table/view to be dumped, resolved at runtime from an ODI variable
String table_name = "#DWCLOUD.L_TABLE_NAME";
// Target file path, built from the OUTPUT_FILE logical schema plus the table name
String folder_path = odiRef.getSchemaName("OUTPUT_FILE","D")+"/"+table_name;
String folder_path_txt = folder_path+".txt";
// Source schema, resolved from the LOCAL_SOURCE logical schema
String table_schema = odiRef.getSchemaName("LOCAL_SOURCE","D");
// File datastore metadata header (pipe-delimited file, one header row)
String table_metadata = "/*$$SNPS_START_KEYSNP$CRDWG_TABLESNP$CRTABLE_NAME="+table_name+"SNP$CRLOAD_FILE="+folder_path_txt+"SNP$CRFILE_FORMAT=DSNP$CRFILE_SEP_FIELD=0x007cSNP$CRFILE_SEP_LINE=0x000D0x000ASNP$CRFILE_FIRST_ROW=1SNP$CRFILE_ENC_FIELD=SNP$CRFILE_DEC_SEP=SNP$CRSNP$";
String table_header = "";
String table_header_parameter = "";

// Connect to the source data server defined in the procedure step
Connection ConSrc = DriverManager.getConnection("<%=odiRef.getInfo("SRC_JAVA_URL")%>","<%=odiRef.getInfo("SRC_USER_NAME")%>",odiRef.getInfo("SRC_PASS"));
Statement stmtSrc = ConSrc.createStatement();
// "where 1=2" returns no rows; we only need the result set metadata
ResultSet querySrc = stmtSrc.executeQuery("select /*+ parallel(64) */ * from "+ table_schema +"." + table_name + " where 1=2");
ResultSetMetaData metaDataSrc = querySrc.getMetaData();

int count = metaDataSrc.getColumnCount();
for (int i=1; i<=count; i++)
{
    // Append one column definition to the file metadata and build the header/bind lists
    table_metadata = table_metadata + "CRDWG_COLSNP$CRCOL_NAME="+metaDataSrc.getColumnName(i) + "SNP$CRTYPE_NAME=STRINGSNP$CRORDER="+i+"SNP$CRLENGTH=4000SNP$CRPRECISION=4000SNP$CRACTION_ON_ERROR=NULLSNP$";
    table_header = table_header+metaDataSrc.getColumnName(i);
    table_header_parameter = table_header_parameter+":"+metaDataSrc.getColumnName(i);

    if (i==count)
    {
        // Last column: close the metadata key
        table_metadata=table_metadata+"CR$$SNPS_END_KEY*/";
    } else
    {
        // More columns to come: add the column separator and the list delimiters
        table_metadata=table_metadata+"CRSNP$";
        table_header = table_header + ",";
        table_header_parameter = table_header_parameter + ",";
    }
}

ConSrc.close();
@>

How does it work? It uses an ODI variable (which you will use in an ODI package later on) to figure out which table to dump into a file. It can actually be any other queryable object, like a view. Views are actually great to use here, since you may build any kind of business logic and filters in the view and dump its data to a file when needed. It also uses both logical schemas (source and target) to figure out where to read the source data from and where to place the target file.

It then creates a connection object using the source connection that you set in the ODI procedure (please pay close attention to each source/target logical schema in the ODI procedure) and it runs a query where 1=2 just to retrieve the table/view metadata.

With this metadata in the result set, it loops through it and “builds” the dynamic code for the header and the columns. It applies some special handling when it reaches the last column and then closes the connection.

It may sound complicated, but it’s fairly simple. In less than 40 lines of code we are able to retrieve all the metadata needed to dump any table’s data to a flat file in a dynamic way. The other steps are now very straightforward.

“Create File” step will be as simple as this:

create table "<@=folder_path@>"
<@=table_metadata@>

Same for “Truncate File”:

truncate table "<@=folder_path@>"
<@=table_metadata@>

“Create Header”:

create header (<@=table_header@>)
<@=table_metadata@>

“Insert Rows” is the only step that needs both commands. In the Source Command we have:

select     
    /*+ parallel(64) */ <@=table_header@>
from    <@=table_schema@>.<@=table_name@>
where    
    (1=1)    

This one is used to read the source table. In Target Command we have the insert statement to the file object:

insert into "<@=folder_path@>"
(
	<@=table_header@>
)
values 
(
	<@=table_header_parameter@>
)
<@=table_metadata@>
  

Add this procedure to an ODI package, alongside the ODI variable for the table name:

Set the variable to the table/view object whose records you want to dump. In this example, I used a table named SAMPLE_DATA. When I executed the package, I got the following:

If I go to the folder path defined in the OUTPUT_FILE logical schema, I can see the data file created there (these are all fake/random rows, not real people’s information):

I hope you liked this post. See you soon!

ODI Hidden Gems – Unique temporary object names

Posted in Gems, ODI, Tips and Tricks on March 11, 2021 by Rodrigo Radtke de Souza

Hi all, I was not going to write about this one because I thought this “hidden” gem was already known to every single ODI 12 developer out there, but I still get questions about why some specific data loads fail when they run in parallel but work fine when they run in serial. Most of the time, those failures are related to how ODI handles the temporary objects that it creates to do ETL (like C$, I$ and E$ tables).

Let us see one example using ODI’s default behavior. I have created one very simple mapping that has one source and one target table.

This mapping is loading data between two databases that reside in different data servers, so it will need to create a C$ table to be able to transfer the data. If we look at ODI Operator, we will notice the following:

The C$ table that it created is named C$_0SECTIONTYPE. By default, ODI creates this name based on the source component that the data was generated from, in this case a table called SECTIONTYPE. The “0” in front of it is an incremental number that increases if you have another source with the same name in the mapping. For example, if you had SECTIONTYPE mapped twice as a source table, one would be loaded as C$_0SECTIONTYPE and the other as C$_1SECTIONTYPE. ODI does that so we do not have a clash between names within the same ODI mapping.

However, what would happen if you tried to run the same mapping, or another mapping that also contains SECTIONTYPE as a source, at the same time? As you may imagine, one mapping would interfere with the other, since both C$ tables would be called C$_0SECTIONTYPE and both mappings would be trying to load/read/drop it at the same time, which would cause a failure (in a good scenario) or wrong data (in a bad scenario).

To avoid this kind of issue, ODI 10/11 developers got very creative in the past and would add Java variables and tweaks to some ODI KMs to make the temp table names dynamic. ODI 12, however, introduced something way simpler to handle this kind of situation; it is just not the default and it is kind of “hidden”. Go back to the mapping, click on the Physical tab, and scroll all the way down. You will notice an unchecked check box that says “Use Unique Temporary Object Names”. If you check it and run the mapping again, you will see the difference.

ODI now created a table named C$_0SECTIONTYPEAIHLNQMAVPK7Q1FR66UM225DF3, which is totally unique, and it will never clash with another mapping running in parallel. But then, another question arises: should I go to each mapping to check this option, if I want all to be unique? Well, the answer is no. There is another “hidden” gem that you can use.

Go to ODI Topology and double click your ODI agent, then go to Properties.

Luckily, you may enable the use of unique temporary object names at the ODI agent level, so you do not need to go back to each mapping and change them.

That is it for today. See ya!

Generate JSON objects with ODI in a very easy way

Posted in ODI, Tips and Tricks on November 9, 2020 by Rodrigo Radtke de Souza

Hi all. I came across a requirement to create JSON objects from a set of Oracle tables. We have ODI, so it was natural that the solution would be built there. However, working with JSON objects in ODI is not that easy. Those who have already worked with the “Complex File” technology know what I am talking about: too much setup, and any misconfiguration will cause an error that is generally very hard to troubleshoot.

In the past, I did several mappings that would read from Complex Files (including Json), but this time it was an outbound process, so I would need to create the Json objects, not read from them. I tried to search on the internet, but nothing was clear. I was not sure if the Complex File technology would work to create outbound files and I was not in the mood to play with XSD files this time, so I needed to find some other solution.

Talking to a friend/co-worker of mine, he asked if we could somehow leverage Oracle’s JSON_OBJECT function in ODI. At first, I did not know what that function was about, so I researched it:

The SQL/JSON function JSON_OBJECT takes as its input one or more property key-value pairs. It returns a JSON object that contains an object member for each of those key-value pairs.

It looked very interesting, so I gave it a try in SQL Developer. It worked very well:
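
A minimal sketch of the kind of query I mean (the table and column names here are made up just for illustration):

select json_object(
         'employee_id' value e.employee_id,
         'first_name'  value e.first_name,
         'hire_date'   value to_char(e.hire_date, 'YYYY-MM-DD')
       ) as json_row
from   employees e;

Each row of the result set comes back as one complete JSON object string.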

Basically, you may add any number of columns to the function and it will create a valid JSON object out of it. Pretty neat, right? Now it is just a matter of how to add it to ODI, which thankfully is very easy. First, I created a model in the File technology:

It contains only one column, with 4000 characters. Then I created a mapping and mapped all the required columns to the JSON_OBJECT function, like this:

That’s it. Pretty simple. When we run the mapping, we have an outbound file like this:

Now you may send the outbound file to any application that needs to consume JSON objects. One thing to consider is that you will probably get into trouble if your JSON objects end up being larger than 4000 characters each. I didn’t test it, but either Oracle or ODI will probably complain about it.

I hope you have liked it. This is not a “traditional” way to create Json objects, but for sure it is the easiest one! See you next time!

Oracle SQL for EPM Tips and Tricks S01EP14

Posted in ODI, Oracle, Oracle Database, Performance, SQL, Tips and Tricks on October 15, 2020 by RZGiampaoli

Hey guys how are you? Continuing our SQL series (S01EP13), today I’ll share a very handy little query that I use very often to check for data duplication. In fact, this would be an upgraded version of ODI’s PK check.

An upgraded version because in ODI, if you enable the PK check and it finds duplicates, it eliminates both rows. With the code I’ll show you, you can choose whether to keep the newest or the oldest duplicate, and only the other one will be eliminated.

I have a test table with these values:

If I want to check for duplicate PK, I can just run this query here:
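
The query follows this general pattern (a sketch against a hypothetical test table T_TEST whose PK column is ID):

-- Returns every duplicated row except the one with the smallest ROWID for each ID.
-- The same WHERE clause can be used in a DELETE to actually purge the duplicates.
select a.*
from   t_test a
where  a.rowid > (select min(b.rowid)
                  from   t_test b
                  where  b.id = a.id);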

The idea here is that we have two queries. The outer one checks whether each row’s ROWID is bigger or smaller (your choice) than the MIN or MAX ROWID (depending on that choice) returned by the subquery, joined on whatever columns you want to check.

In this case, we wanted to check only if the PK column had duplicated values, but we could check any other column by just replacing it in the join. In fact, we could have any number of columns in the join, and that would check for duplicates across all the columns you put there.

Then you can keep the first row inserted by using > and MIN, or keep the last one by using < and MAX, and you can also choose which column you want to check in the WHERE clause.

One important thing to mention is that this query is meant to work as a delete, because it keeps whatever the select does not return. What I mean is, if you have more than one duplicate, it will bring back, in this case, all the rows whose ROWID is greater than the one selected in the subquery:

So if I have multiple duplicates, the query returns everything that needs to be deleted, and the only row remaining will be the first one inserted (3, Chuck, Giampaoli).

I hope you enjoy this little trick and see you soon.

Oracle Always Free cloud offering (Part 2)

Posted in ACE, Autonomous Database, Cloud, Data Warehouse, Oracle, Oracle Database on May 18, 2020 by RZGiampaoli

Hey guys how are you? Today I’ll continue talking about the Oracle Always Free cloud offering and try to summarize what you can do after your account is set up. If you want to know how to set up your account, you can find it HERE.

After you receive an email saying everything is set, you can log in to your account and you’ll see a screen like this:

This is the main dashboard. Here’s where you’ll create your databases and VMs, convert your account to paid, manage your account, ask for help, etc. Let’s start with the main dashboard:

  • (1) Quick Actions: Here you’ll find the most important links as quick actions.
    • (2) Compute: This is where you can create a VM to be used with your databases. You can use it to install tools and develop whatever you want inside your environment.
    • (3) Networking: Here’s where you set up your cloud network. This is the first step you must take to ensure your VM and databases are in the same network and can reach each other.
    • (4) Autonomous Transaction Processing: This is where you create a transaction-processing database.
    • (5) Autonomous Data Warehouse: This is where you can create your Data Warehouse database.
    • (6) Search: A quick way to view all your resources.
  • (7) Account Center: Here’s a quick place to manage your account, see how many credits you have, and check billing information.
  • (8) Main Menu: This is the main menu where you have access to everything that you can do inside your cloud.
  • (9) Top Bar: Where you can change regions (in case you have more than one), access the Cloud Shell (for OS commands), see the help, ask for help in the chat, change the language and see your profile.
  • (10) Start Exploring: Here’s a place where you can find articles to help you start setting up your environment.
  • (11) What’s New: And finally, here’s where you can see news about Oracle Cloud, like releases and things that will be added.

One important thing to add here: before you create anything, look for the “Always Free Eligible” logo or description to be sure you’re not buying anything by mistake. Now, about the main menu:

  • Core Infrastructure: Here’s where you can set up your VMs, networks and storage options.
  • Database: Here’s where you can set your database options, backups and servers (VM or bare metal).
  • Data and AI: Here’s where you can set up your Big Data and AI environment.
  • Solution and Platform: Here’s where you can set up your Analytics cloud services, integrations, monitoring and marketplace.
  • More Oracle Cloud Services: Here’s where you find the other cloud services.
  • Governance and Administration: And here is where you can administer your environment: security provisioning, account management, identity and governance.

As you can see, there’s a lot that can be done, but we’ll concentrate on the “Always Free” content. The following list summarizes the Oracle Cloud Always Free-eligible resources that you can provision in your tenancy:

  • Compute (up to two instances)
  • Autonomous Database (up to two database instances)
  • Load Balancing (one load balancer)
  • Block Volume (up to 100 GB total storage)
  • Object Storage (up to 20 GiB)
  • Vault (up to 20 keys and up to 150 secrets)

In the next post we’ll set up our environment. See you soon guys.

Fragmented and Aggregated tables in OBIEE using ODI Part 2/5: Managing the partitions of Fragmented/Aggregated tables

Posted in OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Performance, SQL, Tips and Tricks on February 10, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE, today we are talking about how to manage the partitions of the Fragmented/Aggregated tables using ODI.

Just to make it easier for you to navigate in this series, here are its parts:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll set up the OBIEE repository to make use of all the tables.

In my previous post we designed 18 tables and their partitions and sub-partitions. Now what we need to do is make ODI manage these partitions for us. As I said before, the sub-partitions will be static, and if we need to create one more sub-partition, we’ll do that manually. That’s the best choice here because managing sub-partitions adds a layer of complexity that is not needed.

With that said, if you do need to manage the sub-partitions, you can use exactly the same approach I’ll show here, just adapted to the needs of sub-partition creation and sub-partition template updates.

Let’s start analyzing what needs to be done. For each table we need to issue this command to the database:

For List partitions:

               ALTER TABLE table_name ADD PARTITION partition_name VALUES (value[, value]…);

For Range Partitions:

               ALTER TABLE table_name ADD PARTITION partition_name VALUES LESS THAN (value[, value]…);

It sounds hard, but in fact it is very simple, and it will work with any SQL you have. To do this we will use a procedure with our ALTER TABLE statement in the Command on Target tab and a SELECT in the Command on Source tab to pass the information we need to create the partitions, like the table names, the partition names and the partition values.

For each row that the SELECT on source returns, the statement in the Command on target will be executed, and we can pass the values from the SELECT in the Command on Source to change the behavior of the statement in the Command on target.

This is something very important that we need to understand about ODI: it is not an ETL tool but a code generator that orchestrates the database and a lot of other tools, if you will.

What I mean by that is, until ODI sends the code to the database for execution, everything ODI does is generate code. If we use the ODI substitution API, it gets resolved before the code is sent to the database, just as variables get replaced with their values; this means a variable like #SQL on the Command on Target tab could contain an entire dynamically generated INSERT that ODI would substitute in before sending it to the database.

And this is why ODI is so powerful: we can manipulate strings and make everything dynamic.

Then our query in the source should return something like this:

The query will change depending on the design of your application, so I don’t think sharing my exact query would help. One tip I can give you: always compare the partition name against the PARTITION_NAME column of the ALL_TAB_PARTITIONS view to see if the partition you want to create already exists, to prevent an error in ODI. In fact you have two choices: either do this or flag the step as “Ignore Errors” so it ends with a warning. I prefer to compare and end the step without warnings.
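
Just to illustrate the idea (this is not my actual query; the control table and the naming convention are hypothetical), the Command on Source could look something like this:

-- One row per table/period combination that still has no partition.
-- The column aliases must match the #TABLE_NM, #PARTITION_NM and #MAX_PERIOD used on the target.
select c.table_name         as table_nm,
       'P_' || c.period_id  as partition_nm,
       c.max_period         as max_period
from   my_partition_control c   -- hypothetical control table with the desired periods
where  not exists (select 1
                   from   all_tab_partitions p
                   where  p.table_name     = c.table_name
                   and    p.partition_name = 'P_' || c.period_id);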

Another thing to notice is that, for FY20 Feb, my PARTITION_VALUE (MAX_PERIOD) is one day after the last day of the month. It’s hard to see here because the calendar used by the client is all different, but if you have a normal calendar and your PERIOD_ID mask is YYYYMMDD, for 2020/Feb your partition value needs to be 20200230 (Feb ends on 02/29/2020).

That’s because this is a RANGE partition, and for RANGE partitions we define the partition value as VALUES LESS THAN, which means that everything under 20200230 will go to FEB unless we have a smaller partition (JAN) below it. For example, if your smallest partition is 2018/Jan, everything below that period will be inserted into the 2018/Jan partition, so you need to be careful about when you start creating partitions.

Another important thing to mention is that you cannot add a partition smaller than an existing one, only bigger. That means that if you start with 2018/Jan, it’s impossible to create a 2017 partition; the only way is to drop the table and recreate it starting at 2017. RANGE partitions need to be sequential and always growing.

With these values returned by the SELECT on the Command on Source tab, the only thing left is to use them in the statement on the Command on Target tab, like this:

ALTER TABLE #TABLE_NM ADD PARTITION #PARTITION_NM VALUES LESS THAN (#MAX_PERIOD);

Pay attention to the partition values. In this case the values are numeric, which is why I’m not using quotes. If the values were strings, we would need to enclose the variable in quotes.

And that’s it; this is all we need to do to manage partitions using ODI. Because the SELECT on the source returns all tables with their partition names and values, and we pass that information to the target statement, the target will be executed once for each source row, creating the partitions needed for each table.

We are done with the fragmented tables. Now we need to manage the partitions of the aggregated tables. The approach is exactly the same; the only things that change are the content of the query in the Command on Source tab and the fact that the partition will now be a LIST partition instead of a RANGE partition.

We could have done just one query returning all values, including whether it is a RANGE or a LIST partition, but for the sake of simplicity it is always better to split the scenarios and not overcomplicate something simple.

For the month level we need the query in the Source to return:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM ADD PARTITION #PARTITION_NM VALUES (#MAX_PERIOD);

That’s all that needs to be done. For the quarter level the results are similar:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM ADD PARTITION #PARTITION_NM VALUES (#MAX_PERIOD);

As you can see, it’s very simple to manipulate and manage the database with ODI. We just need to create a query with the information we want in the source and substitute that information into the target statement; that’s all we need to do.

Another important thing that we need in our procedure, in case you want to go with the truncate/insert approach, is to truncate the partitions before loading. This is also managed by ODI and works in exactly the same way as the partition creation, only a little simpler, since we don’t need the partition values:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION #PARTITION_NM;

For month level:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION #PARTITION_NM;

And for the Quarter Level:

And the statement in the Command on Target should be like this:

ALTER TABLE #TABLE_NM TRUNCATE PARTITION #PARTITION_NM;

Again, this could be done with just one query, but simplicity always works better than complexity.

This is all we need to start loading data. One last thing we could do, if it applies to your case, is to create a purge process to drop old partitions. If your business requires only 5 years of data, it’s a good idea to drop older partitions. The approach is exactly the same.

In fact, the results of the Command on Source are exactly the same as the ones we need for the TRUNCATE PARTITION steps; the only difference is that you need logic in place to find the old partitions (see the sketch after the statement below). Other than that, the only thing that changes is the statement on the target, which should be:

ALTER TABLE #TABLE_NM DROP PARTITION #PARTITION_NM;
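
As for the logic to find the old partitions in the Command on Source, a sketch could be something like this (assuming a hypothetical P_YYYYMMDD partition naming convention and a five-year retention):

select p.table_name     as table_nm,
       p.partition_name as partition_nm
from   all_tab_partitions p
where  p.table_owner = 'DW'        -- hypothetical schema
and    p.table_name like 'DW_%'    -- hypothetical table prefix
and    p.partition_name < 'P_' || to_char(add_months(sysdate, -60), 'YYYYMMDD');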

And that’s it, we just created a procedure to manage the partitions of all our 18 tables, including creating, truncating and dropping old partitions.

In the next part of this series we’ll talk about populating the fragmented tables.

I hope you find this helpful and see you soon.

ODI Hidden Gems – Static validations

Posted in Gems, ODI, Tips and Tricks on December 20, 2019 by Rodrigo Radtke de Souza

Hi all!

This post is based on a question that I answered in the Data Integration community forum some time back. This feature is powerful, but it is also somewhat “hidden”. The question was like this (I have edited it for the purpose of this post):

I have one mapping that will load one target table (TARGET) from two different sources (SOURCE_A and SOURCE_B). Target table TARGET contains ITEM_KEY, ATT1, ATT2, ATT3 and so on. One Source (SOURCE_A) contains ITEM_ID, ATT1, ATT2, ATT3 values and we are using a look up to a dimension table (T_ITEM) to get ITEM_KEY for the ITEM_ID. Second source (SOURCE_B) contains CC_ID, ATT1, ATT2, ATT3 and there is a reference table (T_CC_ITEM_REF) which contains the mapping between CC_ID and ITEM_ID that we use to look up to the dimension table (T_ITEM) to get ITEM_KEY for the ITEM_ID.

Validating the first source is straightforward, as we have declared a constraint checked by the CKM to log foreign key error records, which get logged as ‘Join Error between TARGET and T_ITEM’ in E$. For the second source, there is a “middle” table in the mapping, so we must join SOURCE_B to T_CC_ITEM_REF (reference table) and to T_ITEM (dimension table). I can still create a constraint and log foreign key errors between TARGET and T_ITEM, but is there a way I can also be more specific and capture any join error between T_CC_ITEM_REF (reference table) and T_ITEM (dimension table)? E.g., if T_CC_ITEM_REF is missing a mapping record, or if a mapping points to a nonexistent T_ITEM row?

This is a very good example of “indirect join” validation, where you want to validate some source data that is used in your mappings but whose tables are not directly associated with your target table. In these cases you cannot directly validate them in a regular mapping, but ODI has a very nice feature called Static validation, which allows you to run any validation on any data model at any time, so you can catch all those errors before trying to load your target table.

If you go to the ODI models and expand any of them, you will see that you may add “Constraints” to them. Generally, we add those constraints to the target table, so data gets validated during the load data flow, before hitting the actual target table. I said “generally” because there are situations (like the one we are talking about now) where we want to validate the source/mapping tables even before we try to load the targets.

Before getting directly into how to solve this issue, let’s step back and briefly look at the options that ODI offers regarding data validation. These are the constraint types that you may add to a datastore:

The first one is a “Key”, which resembles a “Primary Key”. You add all columns that belong to your table’s primary key, and ODI will make sure to log all records that are not unique based on that key.

The second one is “Reference”, which resembles a “Foreign Key”. You add the column and table names that belong to a foreign key relationship, and ODI will automatically log all records that violate it. If we go back to the question above, this works fine to validate our TARGET table against T_ITEM (the dimension), because there is a direct link between them. ODI will log all source rows that do not have a valid “join” to the T_ITEM table before loading them to the target (so the target ends up with only “valid” records regarding that FK).

The third one is called “Condition” and it is the most flexible of all ODI constraints. Basically, it is free-form text where you may add any kind of SQL predicate that you want to check. You just need to remember to always write a statement that checks for TRUE values; for example, test whether a column value exists, is in another table, or is greater or less than a specific value. When you execute your mapping, you will see that ODI “negates” your condition by adding a NOT before it, so it checks for all records that are not TRUE (in other words, FALSE) and logs them in the E$ table. Pretty neat stuff.
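
Conceptually, this is what ends up happening (just a simplified sketch with an invented table and condition, not the actual code a CKM generates; the real E$ structure is richer):

-- The condition "AMOUNT >= 0" is negated and the failing rows are written to the error table.
insert into e$_my_table (cons_name, err_mess, id, amount)
select 'COND_AMOUNT_POSITIVE', 'Amount must be zero or positive', s.id, s.amount
from   my_table s
where  not (s.amount >= 0);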

Going back to our example, let’s divide our problem into two pieces. First, let’s validate whether T_CC_ITEM_REF contains any kind of bad data, meaning we will check if it contains any mapping row that references a non-existing T_ITEM row. This can be achieved by going to T_CC_ITEM_REF and creating a New Reference.

There you point the “Parent Table” (in this case, T_ITEM):

In attributes, add the two columns that are used in the join condition:

Save it. Go to the parent Model of this datastore and check its Control tab. You will need to select which CKM will be used to run the Static validation (you cannot leave it blank).

Create a new package and “drag and drop” the model to the package. It will look like this:

This icon indicates that, once you run the package, ODI will run a Static validation on this datastore using the CKM that we set in its parent Model. Once we run it, this is what we get in Operator:

The warning sign on the scenario execution tells us that some error was logged in the E$ table. When we go check it, we can see the following:

It means that ITEM_ID 50 does not exist in T_ITEM, so any source record with CC_ID equal to 5 would be dropped due to an invalid mapping row.

Now, what if we want to check whether all CC_IDs from the source have a valid mapping record? This is where the “Condition” constraint comes in handy, since it is very flexible and allows us to write virtually any kind of SQL logic. Let’s go to SOURCE_B and create a Condition on it:

Add a name to it and select a type. An Oracle Data Integrator Condition is a condition that exists only in the model and does not exist in the database. A Database Condition is a condition that is defined in the database and has been reverse-engineered. In our case, let’s pick ODI Condition. Write the SQL statement that you want to be true; in this case, I want all source CC_ID columns to be not null and to exist in the join between my mapping/dimension tables. Another cool thing about conditions is that you may write custom messages for them, so it gets clearer to the users what the validation means.
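
The condition text itself is just a SQL predicate over SOURCE_B’s columns; mine was along these lines (a sketch, adjust it to your own column names):

CC_ID is not null
and CC_ID in (select REF.CC_ID
              from   T_CC_ITEM_REF REF
                     join T_ITEM ITM
                       on ITM.ITEM_ID = REF.ITEM_ID)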

Similarly to the previous example, drag and drop the model to the package and run it. This is what you will get:

The error is telling us that there is a source record (CC_ID 3) that does not have a mapping record in T_CC_ITEM_REF. You may add both Static validations to your package before you load your target and have a complete picture of what data is missing in your data flow.

Hope you have liked the post. See ya!

Oracle SQL for EPM Tips and Tricks S01EP08

Posted in Connect By, DEVEPM, Dimensions, EPM, Oracle, Oracle 11.2.0, Oracle 11.2.0.4, Oracle Database, Performance, REGEXP, Tips and Tricks on November 26, 2019 by RZGiampaoli

Hey guys how are you? Today I’ll expand a little bit more on the REGEXP theme, just to show how easy and powerful these functions are, not to mention how dynamic.

In the last post S01EP04 I explained how to split a string into different columns. The thing is, what if you don’t want to split it into different columns? What if you just want to have the string as a list of values in the same column?

We can also do that with a very nice trick using REGEXP_COUNT. Let’s take a look at it.

In the previous post we had a query like this:

As I said there, using REGEXP it is very simple to “walk” through the string and get any part of it you want. It is also simple to transform a string into a list of values. To do that, we just need to add one extra CONNECT BY and the REGEXP_COUNT function, like this:
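
Here is a sketch of that query with a made-up sample string (the PATH column comes from the previous post’s example; the value is just for illustration):

with t as (select 'Account|Entity|Product|Period' as path from dual)
select level                                  as item_order,
       regexp_substr(path, '[^|]+', 1, level) as item
from   t
connect by level <= regexp_count(path, '[^|]+');
-- Note: with more than one source row you would also need PRIOR clauses
-- (e.g. "and prior path = path and prior sys_guid() is not null") to avoid multiplying rows across each other.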

What that CONNECT BY LEVEL <= REGEXP_COUNT(PATH, ‘[^|]+’) is doing is basically generating one extra row for each value it finds between the “|” delimiters; in other words, it is dynamically “walking” through the string looking for “|” and passing the LEVEL at which each value was found to the REGEXP_SUBSTR above. The result is:

As you can see, REGEXP_COUNT returns the number of values delimited by “|” in the string, CONNECT BY LEVEL multiplies the row by that number, and REGEXP_SUBSTR “walks” the string using the LEVEL of each row to extract the list of members from it.

This is very useful for manipulating strings when you don’t know beforehand how many members you will need to extract, like UDA members, for example, which come all together in a single string when you extract them from ASO cubes.

I hope this is a useful trick for you guys and see you soon.

ODI Hidden Gems – Target Load Order

Posted in Gems, ODI 12c, Tips and Tricks on October 14, 2019 by Rodrigo Radtke de Souza

Hi all!

Today’s gem is related to a very nice surprise that all ODI developers had when migrating from ODI 11 to ODI 12: the ability to load multiple target tables with the same ODI mapping object. You may have a very large mapping and “split” (with the Split object) the result set into multiple targets, or you may just copy the same result to different target tables, mapping different fields in each of the targets. This post is not about what you can do when loading multiple targets (which is quite a lot thanks to this ODI 12c feature), but about how to control the order in which those target tables are loaded.

Let’s start with the following example:

If you execute this mapping, this is what it will look like in Operator:

You can see that it loaded the tables in an order that was decided by ODI, probably in the same order that you dragged and dropped the models into the mapping. Now imagine that you want to have control over the order and need to load Table B before table A. The classical example where you would use that is when you retrieve a source dataset and you want to derive both the dimension and fact information out of it. In this case, you would want to load the dimension table first and the fact table second.

Luckily, ODI allows us to change the order, but it’s not too intuitive. In the Logical tab, you need to click on any area that does not contain an object (any white area will do). This will display the “Target Load Order” option:

Click on the “gear” icon on the far right and a menu will pop up:

Now you may configure it to have any order that you want, let’s say table B, then C, and then A. If you save and run the mapping now, this is what you get:

That’s it folks! See ya!

ODI Hidden Gems – Begin/End Mapping Command

Posted in ODI 12c, Tips and Tricks on October 7, 2019 by Rodrigo Radtke de Souza

Hi all,

Today’s short post is about a simple but very powerful feature that is often overlooked: the Begin/End Mapping Commands. These options live in the Physical tab and, as their names suggest, they may issue any kind of command before a mapping begins and/or after it finishes.

Pay close attention to the detail that they may execute ANY command from ANY technology that ODI can handle, and that’s why this is so powerful. You may run anything from Oracle DML statements to a piece of Java code to OS commands and so on. This gives you a lot of flexibility.

A very common use for these commands is to “track” a mapping in a separate log table. Although ODI Operator contains all the log information, sometimes we get a requirement to track all executions of a particular mapping, so people know for sure when it ran and the logs will not be accidentally purged from the Operator by someone. Let’s see how we may accomplish logging the start and end times of an execution.

Let’s start with “Begin”. First you select which technology and logical schema the command refers to. In this case, we will insert the name of the mapping, the time that it started, and the session number that it was assigned in ODI.
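
Something along these lines would do it (the LOG_INFO table and its columns are hypothetical; SESS_NAME and SESS_NO come from ODI’s substitution API):

insert into LOG_INFO (mapping_name, status, event_time, session_no)
values ('<%=odiRef.getSession("SESS_NAME")%>', 'BEGIN', sysdate, <%=odiRef.getSession("SESS_NO")%>)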

Let’s do the same with “End”:

Let’s run the mapping. When we go to Operator, we may see that two new tasks were created, one before and another one after the main mapping:

We may double click it to see the code that was executed:

If we query the LOG_INFO table, we will see two entries, one for begin and another one for end:

This was a very short example, as you may do way more than that. You may send emails to alert that a critical mapping has completed, zip and move a file right after the mapping loads it, run an OS batch file that prepares your environment before a data load, and so on. These two options are a great alternative for keeping all these “small” pieces of code inside the ODI mapping object itself and relying less on small ODI procedures.

See ya!