Tips and Tricks

Archive for Tips and Tricks

Building dynamic ODI code using Java variables – Table to File example

Posted in ODI with tags ODI, Tips and Tricks on March 23, 2023 by Rodrigo Radtke de Souza

Written on March 23, 2023 by Rodrigo Radtke de Souza

Hi all. Today’s post is an “enhancement” of one post that I did a long time ago regarding using Oracle’s metadata to build dynamic code. I say it is an enhancement because that technique would use the source/target command tabs and it has a 4000 character limitation. So, if you are trying to build some metadata information that crosses this limit, it will fail.

You should be good using source/target command technique in 90% of your data loads, but there are a few exceptions that we need to go beyond 4000 characters. One of these examples is when you are trying to build dynamic code to generate text file loads in ODI. The example that I’ll show here is about dumping the data from any Oracle table to a text file dynamically, using only one ODI procedure.

This is what we will build. I’ll go over each step in detail:

Basically, it is an ODI procedure that will receive a table name as a parameter and it will write that table data into a text file using pipe “|” as delimiters. This is a very small but useful procedure that you may use when you need to create fast text files from tables/views without going through all the work to create ODI mappings for it.You will need two Logical Schemas, one for the source table and another one for the target file. In this example I used LOCAL_SOURCE that points to an Oracle database and OUTPUT_FILE that points to a folder where the text file will be created.First step is to get the file metadata. As I said, this will be done in a different way now. Instead of using source/target tab commands, we will use Java BeanShell to retrieve this information. This is because ODI creates a very lengthy metadata for each column and the total amount of metadata often crosses the 4000 characters limit. Just as an example, each column in a file text load looks similar to this:

"CRDWG_COLSNP$CRCOL_NAME="+metaDataSrc.getColumnName(i) + "SNP$CRTYPE_NAME=STRINGSNP$CRORDER="+i+"SNP$CRLENGTH=4000SNP$CRPRECISION=4000SNP$CRACTION_ON_ERROR=NULLSNP$"

Here is the full code for the “Get File Metadata” step:

<@
import java.sql.*;

String table_name = "#DWCLOUD.L_TABLE_NAME";
String folder_path = odiRef.getSchemaName("OUTPUT_FILE","D")+"/"+table_name;
String folder_path_txt = folder_path+".txt";
String table_schema = odiRef.getSchemaName("LOCAL_SOURCE","D");
String table_metadata = "/*$$SNPS_START_KEYSNP$CRDWG_TABLESNP$CRTABLE_NAME="+table_name+"SNP$CRLOAD_FILE="+folder_path_txt+"SNP$CRFILE_FORMAT=DSNP$CRFILE_SEP_FIELD=0x007cSNP$CRFILE_SEP_LINE=0x000D0x000ASNP$CRFILE_FIRST_ROW=1SNP$CRFILE_ENC_FIELD=SNP$CRFILE_DEC_SEP=SNP$CRSNP$";
String table_header = "";
String table_header_parameter = "";

Connection ConSrc = DriverManager.getConnection("<%=odiRef.getInfo("SRC_JAVA_URL")%>","<%=odiRef.getInfo("SRC_USER_NAME")%>",odiRef.getInfo("SRC_PASS"));
Statement stmtSrc = ConSrc.createStatement();
ResultSet querySrc =  stmtSrc.executeQuery("select /*+ parallel(64) */ * from "+ table_schema +"." + table_name + " where 1=2");
ResultSetMetaData metaDataSrc = querySrc.getMetaData();

int count = metaDataSrc.getColumnCount();
for (int i=1; i<=count; i++)
  {
	table_metadata = table_metadata + "CRDWG_COLSNP$CRCOL_NAME="+metaDataSrc.getColumnName(i) + "SNP$CRTYPE_NAME=STRINGSNP$CRORDER="+i+"SNP$CRLENGTH=4000SNP$CRPRECISION=4000SNP$CRACTION_ON_ERROR=NULLSNP$";
	table_header = table_header+metaDataSrc.getColumnName(i);
	table_header_parameter = table_header_parameter+":"+metaDataSrc.getColumnName(i);  
    	 
	if (i==count)
	{
  	table_metadata=table_metadata+"CR$$SNPS_END_KEY*/";
	} else
	{   
  	table_metadata=table_metadata+"CRSNP$";
  	table_header = table_header + ",";
  	table_header_parameter = table_header_parameter + ",";    
	}
  }

ConSrc.close();
@>

How it works? It uses a variable (that you will use in an ODI package later on), to figure out which table it will dump into a file. It actually can be other objects that you can query, like a view. Views are actually great to be used here since you may create any kind of business logic and filters on the view and dump the data to a file when needed. It also uses both logical schemas (source and target) to figure out where to read the source data and place the target file.

It then creates a connection object using the source connection that you set in the ODI procedure (please pay close attention to each source/target logical schema in the ODI procedure) and it runs a query where 1=2 just to retrieve the table/view metadata.

Having this metadata in the result set, it loops through it and starts to “build” the dynamic code for the header and the columns. It does some special conditions when it reaches the last column and then it closes the connection.

It may sound complicated, but it’s fairly simple. In less than 40 lines of code we are able to retrieve all metadata to dump any table’s data to a flat file in a dynamic way. The other steps are now very straight forward.

“Create File” step will be as simple as this:

create table "<@=folder_path@>"
<@=table_metadata@>

Same for “Truncate File”:

truncate table "<@=folder_path@>"
<@=table_metadata@>

“Create Header”:

create header (<@=table_header@>)
<@=table_metadata@>

“Insert Rows” is the only one that needs two steps. In Source Command we have:

select     
    /*+ parallel(64) */ <@=table_header@>
from    <@=table_schema@>.<@=table_name@>
where    
    (1=1)

This one is used to read the source table. In Target Command we have the insert statement to the file object:

insert into "<@=folder_path@>"
(
	<@=table_header@>
)
values 
(
	<@=table_header_parameter@>
)
<@=table_metadata@>

Add this procedure in an ODI package, alongside with the ODI variable for the table name:

Add the table/view object that you want to dump the records. In this example, I used a table named SAMPLE_DATA. When I executed the package, I have the following:

If I go to the folder path defined in OUTPUT_FILE logical schema, I can see the data file created there (they are all fake/random rows, not real people information):

I hope you liked this post. See you soon!

1 Comment »

ODI Hidden Gems – Unique temporary object names

Posted in Gems, ODI, Tips and Tricks with tags Gems, ODI, Tips and Tricks on March 11, 2021 by Rodrigo Radtke de Souza

Hi all, I was not going to write about this one because I thought that this “hidden” gem was already known to every single ODI 12 developer out there, but I still get questions on why sometimes some specific data loads fails when they run in parallel, and they work fine when they run in serial. Most of the times, those are related to how ODI handles the temporary objects that it creates to do ETL (like C$, I$, E$ tables).

Let us see one example, which is the default in ODI. I have created one very simple mapping that has one source and one target table.

This mapping is loading data from two databases that resides in different data servers, so it will need to create a C$ table to be able to transfer the data. If we look at ODI Operator, we will notice the following:

The C$ table that it created is named as C$_0SECTIONTYPE. By default, ODI will create this name based on the source component that the data was generated from, so in this case it was a table called SECTIONTYPE. The “0” in front of it is an incremental number that would increase if you had another source with the same name in the mapping. For example, if you had SECTIONTYPE mapped twice as source tables, one would be loaded as C$_0SECTIONTYPE and the other C$_1SECTIONTYPE. ODI does that so we do not have a clash between names within the same ODI mapping.

However, what would happen if you tried to run the same mapping or another mapping that also contains SECTIONTYPE as source at the same time? As you may imagine, one mapping would interfere in the other, since both C$ tables would be called C$_0SECTIONTYPE and both mappings would be trying to load/read/drop it at the same time, which would cause a failure (in a good scenario) or wrong data (in a bad scenario).

To avoid this kind of issue to happen, ODI 10/11 developers were very creative in the past and would create some Java variables and some tweaks in some ODI KMs to make the temp table names dynamic. However, ODI 12 introduces something way simpler to handle this kind of situation, however, its not default and it is kind of “hidden”. If you go back to the mapping, click on the Physical tab, and scroll all the way down. You will notice a check box that says “Use Unique Temporary Object Names” that is unchecked. If you check this one and run it again, you will see the difference.

ODI now created a table named C$_0SECTIONTYPEAIHLNQMAVPK7Q1FR66UM225DF3, which is totally unique, and it will never clash with another mapping running in parallel. But then, another question arises: should I go to each mapping to check this option, if I want all to be unique? Well, the answer is no. There is another “hidden” gem that you can use.

Go to ODI Topology and double click your ODI agent, then go to Properties.

Luckily, you may enable the use of unique temporary objects at an ODI agent level, so you do not need to go back to each mapping and changing them.

That is it for today. See ya!

1 Comment »

Generate JSON objects with ODI in a very easy way

Posted in ODI, Tips and Tricks with tags ODI, Tips and Tricks on November 9, 2020 by Rodrigo Radtke de Souza

Hi all. I came across a requirement to create Json objects from a set of Oracle tables. We have ODI, so it was natural that the solution would be created there. However, working with JSON objects in ODI is not that easy. For those who already worked with “Complex File” technology knows what I am talking about. Too much setup and any misconfiguration will cause an error that is generally very hard to troubleshoot.

In the past, I did several mappings that would read from Complex Files (including Json), but this time it was an outbound process, so I would need to create the Json objects, not read from them. I tried to search on the internet, but nothing was clear. I was not sure if the Complex File technology would work to create outbound files and I was not in the mood to play with XSD files this time, so I needed to find some other solution.

Talking to a friend/co-worker of mine, he asked me if we could not leverage the Oracle’s JSON_OBJECT function somehow in ODI. First, I did not know what that function was about, so I researched about it:

The SQL/JSON function JSON_OBJECT takes as its input one or more property key-value pairs. It returns a JSON object that contains an object member for each of those key-value pairs.

It looked very interesting. So, I give it a try in SQL Developer. It worked very well:

Basically, you may add any number of columns to the function and it will create a valid JSON object out of it. Pretty neat, right? Now it is just a matter of how to add it to ODI, which thankfully is very easy. First, I created a model in the File technology:

It contains only one column, with 4000 characters. Then I created a mapping and mapped all the required columns to the JSON_OBJECT function, like this:

That’s it. Pretty simple. When we run the mapping, we have an outbound file like this:

Now you may send the outbound file to any application that needs to consume JSON objects. One thing that you may need to consider is that you will probably get into trouble if your Json objects ends up being larger then 4000 characters each. I didn’t test it, but either Oracle or ODI will probably complain about it.

I hope you have liked it. This is not a “traditional” way to create Json objects, but for sure it is the easiest one! See you next time!

Leave a comment »

Oracle SQL for EPM Tips and Tricks S01EP14

Posted in ODI, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags Oracle ACE, oracle Database, SQL, Tips and Tricks on October 15, 2020 by RZGiampaoli

Hey guys how are you? Continuing our SQL series (S01EP13), today I’ll share a very hand little query that I use very often for check data duplication. In fact, this would be an upgrade version of ODI’s pk check.

An upgrade version because in ODI, if you enable PK check, if he finds duplication, he eliminate both data. This code I’ll show you, you would choose if you want to keep the last created duplication or the oldest one, but only one will be eliminated.

I have a test table with this values:

If I want to check for duplicate PK, I can just run this query here:

The Idea here is, we have 2 queries. The first one will check if the ROWID it has is bigger or smaller (you choice) than the MIN or MAX ROWID (depending of your previous choice) than the second sub query by any joins you want to check.

In this case, we wanted to check only if the PK column had duplicated values, but we could check any other column by just replace it in the join. In fact, we could have any amount of columns in the join and that would check if there’s any duplications in all columns you inserted there.

Then you can select the first using > and MIN or the last by using < and MAX as well you can select what column you want to check in the where clause.

One important thing to mention is that this query is meant to work as a delete because it’ll keep what was not in the select. What I mean is, if you have more than one duplication, it’ll bring, in this case, all the rows that has the ROWID > then the one selected in the first query:

Then if I have multiple duplications, the query will return everything that needs to be deleted and the only one remaining was the first one inserted (3, Chuck, Giampaoli).

I hope you enjoy this little trick and see you soon.

Leave a comment »

Oracle Always Free cloud offering (Part 2)

Posted in ACE, Autonomous Database, Cloud, Data Warehouse, Oracle, Oracle Database with tags Autonomous Database, Cloud, OCI, Oracle, oracle Database, Tips and Tricks on May 18, 2020 by RZGiampaoli

Hey guys how are you? Today I’ll continue to talk about the Oracle Always free cloud offering and I’ll try to summarize what you can do after your account is set up. If you want to know how to setup you account you can find it HERE.

After you receive an email saying everything is set you can login in your account and you’ll see a screen like this:

This is the main dashboard. Here’s where you’ll create your Database, your VM’s, convert your account to paid, manage your account, ask for help, etc… Let’s start with the main dashboard:

(1) Quick Actions: Here you’ll find the most important links as quick actions.
- (2)Compute: This is where you can create a VM to be used with your databases. You can use it to install tools and develop whatever you want inside the your environment.
- (3)Networking: Here’s where you set up your cloud network. This is the first step you must do to ensure your VM and databases will be in the same network and reaching each other.
- (4)Autonomous Transaction Processing: This is where you create a transaction database.
- (5)Autonomous Data Warehouse: This is where you can create your Data Warehouse database.
- (6)Search: A quick way to view all your resources.
(7)Account Center: Here’s a quick place to manage your account and see how many credits you have and billing information
(8)Main Menu: This is the main menu where you have access to everything that you can do inside your Cloud.

(9)Top Bar: Where you can change regions, in case you have more than one region, access the Cloud Shell (for OS commands), see the help, ask for help in the chat, change language and see your profile.
(10)Start Exploring: Here’s a place where you can find articles to help you start setting up your environment.
(11)What’s new: And finally here’s where you can see news about Oracle cloud, like releases and things that will be added.

One important thing to add here is that before you add anything or create anything, look for the “Always Free Eligible” logo or description to be sure you’ll not buying anything by mistake. Now about the main menu:

Core Infrastructure: Here’s where you can set your VM’s, networks and storage options.
Database: Here’s where you can Set your databases options, backups and Servers (VM or Bare metal).
Data and AI: Here’s where you can set your Big Data and AI environment.
Solution and Platform: Here’s where you can set your Analytics cloud services, Integrations, monitoring and marketplace.
More Oracle Cloud Services: Here’s where you have other cloud services.
Governance and Administration: And here is where you can administrate your environment like provisioning security, Account Management, Identity and Governance.

As you can see there’s a lot that can be done, but we’ll concentrate in the “Always Free” content, but the following list summarizes the Oracle Cloud Always Free-eligible resources that you can provision in your tenancy:

Compute (up to two instances)
Autonomous Database (up to two database instances)
Load Balancing (one load balancer)
Block Volume (up to 100 GB total storage)
Object Storage (up to 20 GiB)
Vault (up to 20 keys and up to 150 secrets)

In the next post we’ll setup our environment. See you soon guys.

1 Comment »

Fragmented and Aggregated tables in OBIEE using ODI Part 2/5: Managing the partitions of Fragmented/Aggregated tables

Posted in OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Performance, SQL, Tips and Tricks with tags OBIEE, ODI, ODI Architecture, Tips and Tricks on February 10, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE and today we are talking about how to Populating the Fragmented tables using ODI.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

In my previous post we design 18 tables and its partitions and sub-partitions. Now what we need to do is to make ODI manage these partitions for us. As I said before, the Sub partitions will be static and if we need to create one more sub-partitions, we’ll be doing that manually. That’s always the best choice because manage sub-partitions add a layer of complicity that is not needed.

With that said, if you need to manage the sub-partitions, you can use the exactly same approach I’ll show here just adapting to the needs of the sub-partition’s creation and sub-partition template update.

Let’s start analyzing what needs to be done. For each table we need to issue this command to the database:

For List partitions:

ALTER TABLE table_name ADD PARTITION partition_name VALUES (value[, value]…);

For Range Partitions:

ALTER TABLE table_name ADD PARTITION partition_name VALUES LESS THAN (value[, value]…);

Sounds hard but in fact is very simple and it’ll going to work in any SQL you have. To do that this we will use a procedure with our ALTER TABLE statement in the Command on target tab and a SELECT in the command on source to pass the information we need to create the partitions like the table names, the partition name and the partition values.

For each row that the SELECT on source returns, the statement in the Command on target will be executed, and we can pass the values from the SELECT in the Command on Source to change the behavior of the statement in the Command on target.

This is something very important that we need to understand about ODI. ODI It’s not an ETL tool but a code generator tool that orchestrate Database and a lot of other tools if you will.

What I mean by that is, until ODI send the code for the database to execute, everything that ODI does is to generate code. If we use the ODI API it’ll replace it before send to the database as well to replaces the variable with their values, meaning a variable like this #SQL on the Command on Target tab could contain an entire dynamically generate INSERT that ODI would replace before send it to the Database.

And this is why ODI is so powerful, because we can manipulate string and make everything dynamic.

Then our query in the source should return something like this:

The query will change depending of your design of the application then I don’t think sharing my query will going to help. One tip I can give you is always compare the name of the partition with the content of the column PARTITION_NAME on ALL_TAB_PARTITIONS view to see if the partition you want to create already exists to prevent an error in ODI, in fact you have 2 choices, either do this or flag the step as Ignore Error to end the step with a warning. I prefer to compare and end the step without warnings, then I compare.

Another thing to notice is that for FY20 Feb, my PARTITION_VALUE (MAX_PERIOD) is the 1 day after the last day of the month. It’s hard to see here because the calendar used by the client is all different but if you have the normal calendar and the mask you have for you PERIOD_ID is YYYYMMDD, for 2020/Feb you partition ID needs to be 20200230 (Feb ends 02/29/2020).

That’s because this is a RANGE partition, and for RANGE partition we define the partition value as VALUES LESS THAN, what means that everything under 20200230 will be FEB unless we have a smaller partition (JAN) below it. But for example, if you smallest partition is 2018/Jan, everything below that period will be inserted in the 2018/Jan partition, then you need be careful with when you’ll start to create partitions.

Another important thing to mention is that you cannot insert a partition smaller than an existing partition, just bigger. That means, if you start with 2018/Jan, it’s impossible to create a 2017 partition, the only way is to drop the table and create the table starting by 2017. A RANGE partition needs to be sequential and always growing.

Having these values returning from the SELECT on the Command on source tab, the only thing that is left is to use these variables in the statement in the command on Target tab like this:

ALTER TABLE #TABLE_NM ADD PARTITION PARTITION_NM VALUES LESS THAN (#MAX_PERIOD);

Pay attention on the values of the partitions. In this case the values are numeric, that is why I’m not using quotes. If the values were String, we need to enclosure the variable with quotes.

And that’s it, this is all we need to do to manage partitions using ODI. Because the SELECT on the source returns All tables with their partitions name and values and we pass to the TARGET statement this information, for each row in the source the target will be looped, creating the partitions you need for each table.

We are done for the fragmented tables. Now we need to manage the partitions for the Aggregated table. The approach is the exactly the same, the only thing that will change is the content of the query from the Command on Source tab and that the PARTITION now will be a LIST Partition instead of a RANGE partition.

We could have done just one query to return all values including if that was a RANGE or a LIST partition, but for the sake of simplicity is always better to split the scenarios to not over complicate something simple.

For the month level we need the query in the Source to return: