Archive for August, 2013

Dynamic topology configuration in ODI without using contexts

Posted in EPM, ODI Architecture with tags , , , , on August 12, 2013 by radk00

— EDITED on Aug 12, 2015 —

Great news! Oracle just released “Patch 21156142: Parametric connections in ODI 12c don’t work anymore” which enables this technic to be used in ODI 12.1.3! Thanks Marius for this great tip!!!!

— End EDIT —

— EDITED on May 6, 2015 —

Unfortunately, this technic only works on ODI 11 version. ODI 12c does not support dynamic password settings. We will let you know if we figure out a workaround for it.

— End EDIT —

Hi all, how are you doing? I came across this interesting situation during a project some time ago. Imagine the following scenario: You have a transactional system that was implemented in several places around the word. The system and its tables are exactly the same, but they are distributed across different databases, with different user names, database schemas, passwords, etc. Your task is to load data from this system with all these different database regions into your data warehouse, unifying all this data into one single place. Since all the tables have the same structures, how can you accomplish that with ODI without creating a lot of duplicate interfaces?

Everybody that already worked with ODI will easily answer that it may be accomplished using different contexts and they are totally right as we can see below:

image1

In resume, you create one interface object loading a table from the source system into your data warehouse and each context that you have created will point to a different database. It is just a matter to run that interface for each context and that’s it. However, there are situations where we cannot create that many contexts in a real production environment. In my case I couldn’t create those contexts because production execution repository is used for several other projects where those new contexts would just not make any sense at all. In production, the users have just one context called “Production” and they run any ODI job using that context, independently of the project which that ODI scenario belongs to. We could explain and teach the users to use multiple contexts, but my experience as a developer taught me that we need to let things as simple as possible to the end users, because there will be a day (for sure) that someone will execute a job using a wrong context and a mess may be created.

So, how can we accomplish this task without duplicating one interface per database that you need to load? Here is the technique:

Create one physical data server for each database that you need to load plus a dynamic data server:

image2

ADB_AMERICAS, ADB_APJ, ADB_EMEA and ADB_ZZ are the different databases that you need to connect and load data from. Those data servers are created as you normally create any data server in ODI and they will contain their own information regarding connection, data and work schemas and so on. In ADB_DYNAMIC you will create as the following:

Data Server:
User: #ODI_PROJECT.CONN_USER
Password: #ODI_PROJECT.CONN_PASS
JDBC URL: #ODI_PROJECT.CONN_URL

Physical Schema:
Schema: #ODI_PROJECT.CONN_DATA_USER
Work Schema: #ODI_PROJECT.CONN_WORK_USER

image3

image4 image5

As you can see, we have added five ODI variables in the topology information. These variables will be used to receive all the dynamic information from the different databases that we need to load. Now create one logical for each physical schema that you have created and point them accordingly in your context. Notice that we are also creating a logical schema called ADB_DYNAMIC that is going to point to our dynamic physical schema.

image13

Now this is an important step. Your models in ODI need to be pointing to ADB_DYNAMIC logical schema to make this work, but if you did not develop your interfaces yet and you also need to reverse your models and so on, you will not be able to do it pointing to this dynamic topology, since it has just ODI variables and not actual values there. So you will need to point your ADB_DYNAMIC logical schema temporally to a “normal” physical schema, let’s say ADB_AMERICAS to develop your interfaces. If you do this way, you will be able to reverse your models using ADB_DYNAMIC logical schema and you will be able to test all your interfaces pointing to one “valid” database without any additional work. After you complete your entire development, just point ADB_DYNAMIC logical schema back to ADB_DYNAMIC physical schema and proceed to the following step.

Now comes the tricky part. In order to make this technique to work, you will need to have a “control/parent” scenario that will call any “interfaces/child” scenarios passing as parameters all the connection information for each different database that you need to load. Let’s see an example:

image6

This child scenario contains all interfaces that will load your DW and it is receiving as parameters all connection information that will be used in ADB_DYNAMIC data server/physical schema. Remember that at this point, all interfaces are pointing to a physical schema that has only ODI variables on it, not actual values, so if you try to execute this package alone, it will not work. Now it is just a matter to call this scenario multiple times, one for each database that you need to load. First create a parent scenario with a procedure that will call those child scenarios as below:

image7

This procedure will have two steps for each database that you need to load. The first step will get one specific database connection and the second step will call the child scenario passing that connection information. It will look like this:

image8

If we open the first step, we will have the following:

image9

This step is creating Java variables that will temporally hold the connection information. These API commands are getting the connection information from the Command on Source tab, so in that tab we will have the following:

image10

We don’t need any code here. We just need the Schema combo box pointing to the desired database, in this case ADB_AMERICAS. This will allow all Java variables to get ADB_AMERICAS information that will be used in the next step which is the following:image11

This step will call the child scenario passing as parameters all the connection information that is needed to be used in ADB_DYNAMIC connection. When this child scenario runs, all interfaces inside of it will use ADB_AMERICAS connection information and will load data from it dynamically. Cool isn’t it? You are going to do the same configuration for the other steps in this procedure and the only thing that is going to change is the logical schema that you need to point in the “Command on Source” table. So “Get ADB_APJ Connections” will get ADB_APJ connection information and so on.

If you notice, this example is calling the child scenarios using SYNC_MODE=2, which means that they will execute the data load in all different databases in parallel, so if your architecture allows such parallelism, you may gain a great performance boost using this technique as well. We also added a “Wait for child scenarios” step as the last step, so we may control when all of the child scenarios have completed or not:

image12

So here is the end of this post. I hope you all enjoy!

Advertisements