Archive for July, 2019

ODI Hidden Gems – Exception Handling – Timeout(s)

Posted in ODI, Tips and Tricks with tags , on July 26, 2019 by Rodrigo Radtke de Souza

Hi all,

Today’s hidden gem is the “Exception Handling – Timeout(s)” option which is located at the Load Plan steps:

1

There are certain situations where we may have a very strict load window, which we cannot go over a certain limit of time. If this situation happens, the data load should abort before something bad happens. Luckily, ODI Load Plans have a very easy mechanism to handle timeout situations and most people are not aware of it. In every load plan step, we may add a timeout value which is the maximum time (in seconds) that this step takes before it is aborted by the Load Plan. When a timeout is reached, the step is marked in error and the Exception step (if defined) is executed.

It seems simple, but it can be very powerful, since this setting may be applied to any parent step (even the root step). In this case, we may have a safeguard to avoid a potential long running/overlapping situation for the entire load plan. As for example, if you want to be sure that the entire load finishes within 8 hours, just add a timeout value to the root step (28800 seconds) and it will stop in case it reaches this value. In daily execution load plans, you may set it to 86400 seconds (24 hours), so it does not overlap with the next daily execution. In the following screen, I set a 10 seconds timeout, and this is the error that is triggered when it reaches the timeout setting value.

2

That’s if folks, see ya!

Advertisement

ODI Hidden Gems – Log Steps in the Journal

Posted in ODI, Tips and Tricks with tags , on July 11, 2019 by Rodrigo Radtke de Souza

Hi all,

Today we will talk about “Log Steps in the Journal” option. This one resides the “Advanced” tab of every step inside an ODI package, as you can see below. Every step in a package appears in the execution log while being executed, but we may define whether the step should be kept in the journal after its execution is finished or not. The available options are:

  • Never: the step is deleted from the journal.
  • Always: the step is always kept in the journal.
  • Errors: the step is kept in the journal only if it failed. Otherwise, it is deleted.

1

Although it seems a very simple option, it has some good PROS and CONS about using it. Let’s talk about the PROS first. Imagine that the above scenario was going to loop the same procs 10 times. You would end up with a log like this:

2

You may want to keep the log of all those procedures executions, so you know what they did (like how many rows did they insert/delete/update). However, all the steps related to the loop variable are kind of useless, since they are only used to control the “loop” over the steps. In this case, if you wish to keep a cleaner log, you may set both variable steps to “Never”, like below:

3

Now, when you execute this package, you will have a much cleaner log:

4

However, this option comes with a bad CONS: I already saw several situations where people were trying to debug an execution in Operator and they were not understanding how the values were being assigned or they were not sure how some stuff were being populated if they don’t see any step in Operator related to that. After some time lost wondering about it and then double checking the same package in development, they would realize that someone had put that step to never log (sometimes even by accident). So, anytime that you are trying to debug something in Operator, and it seems weird or missing pieces, please make sure to look on the development package as well, since some steps may be set to Never log.

That’s it folks. A quick post today. See ya!

ODI Hidden Gems – Degree of Parallelism for Target (DOP)

Posted in ODI, Tips and Tricks with tags , on July 5, 2019 by Rodrigo Radtke de Souza

Hi all,

If you read our posts, you know that we like to write “series” of them. We think it’s a good motivation for us to focus on some topic and keep writing about it. So, let me begin with a brand-new series called “ODI Hidden Gems”. We will be talking about those small configurations, check boxes and settings that most of the people just ignore them or don’t even know that they exist, but they can be of great value.

ODI is a great tool, it has a lot of options and anyone may survive without knowing all of them, however, there are some that may shine and gives you better data load performance, tool development usability and so on. Without further delay, let’s talk about Degree of Parallelism for Target (DOP).

1

This setting resides on ODI Data Server component within Topology together with Array Fetch Size and Batch Update Size and all the three are often misleading. First thing to notice is that each of those settings happens in either on SOURCE, TEMP (C$) or TARGET databases (and here is where the mislead happens).

  • Array Fetch Size: This setting is only used when the data server is used as a SOURCE. When reading large volumes of data from a data server, Oracle Data Integrator fetches successive batches of records. This value is the number of rows (records read) requested by Oracle Data Integrator on each communication with the data server.
  • Batch Update Size: This setting is only used when the data server is used as a TARGET. When writing large volumes of data into a data server, Oracle Data Integrator pushes successive batches of records. This value is the number of rows (records written) in a single Oracle Data Integrator INSERT command.
  • Degree of Parallelism for Target: Although the name suggests TARGET, this setting is only used on the TEMP (C$) part when the data server is used as a TARGET. Indicates the number of threads allowed for a loading task, in other words, in C$ population from the source database to the target database. Default value is 1. Maximum number of threads allowed is 99.

So, when you want to optimize all three parameters, you will probably change in two different data servers (source and C$/target) and not only in one data server, as most people try to do. Also, when we talk about ODI Data Server DOP, which is a number that represents the number of parallel threads, we are talking exclusively about the C$ piece of the integration, so it’s not related to SOURCE/TARGET at all.

Let me give you one example to make it clearer. If you are not aware from where I’m getting the following details, please notice that, every time you have a “Load Data” step from Server A to Server B, ODI creates a “Details” tab at the Operator Task Level with a lot of useful information. This is also another free Hidden Gem.

Source: SQL Server, TABLE_A has 2,261,393 rows
Target: Oracle, TABLE_B will be loaded from TABLE_A.
Topology: Target Data Server is set to DOP 1.

3

We can see that it took 75 seconds to load this data, but the Wait time on source connection was 27 seconds. It means that, ODI was able to get data fast from the source database, but it needed to wait for the target thread to be available, so it could send more data in. Also, target DOP is one, so only one thread worked to load this data to the C$ table.

Source: SQL Server, TABLE_A has 2,261,393 rows
Target: Oracle, TABLE_B will be loaded from TABLE_A.
Topology: Target Data Server is set to DOP 16.

5

Now we see some gain. The wait time is 0.610 seconds in the source and the target threads were able to load all of them to C$ table in 35 seconds, running 16 threads in parallel. You may even try to reduce this load times further by changing Array Fetch Size (in the source Data Server) and Batch Update Size (in the target Data Server), but those two settings I’ll leave to another Hidden Gem post.

See you later!