Archive for March, 2019

ODI “Command on Source” buffered behavior

Posted in ODI, Tips and Tricks on March 28, 2019 by Rodrigo Radtke de Souza

Hi all! This post came from a question that a friend asked me a couple of days ago. He asked me the following:

  • I know that we may create a procedure with a SQL in “Command on Source” that returns N rows and triggers an OS command in the “Command on Target” tab for each of those rows, passing the results as parameters. My process takes a while for each row, so I was wondering: could I insert more rows into the table that is being read by “Command on Source” while it is still executing, so it would pick up the new rows as well in the same execution?

In other words, he wanted to trigger the ODI procedure once and keep “feeding” the “Command on Source” table multiple times, so all his OS commands would get executed in a single procedure run. Instinctively I said no, because ODI needs to somehow “buffer” the “Command on Source” results (which may come from a SQL statement over several tables) and only then start to run the “Command on Target” commands. He agreed with me and moved on. However, the question stayed in my head: what would happen if the source table somehow changed while the ODI procedure was running? What if more rows were inserted, or if the table was truncated or its rows deleted? I did some tests to make sure I had given him the right answer.

I created a procedure whose “Command on Target” contains only an ODI sleep command, which is executed once for each row that comes from the source, like this:

[Image 1: ODI procedure, “Command on Target” tab with the sleep command]

Then in the Source I added the following:

[Image 2: ODI procedure, “Command on Source” tab with the SQL that reads the source table]
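For reference, here is a minimal sketch of what the two tabs could contain (the table and column names are hypothetical, not the exact ones from the test):

    -- Command on Source (technology: Oracle) - hypothetical test table
    SELECT row_id FROM odi_cmd_source_test

    -- Command on Target (technology: ODI Tools) - runs once per source row;
    -- source columns can be referenced here (e.g. #ROW_ID) if you need to pass them to an OS command
    OdiSleep "-DELAY=1000"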

I populated this table with 20 records and I executed the proc. As expected, it took 20 seconds to complete:

[Image 3: execution log showing the procedure completing in 20 seconds]

Then I did the following test: I executed the proc and right away inserted 20 more rows into the source table. As I thought, the procedure finished again in 20 seconds, not 40, which means that ODI really does buffer the results before executing the Command on Target:

[Image 4: execution log, still 20 seconds after inserting 20 more rows mid-run]

But 20 is a small number; maybe ODI could buffer it all right away, and that is why it worked. I looked in Topology, and the Array Fetch Size for this connection was set to 250:

[Image 5: Topology, Array Fetch Size set to 250 for the connection]

I did another test, this time with 1,000 rows, decreasing the wait time from 1000 to 100 milliseconds to see how it would go. Both executions (one with just the 1,000 rows and one where 1,000 more rows were added while it was running) ended in 104 seconds (the four extra seconds are probably a delay due to network, ODI usage and so on):

[Image 6: execution log showing both 1,000-row runs completing in 104 seconds]

So I changed my approach and tried something different: what if I add a lot of rows (100,000), change the delay to ‘1’ and, while the procedure is running, truncate the table? The result kind of surprised me:

[Image 7: execution log showing the error raised after the table was truncated]

Truncate is a DDL command, so that is why the error says that the object no longer exists, although it is still there in the DB (it just no longer exists for that session, since it was modified by a DDL command in another session). This was not exactly the test I wanted to run, and yet it did surprise me, because I was expecting that ODI would have already buffered all the results (the error happened after only 51 seconds) and would not be sensitive to certain DDL commands. However, this may indicate that ODI does not buffer everything at once, and when it tried to read the table again, it had already been truncated. So, let’s do another test.

For the next test I doubled the row count to 200,000, kept the delay at ‘1’ and ran the procedure. While it was running, I deleted the rows instead of truncating the table. The delete ran fine, taking 18 seconds (less than the 51 seconds of the truncate test), and no errors happened this time. So it seems that ODI buffered all the results (which had doubled) within those 18 seconds.

[Image 8: delete of the 200,000 rows completing in 18 seconds while the procedure was running]

After the delete was committed, the proc continued to run and finished in around 200 seconds, as expected.

[Image 9: execution log showing the procedure finishing in around 200 seconds]

I did one more test to see if the time it took to fail in the truncate test would increase if I increased the number of rows. I inserted 200K rows again (double the amount of the previous truncate test), ran the process and truncated the table. It failed after the same 51 seconds. So I believe that, although ODI can buffer all the results well before the 51-second mark, Oracle somehow tells the process that the table was changed by a DDL command and sends a “stop” signal to the connection after a certain amount of time. I don’t see any other explanation for this behavior.
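Putting the two scenarios side by side, this is roughly what was run in a second session while the procedure was executing (same hypothetical table name as before):

    -- DML while the procedure runs: works fine
    DELETE FROM odi_cmd_source_test;
    COMMIT;
    -- the procedure keeps reading its buffered rows and finishes normally

    -- DDL while the procedure runs: breaks it
    TRUNCATE TABLE odi_cmd_source_test;
    -- after ~51 seconds the procedure fails with an "object no longer exists" kind of error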

I could run some other tests, especially to see how large the ODI buffer actually is, but for now I’m OK with the results. In summary, we figured out that:

  • ODI does buffer the results of the “Command on Source”, and we cannot modify them once the procedure starts;
  • Although the results are buffered, the ODI procedure may fail if DDL commands are issued against the source tables;
  • DML commands do not seem to affect the buffered results, as expected.

If any of you have done tests like this, please share with us! These kinds of things are never documented, so we need to keep testing to see how they work behind the scenes.

Thanks! See you soon!


ORACLE SQL for EPM tips and tricks S01EP03!

Posted in Query, SQL, Tips and Tricks on March 26, 2019 by Rodrigo Radtke de Souza

Hi all! Continuing the Oracle SQL for EPM series, today’s post is quite simple, but this kind of issue may consume an extreme amount of time when we are asked to troubleshoot “why don’t these numbers match” scenarios. It is related to UNION and UNION ALL operations. Let me describe what happened to me in one of those situations.

The client had a table with several columns that calculated some metrics related to their business. It was a “cumulative” type of table, where metrics were aggregated on top of each previous period’s numbers. In a very summarized way, let’s use the following example:

[Image 1: sample metric rows up to Feb-19]

So, for Feb-19, the SUM would be 150 for Account 1 and 60 for Account 2. Next month, he would get the following:

[Image 2: sample metric rows including Mar-19]

His logic summed the March period for Account 1 correctly (30) and added it to the previous 150. However, since Account 2 had no rows coming in March, his SQL was not reporting Account 2 in March at all. To make the calculations easier, he decided to add a “dummy” metric of 0 for all existing Accounts, so his logic would calculate correctly even if a record did not exist for that period. Something like this:

[Image 3: sample rows with the “dummy” 0 metric added for every account]

The process would still give the correct value of 30 for Account 1 in March and 0 for Account 2, which would then be summed against the previous periods. It all worked fine, until one day someone complained that the numbers could not be right and that some values were missing. When I checked the code, I quickly spotted his mistake: he had combined his periodic metrics and his “dummy” metrics using a UNION in Oracle. But why was it giving the wrong numbers? Oracle explains:

  • UNION combines the results of two queries, which eliminates duplicate selected rows. The UNION operator returns only distinct rows that appear in either result.

Let’s picture the problem. His logic worked fine for Feb and Mar, but in Apr, something like this happened:

[Image 4: Apr-19 sample rows]

If you sum the Apr period for Account 1, the number should be 80, but he was getting only 60, as below:

[Image 5: query result showing only 60 for Account 1 in Apr-19]

This is due to UNION’s behavior: it runs an implicit DISTINCT over the combined dataset, which in this case is eliminating good data. I went ahead and changed the UNION to UNION ALL, about which Oracle states:

  • The UNION operator returns only distinct rows that appear in either result, while the UNION ALL operator returns all rows. The UNION ALL operator does not eliminate duplicate selected rows.

The result is the following:

[Image 6: query result with UNION ALL, showing 80 for Account 1 and 0 for Account 2]

Now it looks correct: 80 for Account 1 and 0 for Account 2.
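To reproduce the behavior in isolation, here is a minimal sketch (the accounts, periods and values are hypothetical, just mimicking the example above):

    WITH periodic_metric AS (
        -- two legitimate rows with the same value, plus a third one, for Account 1 in Apr-19
        SELECT 'ACCOUNT 1' AS account_id, 'APR-19' AS period_name, 20 AS metric_value FROM dual UNION ALL
        SELECT 'ACCOUNT 1', 'APR-19', 20 FROM dual UNION ALL
        SELECT 'ACCOUNT 1', 'APR-19', 40 FROM dual
    ), dummy_metric AS (
        -- the "dummy" 0 rows created for every existing account
        SELECT 'ACCOUNT 1' AS account_id, 'APR-19' AS period_name, 0 AS metric_value FROM dual UNION ALL
        SELECT 'ACCOUNT 2', 'APR-19', 0 FROM dual
    )
    SELECT account_id, period_name, SUM(metric_value) AS total
      FROM (SELECT * FROM periodic_metric
            UNION          -- implicit DISTINCT: one of the 20s is dropped, Account 1 totals only 60
            SELECT * FROM dummy_metric)
     GROUP BY account_id, period_name;
    -- changing UNION to UNION ALL keeps both 20s and Account 1 totals the expected 80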

That’s it folks! Simple things may give us enormous headaches and wrong numbers, so please always double-check when you see a UNION in a query! It may be implicitly dropping some good data.

See ya!

ORACLE SQL for EPM tips and tricks S01EP02!

Posted in ACE, Connect By, EPM, Oracle Database, Performance, Query, SQL, Tips and Tricks, WITH Clause on March 21, 2019 by RZGiampaoli

Hey guys, how are you? Let’s continue the SQL for EPM series. Today I’ll continue to talk about WITH, with a small bonus of CONNECT BY :). Let’s start.

A lot of people use CONNECT BY on a daily basis but, as far as I have seen, most of them don’t know how to use it properly. I’ve already lost count of how many people I’ve heard complaining about performance issues with CONNECT BY.

The thing is, CONNECT BY works a little differently than everything else in Oracle. We can say that CONNECT BY has 2 stages, and we’ll see why I’m saying that with this example. Let’s get back to our metadata table and do a CONNECT BY to extract the Balance Sheet hierarchy from the Juno application:

As we can see, inside this table we have more than one application and more than one hierarchy for each application. That’s OK, we just need to filter it in our SQL, right?

If we filter by APP_NAME and HIER_NAME we’ll get all the accounts for that application, which generates 12,622 rows. By the way, this table has all the metadata from all our applications, and we always filter by APP_NAME and HIER_NAME to select what we want (the table is also partitioned and sub-partitioned by these 2 columns). It’s important to know that, without filtering anything, this table has:
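Something in the order of 2,260,372 rows. As a quick sketch of those two counts (the table name and filter values here are hypothetical):

    -- full table: every application and hierarchy
    SELECT COUNT(*) FROM tbl_metadata;                 -- ~2,260,372 rows

    -- one application / one hierarchy
    SELECT COUNT(*)
      FROM tbl_metadata
     WHERE app_name  = 'JUNO'
       AND hier_name = 'BS';                           -- 12,622 rows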

OK, now, if we want to get just the BS hierarchy, we just need to do the CONNECT BY, right?
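Something along these lines, with the hierarchy filters left in the WHERE clause (a sketch; the parent/child column names are hypothetical):

    SELECT member, parent_member, LEVEL AS gen
      FROM tbl_metadata
     WHERE app_name  = 'JUNO'        -- filters applied only AFTER the hierarchy is built
       AND hier_name = 'BS'
     START WITH member = 'BS'
    CONNECT BY PRIOR member = parent_member;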

That works… perfect… or not? Well, in fact, this is the wrong way to use CONNECT BY, because of what I said before: the 2 stages.

As you can see, this query took 25 seconds just to return the first 50 rows. In an integration this will take way more time; in fact, if you join this table to a data table to do a SUM at the BS level, it will take ages to return.

The reason is that, for the CONNECT BY, Oracle first does everything that is after the words CONNECT BY and after the words START WITH and then, and only then, it applies what is in the WHERE condition. That means it first did the CONNECT BY over those 2,260,372 rows (and they are all repeated) and only after all that processing did it filter what we wanted, that is, the APP_NAME and the HIER_NAME. So the right way to use it is:
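In the same sketch, that means moving the filters into the START WITH and CONNECT BY conditions:

    SELECT member, parent_member, LEVEL AS gen
      FROM tbl_metadata
     START WITH member = 'BS'
            AND app_name  = 'JUNO'
            AND hier_name = 'BS'
    CONNECT BY PRIOR member = parent_member
            AND app_name  = 'JUNO'
            AND hier_name = 'BS';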

Now it looks way better: 0.375 seconds to do exactly the same thing as before, and the only thing I did was move our filters to the right place. Now Oracle is filtering and doing the CONNECT BY at the same time.

Now, if you do a SYS_CONNECT_BY_PATH and want to get just the leaves (so you have the complete path the hierarchy takes), you can filter the leaves in the WHERE clause (and the leaf filter needs to be there, otherwise you will not have the entire hierarchy during the CONNECT BY). This is how:
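Sketching it with the same hypothetical names, the leaf filter stays in the WHERE clause while the application/hierarchy filters stay inside the CONNECT BY:

    SELECT CONNECT_BY_ROOT member           AS root_member,
           SYS_CONNECT_BY_PATH(member, '/') AS member_path,
           member                           AS leaf_member
      FROM tbl_metadata
     WHERE CONNECT_BY_ISLEAF = 1            -- applied after the hierarchy is built: keeps only the leaves
     START WITH member = 'BS'
            AND app_name  = 'JUNO'
            AND hier_name = 'BS'
    CONNECT BY PRIOR member = parent_member
            AND app_name  = 'JUNO'
            AND hier_name = 'BS';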

Now you see that the CONNECT BY filtered what needed to be filtered during the CONNECT BY execution and, afterwards, it filtered just the leaves (using CONNECT_BY_ISLEAF, which returns whether a member is a leaf or not).

Also, I used CONNECT_BY_ROOT to generate the root member used in this query (BS) and SYS_CONNECT_BY_PATH to generate the entire path of the metadata (very useful for transforming parent/child tables into generation tables using this technique and a regexp [we’ll see this in another post]).

OK, now that the “bonus” is written, let’s talk about the WITH clause, which was the main subject here. Even with the CONNECT BY written the right way, with the filters in the right place, we can still improve performance using WITH.

That’s right: the idea is to prepare our subset of data using WITH before we ask Oracle to do the CONNECT BY, leaving the CONNECT BY itself as simple as possible. Let’s take a look:
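A sketch of that idea, still with the same hypothetical names: the WITH clause does all the filtering up front, and the CONNECT BY only sees the already-reduced subset:

    WITH bs_metadata AS (
        SELECT member, parent_member
          FROM tbl_metadata
         WHERE app_name  = 'JUNO'
           AND hier_name = 'BS'
    )
    SELECT CONNECT_BY_ROOT member           AS root_member,
           SYS_CONNECT_BY_PATH(member, '/') AS member_path,
           member                           AS leaf_member
      FROM bs_metadata
     WHERE CONNECT_BY_ISLEAF = 1
     START WITH member = 'BS'
    CONNECT BY PRIOR member = parent_member;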

This is by far the best way to use a CONNECT BY clause. Instead of using WITH you can use a sub-query, but I think this way is easier and more organized as well. Also, I know the time difference doesn’t look too big between the previous example and this one, but when you join this with data and start to SUM everything, you’ll see a huge difference between this method and the previous one.

Also, sometimes Oracle gets lost with the previous method, making everything slower, but with the WITH method that never happens, so I advise you to start using it.

I hope you guys enjoy this little tip. See you next time!

DEVEPM will be at Kscope19!

Posted in DEVEPM, Kscope, Kscope 19, ODI on March 14, 2019 by Rodrigo Radtke de Souza

Hi all, how are you doing? We are very happy to announce that once again DEVEPM will be at Kscope! We are very honored to be selected to present at the best EPM conference in the world! We got one presentation and a panel in, so here is what we are going to present at Kscope19:

OAC and ODI! A Match Made in…the cloud?

  • OAC stands for Oracle Analytics Cloud Services, and it’s another cloud solution offered by Oracle. It provides you with a lot of analytic tools for your data. The question is, do you need to be 100% cloud to use OAC services?
    Well, with ODI we always have options, and OAC is no exception.
    In this presentation we’ll take a look at three different ways to use ODI to integrate all your data with OAC, ranging from using your existing on-premises environment to a 100% cloud solution (no ODI/DB footprint in your environment).

205, Level 2 => Tue, Jun 25, 2019 (09:00 AM – 10:00 AM)

EPM Data Integration Panel

Is there a functional issue that you’ve been trying to solve with Cloud Data Management or FDMEE that you just can’t seem to break through on? Are you about to kick off a new project or phase and need to validate that Data Management or FDMEE is the right tool for your needs? Are you about to subscribe to an Oracle EPM SaaS offering and want to know your data integration options? Or do you just want to give some feedback about the product and features that will help you increase your utilization of it?

201, Level 2 => Wed, Jun 26, 2019 (11:45 AM – 12:45 PM)

Kscope is the largest EPM conference in the world, and it will be held in Seattle in June 2019. It will feature more than 300 technical sessions, five symposiums, deep dive sessions, and hands-on labs over the course of five days.


Interested? If you register by March 31st you’ll take advantage of the Kscope early bird rates. Don’t waste any more time; come be part of the greatest EPM event in the world. If you are still unsure about it, read our post about how Kscope/ODTUG changed our lives! Kscope is indeed a life-changing event!