ORACLE SQL for EPM tips and tricks S01EP03!
Hi all! Continuing the Oracle SQL for EPM series, today’s post is quite simple, but it may consume an extreme amount of time when we are requested to troubleshoot “why these numbers does not match” type of scenarios. Its related to UNION and UNION ALL operations. Let me describe what happened to me in one of those situations.
The client had a table with several columns that would calculate some metrics related to their
business. It was a “cumulative” type of table, where metrics were being aggregated by each previous period’s numbers. In a very resumed way, lets use the following example:
So, for Feb-19, the SUM would be 150 for Account 1 and 60 for Account 2. Next month, he would get the following:
His logic was summing the March period in Account 1 correctly (30) and summing it to previous 150. However, since Account 2 was not coming in March, his SQL was not reporting Account 2 in March. To make the calculations easier, he decided to add a “dummy” metric for all existing Accounts as 0, so his logic would calculate it correctly even it the record did not exist for that period. Something like that:
The process would still give his correct value of 30 in Account 1 for March and 0 for Account 2, which would then sum against the previous periods. It all worked fine, until someday someone complained that the numbers could not be right and some numbers were missing. When I checked the code, I quickly realized his mistake: he created his “dummy” metrics using a UNION in Oracle against his periodic metric and his “dummy” metric. But why it was giving the wrong numbers? Oracle explains:
- UNION combines the results of two queries, which eliminates duplicate selected rows. The UNION operator returns only distinct rows that appear in either result.
Let’s picture the problem. His logic worked fine for Feb and Mar, but in Apr, something like this happened:
If you sum Apr period for Account 1, the number should 80, but he was getting only 60 as below:
This is due to UNION’s behavior: It will run an implicit distinct in the combined dataset, which in this case is eliminating good data. I went ahead and changed the UNION to UNION ALL, which Oracle states:
- The UNION operator returns only distinct rows that appear in either result, while the UNION ALL operator returns all rows. The UNION ALL operator does not eliminate duplicate selected rows.
The result is the following:
Now it looks correct: 80 for Account 1 and 0 for Account 2.
That’s it folks! Simple things that may give us enormous headaches and wrong numbers, so please always check out when you see an UNION in the queries! It may be implicitly omitting some good data there.
See ya!
April 9, 2019 at 1:28 pm
[…] the Oracle SQL for EPM series, today we’ll extend a little bit the “Connect by” post to see some neat thing we […]