Archive for the Uncategorized Category

Essbase Statistics DW: How to automatically administrate Essbase using ODI (Part 9: Analyzes)

Posted in Uncategorized on February 23, 2021 by RZGiampaoli

Hey guys, how are you? Continuing the Essbase Statistics DW: How to automatically administrate Essbase using ODI series (Part 9 Here), today we’ll talk about the analyzes we can do with all the data we have so far.

With all this information in our DW we can do a lot of things. We could automatically set the caches using ODI and Maxl or Esscmd but change settings in Essbase is more complicated than just an update.

We need to restart the applications, we need to take in consideration the design of each cube, we need the feeling and knowledge of the administrators, but can be done.

In fact, we can create thresholds to see if the cache is 20% off what it supposed to be and then apply the changes at night. We can also start to record the sessions (be aware that the tables get very big very quick), and analyze the retrieve/calculation times to see if the performance is getting better or not and them send a alert in case it was not after a automatic chance or something like that.

But what we can really do is provide precise information about each cube. We send a daily email for each administrator saying the actual stats for each cube, the actual setting of each cache, the recommended size, and we also send a table with more information about that application.

Attached we can add an excel with the raw data for further analysis.

This helps a lot the administrators to have a better understanding of our cubes and to decide and plan better maintenance strategies.

Of course, we can have a dashboard (and this would be the best approach) with all the statistics there showing the growing of the cube, projections, alerts of when the cubes will reach the maximum cache size, well, anything that could be needed.

The important thing here is to realize how useful this DW would be for keep our Essbase cubes healthy. And keep in mind that is a lot of other stats that can be gathered from Essbase that can do a lot for us.

For example, we can get the dimension settings and number of members and use ODI (or any language like java) to manipulate the outline and test the best Sparse/Dense settings for our cubes.

I hope you guys enjoy our series, be safe and see you soon.

Essbase Statistics DW: How to automatically administrate Essbase using ODI (Part 4: Dynamic Calculator Cache)

Posted in Uncategorized on February 4, 2021 by RZGiampaoli

Hey guys, how are you? Continuing the Essbase Statistics DW: How to automatically administrate Essbase using ODI series (Part 3 Here), today we’ll talk about Dynamic Calculator Cache.

The Dynamic calculator cache basically it is a buffer to store all uncompressed blocks on memory to calculate the dynamic members in the dense dimensions. Expanded Blocks including: Store, Never Share, Dynamic Calc members and dynamic time series members.

This is important because show how everything is related in Essbase. If you set the CALCLOCKBLOCK to 1000 you need to be sure that the data cache can hold 2000 uncompressed blocks. If not doesn’t matter the setting you put, Essbase will put half what fits in the data cache

We also need the size of the expanded blocks is all the store members plus the dynamic calc members and dynamic time series members together plus 8.

We also need the largest expanded block across all database on the machine and the maximum number of expected concurrent users. This can be analyzed by gathering the session information into our DW as well, but for this post it’ll be a constant number.

This information can be acquired by:

EASMAXLESSCMDJAVA
Data Cache Right click in a cube edit properties->Storagequery database sample.basic list all file information;listfiles “” “sample” “basic”;maxl.execute(“query database ” + s_application + “.” + s_cube + ” list all file information”);
Amount of members Right click in a cube edit properties->DimensionsNoneNoneselection.executeQuery(dimension.getName(), IEssMemberSelection.QUERY_TYPE_DESCENDANTS, 0, null, null, null);

In resume Dynamic Calculator Cache is:

  • A buffer in memory that Essbase uses to store all the blocks needed for calculate a Dynamic Calc member in a dense dimension
  • To find the optimal size of this cache we need:
    • CALCLOCKBLOCK size: it is half the number of expanded blocks that fit into the data cache
    • Expanded Blocks including: Store, Never Share, Dynamic Calc members and dynamic time series members
    • The largest expanded block across all databases on the machine.
    • The maximum number of expected concurrent users
      • Can be analysed by gathering the session info into a table and analyse the patterns but for this presentation is a constant number based in experience
  • To calculate the Maximum Calculator Cache, we need to multiply:
    • CALCLOCKBLOCK: (Data Cache in bytes (already calculated) / Size of the expanded Block in bytes) / 2
    • The largest expanded Block in bytes on the server
    • The maximum number of expected concurrent Users (Constant)

And that’s it, we talked about the 5 most important caches in Essbase, how do we calculate them and how we get them.

So far, we talked about these caches and how they are calculated:

Cache TypeDescription
IndexA buffer in memory that holds index pages (.ind files)
Data fileA buffer in memory that holds compressed data files (.pag files). Essbase allocates memory to the data file cache during data load, calculation, and retrieval operations, as needed. (only when direct I/O is in effect)
DataA buffer in memory that holds uncompressed data blocks. Essbase allocates memory to the data cache during data load, calculation, and retrieval operations, as needed
CalculatorA buffer in memory that Essbase uses to create and track data blocks during calculation operations
Dynamic calculatorA buffer in memory that Essbase uses to store all of the blocks needed for a calculation of a Dynamic Calc member in a dense dimension

And here we have a resume of all calculations needed to have the exactly amount of cache per application we have:

Cache TypeDescription
Indexnumber of existing Blocks * 112 bytes = the size of database index
Data fileCombined size of all essn.pag files, if possible; otherwise, as large as possible
Data0.125 * the value of Data File Cache size
CalculatorBitmap size in bytes * Number of bitmaps:
Bitmap size in bytes: Max((member combinations on the bitmap dimensions / 8), 4)
Number of bitmaps: Maximum number of dependent parents in the anchoring dimension + 2 constant bitmaps
Dynamic CalculatorC * S * U:
C is the value of the appropriate CALCLOCKBLOCK setting. (Data Cache in bytes / Size of the expanded Block
S is the size of the largest expanded block across all databases on the machine. Multiply the number of members in each dense dimension together and multiply by 8 bytes
U is the maximum number of expected concurrent users

From now on we’ll going to see how we can create a process to extract and handle this information and how can we create a DW and use it to keep our Essbase apps with the right cache all the time.

I hope you guys enjoy it and see you soon.

Essbase Statistics DW: How to automatically administrate Essbase using ODI (Part 3: Calculator Cache)

Posted in Uncategorized on February 3, 2021 by RZGiampaoli

Hey guys, how are you? Continuing the Essbase Statistics DW: How to automatically administrate Essbase using ODI series (Part 2 Here), today we’ll talk about Calculator Cache.

The concept behind the Calculator cache is a little tricky but I think is very nice to understand since it gives us a very good tip about Essbase design. This cache is used to track what Essbase already calculated and what he didn’t yet.

Essbase split the sparse dimensions in 2 groups:

  • Bitmap dimensions: they are all the sparse dimension that fits inside the bitmap
    • You can think about this bitmap as a matrix that will hold all members from one or more sparse dimension, depending of the available memory. Each member occupies 1 bit of memory. (it’s like a check that says if the member was calculated or not)
  • Anchoring dimensions: For the rest of the sparse dimensions that didn’t fit in the bitmap. The anchoring dimension will define the number of bitmaps that Essbase can create at same time, if there’s enough memory.

The calculator cache controls the size of the bitmap, what means, the number of sparse dimensions that can fit in the bitmap and the amount of anchoring dimensions depends of the size calculator cache

But how it works? Essbase will start from the first sparse dimension in the outline and will try to fit as much dimensions possible inside the bitmap until he cannot put the next sparse dimensions completely inside the bitmap.

All the remain sparse dimension becomes anchoring dimensions.

The number of bitmaps that can be created at same time is determined by the maximum number of dependent parents in the anchoring dimensions.

Dependent parent is the number of parents that is above a leaf member. We need to find the member that has the biggest number of levels above him, meaning, the max number of dependent parents is the same as the number of Levels in a dimension

If the leaf member with most parents above him has 3 parents the level of that dimension will also be 3, then for dependent parents we just need to ask for the level of the dimensions.

The size of the bitmap and the number of anchoring dimensions depends of the size of the calculator cache.

What that means? Let’s consider this outline for example:

Sparse Dims# of MembersDependent Parents
S110NA
S210NA
S320NA
S430NA
S52003

In this outline we have 5 sparse dimensions, 4 flat ones (S1 to S4) and one with a 3-level hierarchy (S5).

We have 3 options of cache settings that we can choose from:

  • Single anchoring multiple bitmaps.
  • Single anchoring single bitmap.
  • And multiple anchoring single bitmap.

To use the first option, we need to get the first 4 sparse dimensions from the outline and multiply all of them, split the result by 8 to transform it in bytes and this will define the size of the bitmap.

After that we need to define how many bitmaps we will have at same time in memory. For that we need to get the level of the anchoring dimensions and sum 2 on it. (2 is a constant number). This will define the number of bitmaps.

Finally, we just need to multiply the size of the bitmap by the number of bitmaps and this will give us the size of the cache we must set to use option 1.

The other options are a smaller version of the first one. Option 2 we don’t have multiple bitmaps, we just set the anchoring to 1, and option 3 we fit in the bitmap less dimensions.

We can see the calculations for each option below:

OptionCalculator Cache
Single anchoring, multiple bitmaps(10*10*20*30)/8=7500
3+2=5
7500*5=37500 bytes
Single anchoring, single bitmap(10*10*20*30)/8=7500
1
7500*1=7500 bytes
Multiple anchoring, single bitmap(10*10*20)/8=250
1
250*1=250 bytes

The interesting thing here is to see why is important to have our outline designed as hourglass for this cache. If we put the most complex sparse dimension in other position than the last one, we will never be able to have concurrent bitmaps and the size of the bitmap will be huge.

Then after this trick explanation the only information we need to calculate the calculator cache are:

  • The amount of member in each sparse dimension.
  • The amount of levels in the biggest sparse dimension.
    • The biggest sparse dimensions should be placed in the last place on the outline.

All this information can be acquired by:

EASMAXLESSCMDJAVA
Amount of members Right click in a cube edit properties->DimensionsNoneNoneselection.executeQuery(dimension.getName(), IEssMemberSelection.QUERY_TYPE_DESCENDANTS, 0, null, null, null);
Levels Double click in outline-> count the parentsNoneNoneIEssDimension dimension = (IEssDimension) otl.getDimensions().getAt(i);     IEssIterator level = dimension.getLevels();

Then resuming everything we saw in this post:

  • Calculator cache is a buffer in memory that Essbase uses to create and track data blocks during calculation operations
  • Essbase split the sparse dimensions in 2 groups:
    • Bitmap dimensions: Any sparse dimensions that can be fit inside the bitmap. Each member combination placed in the bitmap occupies 1 bit of memory, and the entire dimension must fit inside the bitmap.
    • Anchoring dimensions: The remaining sparse dimensions that do not fit into the bitmap.
  • Because the calculator cache controls the size of the bitmap, the number of sparse dimensions that can fit in the bitmap and anchoring dimensions depends on the size of the calculator cache
  • Essbase starts with the first sparse dimension in the outline and fits as many as possible into the bitmap.
  • Essbase stops the process when it cannot fit another complete sparse dimension into the bitmap.
  • All the remaining sparse dimensions becomes anchoring dimensions
  • The number of bitmaps used is determined by the maximum number of dependent parents for any members in the anchoring dimension
  • The max number of dependent parents is the same as the number of Levels in a dimension
  • Depending of the cache setting we can have 3 options
    • Single anchoring multiple bitmaps.
    • Single anchoring single bitmap.
    • And multiple anchoring single bitmap.
  • And to calculate everything we just need 2 things:
    • The amount of member in each sparse dimension.
    • The amount of levels in the biggest sparse dimension.
      • The biggest sparse dimensions should be placed in the last place on the outline.

I hope you guys enjoy it, be safe and see you next time.

Essbase Statistics DW: How to automatically administrate Essbase using ODI (Part 2: Data File Cache and Data Cache)

Posted in Uncategorized on February 2, 2021 by RZGiampaoli

Hey guys, how are you? Continuing the Essbase Statistics DW: How to automatically administrate Essbase using ODI series (Part 1 Here), today we’ll talk about Data File Cache and Data Cache.

The data file cache is deprecated, and I don’t know if much people use it, but it’s used to define the maximum size of the data cache. The data file cache is used to hold compressed data files (essn.pag) in memory and is used only when direct I/O option is in effect. To find it maximum size we need to sum all essn.pag files, if possible; otherwise, as large as possible.

We can find this information in EAS edit properties, storage or Maxl query database file information or in Esscmd list files or java using Maxl as we can see here:

EASMAXLESSCMDJAVA
Right click in a cube edit properties->Storagequery database sample.basic list all file information;listfiles “” “sample” “basic”;maxl.execute(“query database ” + s_application + “.” + s_cube + ” list all file information”);

Now let’s talk about the data cache. The name looks almost the same, but his usage is very different. The data cache is used hold uncompressed data blocks in memory. It has a direct relationship with the Data file cache and the recommended size would be 12.5 % of the summing of all essn.pag files (or 0.125 * Data File Cache size. That is why we even talked about the data file cache and calculated its size before.

The same thing that happens with the index cache happens here with the data cache. If we have too much historical data, we may decrease the optimum size to fit the data usage.

We gather this information in the same way as the data cache file:

EASMAXLESSCMDJAVA
Right click in a cube edit properties->Storagequery database sample.basic list all file information;listfiles “” “sample” “basic”;maxl.execute(“query database ” + s_application + “.” + s_cube + ” list all file information”);

Then to resume what we talked about in this post:

  • The data file cache is used to hold compressed data files (essn.pag) in memory and is used only when direct I/O option is in effect.
  • It is calculated by summing all essn.pag files, if possible; otherwise, as large as possible.
  • The data cache is used hold uncompressed data blocks in memory.
  • The recommended size would be 12.5 % of the summing of all essn.pag files (or 0.125 * Data File Cache size).
  • Same case as Index cache, just the working blocks are used (no historical).

I’ll stop here since we have a big one coming and I want to address the Calculator Cache alone in the next post.

Be safe and see you soon.

Essbase Statistics DW: How to automatically administrate Essbase using ODI (Part 1: Index Cache)

Posted in Uncategorized on February 1, 2021 by RZGiampaoli

Hey guys, how are you? Today we’ll going to talk about how to automatically administrate Essbase using ODI.

As you guys know, in order to have good performance in an Essbase cube, we must keep vigilance and follow up its growth and its data movements, so we can distribute caches and adjust the database parameters accordingly. But this is a very difficult task to achieve, since Essbase statistics are not temporal and only tells you the cube statistics is in that specific time frame.

The performance of an Essbase app has a direct relationship with the configuration setting that we set and must be based on the design and usage of each cube. To find out the optimal values for these configurations we need to look at the Essbase statistics.

The problem with this is that the Essbase statistics is a snapshot of the cube at that moment and every time we shut down a cube the run time statistics are cleared, meaning that these statistics are more or less accurate depending of the amount time the cube is online.

That also means that if we look in the statistics after a restart of the cube, we can get the wrong information and we can make a poor setting of caches and other things making the cube slow or even crashing them.

The best thing we can do to make this process more reliable is to create a historical DW with the Essbase statistics information, this way we can create a process to keep improving Essbase setting in a more pragmatically way extract, analyze, change setting and so on.

These are the most important caches that we can set, but we can do all kind of things with the information we’ll gathering from Essbase.

Cache TypeDescription
IndexA buffer in memory that holds index pages (.ind files)
Data fileA buffer in memory that holds compressed data files (.pag files). Essbase allocates memory to the data file cache during data load, calculation, and retrieval operations, as needed. (only when direct I/O is in effect)
DataA buffer in memory that holds uncompressed data blocks. Essbase allocates memory to the data cache during data load, calculation, and retrieval operations, as needed
CalculatorA buffer in memory that Essbase uses to create and track data blocks during calculation operations
Dynamic calculatorA buffer in memory that Essbase uses to store all of the blocks needed for a calculation of a Dynamic Calc member in a dense dimension

Let’s start with the Index cache. Index cache holds in memory the indexes that Essbase uses to find requested data.

Every time we retrieve or calculate any data from Essbase, first Essbase try to find that index in the index cache. If the index is not there, Essbase will find the index in the index files, get the data and store this index in the index cache memory to make it faster for the next time we need the same information. For each index that Essbase store there, it will take 112 bytes of memory.

Them to calculate the maximum size of this cache we just need to multiply the number of existing blocks by 112 bytes and that’s it.

Just to make sure everyone is in the same page: a Block is the combination of all store like members from all dense dimensions. Essbase creates a new Block when a new unique combination of data in the sparse dimensions is inserted/calculated.

This table shows the different ways to get the information we need from Essbase. As you can see, we don’t need to know java/C++ or any language to do extract this information, we can just use maxl, esscmd or even manually using EAS.

EASMAXLESSCMDJAVA
Right click in a cube edit properties->Statisticsquery database sample.basic get dbstats data_block;select sample basic; getdbstats;(IEssCube) cube.getPropertyNames();

One interesting thing about this cache (and almost every cache in Essbase) is that, as I said, it holds in memory the indexes that Essbase used at least once to find the requested data. That means, we will only find in the cache, indexes that we used at least once since the last time the cube starts.

Then, if we have a cube with historical data, the chances of these old data never hit the index are bigger then the new data. We need to take this in consideration when we set this cache.

Historical Data Usage

In this case, 70% of the cube contains historical data that are rarely requested, meaning, we need just 30% of the maximum size for the index cache.

A way to figure out if the cache is well set is looking at the hit ratio in EAS, but we still have the issue that this info reset every time when the cube restarts.

If we are resuming what we just said about Index Cache, we have the follow:

  • The index Cache is used to hold in memory all used indexes since the application is turned on.
  • The maximum index cache size is calculated by multiplying the number of existing Blocks * 112 bytes
    • A Block is the combination of all store like members from all dense dimensions. Essbase creates a new Block when a new unique combination of data in the sparse dimensions is inserted/calculated
  • The optimal index cache will vary depending of the design and use of the cube.
    • If the cube has a lot of historical data, the chances of these index never hit the cache are big
    • For that we can define a historical percentage that we can remove from the maximum index cache to find a more realistic cache.
    • This historical percentage can be obtained by analyzing the outline and see how many members are in use

That’s it for today. Next post we’ll talk about the Data File Cache. Stay safe and see you soon.

How to use Jython to send a dynamic HTML table email from ODI (part 2/2)

Posted in Uncategorized with tags , , , , on May 5, 2020 by RZGiampaoli

Hey guys how are you? I hope you guys are not insane after this 2 months of quarantine. Anyway, is time for us to finish the send email job. In the previous post HERE I explained the Jython code and the HTML code that we need to use to create our HTML table in our email. Today we’ll going to do it become dynamic.

As we saw, for every row we want it we need to have a block of HTML code that will draw the table, color the table and write the content of the cell in our table. We need this to change dynamically if we want to be useful for us, and to do that we need to write a SQL code to create this HTML code for us.

In my case I generate this code here to be my header:

<TR>
<TH COLSPAN="16" ALIGN="CENTER">Restatements Control - Group ID: 1</TH>
</TR>
<TR>
      <TH>Session Name</TH>
      <TH>Interface Name</TH>
      <TH>Year</TH>
      <TH>Feb</TH>
      <TH>Mar</TH>
      <TH>Apr</TH>
      <TH>May</TH>
      <TH>Jun</TH>
      <TH>Jul</TH>
      <TH>Aug</TH>
      <TH>Sep</TH>
      <TH>Oct</TH>
      <TH>Nov</TH>
      <TH>Dec</TH>
      <TH>Jan</TH>
      <TH>Adj</TH>
</TR>

This is saying that I’ll have 16 columns (COLSPAN=”16″) with Center alignment and the name of the table will be “Restatements Control – Group ID: 1” (where the 1 will be dynamically generated as well).

Now we first need to write a query to get this info for us. Since this is a very project related query, I don’t think it’ll do any good for you guys to put my query here, but I’ll explain what I was looking for. First I’m querying the ALL_TAB_PARTITIONS to get all partitions related with that table. Then I was querying a control table that every time the jobs run, it inserts in this table the period loaded, if there’s errors or not, the log folder path and the interface that run the job.

After that I do a FULL OUTER JOIN between this 2 tables to see all partitions I have and how many of these partitions were already executed. Next I PIVOT the information to get a table like data and the results is similar to this:

I created some simple Status code to make easy to manipulate later. NP is “No Partition”, N is “Never Run”, Y is “Warning”, R is “Error” and G is “Success”. Also, when is Y or R I have the Log Path associated with that run, this way the users can click and go to the log folder of that execution.

In my case this is important because this is for an restatement process where the business want to restate the entire past and we have millions of rows per partition, and they want flexibility to run as fit. Then we need to track the executions over time.

Now, the only thing that needs to be done is to convert this information in HTML code. This is easy since we just need to concatenate strings all over the place. Let’s see how I have done it:

The result is one big string for each row the query results. Each column was concatenated with a “Enter” between than, so when this code is used, we’ll have proper indentation for readability. This is the query I used to concatenate everything:

SELECT  '</TR>'||'
        '||'<TR ALIGN="CENTER">'||'
        '||'<TD>'||SESSION_NAME||'</TD>'||'
        '||'<TD>'||INTERFACE_NAME||'</TD>'||'
        '||'<TD>'||YEAR||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(FEB, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(FEB, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(FEB, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(FEB, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(MAR, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(MAR, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(MAR, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(MAR, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(APR, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(APR, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(APR, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(APR, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(MAY, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(MAY, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(MAY, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(MAY, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(JUN, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(JUN, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(JUN, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(JUN, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(JUL, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(JUL, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(JUL, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(JUL, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(AUG, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(AUG, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(AUG, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(AUG, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(SEP, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(SEP, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(SEP, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(SEP, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(OCT, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(OCT, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(OCT, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(OCT, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(NOV, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(NOV, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(NOV, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(NOV, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(DEC, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(DEC, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(DEC, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(DEC, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(JAN, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(JAN, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(JAN, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(JAN, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
        '||'<TD bgcolor='||DECODE(REGEXP_SUBSTR(ADJ, '[^|]+', 1, 1), 'R', '"#FC6456">', 'Y', '"#FCDB56">', 'G', '"#56FC6C">', 'NP', '"#FFFFFF">', 'N','"#8ABDFF">')||
        DECODE(REGEXP_SUBSTR(ADJ, '[^|]+', 1, 1),'Y','<a href="file:///'||REGEXP_SUBSTR(ADJ, '[^|]+', 1, 2)||'">Log</a>','R','<a href="file:///'||REGEXP_SUBSTR(ADJ, '[^|]+', 1, 2)||'">Log</a>')||'</TD>'||'
'||'</TR>' AS SCRIPT

Basically is a lot of DECODE’s to convert my STATUS code in colors and some REGEXP to split the STATUS from the Log path. That’s it for SQL. Now the only problem we have is that this is a very big string and the only way for us to store this is to use a PL/SQL because inside a PL/SQL a Varchar2 (32767 bytes) variable is bigger than inside SQL (4000 bytes).

We just need to create a simple PL/SQL to insert and concatenate all this rows into a CLOB that is a little big bigger (4 GB). To do that we just need to do something like this:

DECLARE
 
CURSOR C_HTML_TAG IS

	SQL HERE;
 
V_HTML_BODY CLOB;
 
BEGIN
 
    FOR DADOS IN C_HTML_TAG LOOP
      V_HTML_BODY := V_HTML_BODY || TO_CLOB(DADOS.SCRIPT);
    END LOOP;
              
    INSERT INTO FDM_ODI_RUN.TMP_HTML_BODY_DW (HTML_BODY) VALUES (V_HTML_BODY);
              
END;

That’s it, now for the easy part, use it in ODI. To do so we’ll have a command in the SOURCE querying the TMP_HTML_BODY table and then we’ll pass #SCRIPT info to our Jython target code:

import smtplib
 
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
 
mailFrom = "#FROM_MAIL"
mailSend = "#ADMIN_MAIL"
 
msg = MIMEMultipart()
msg['Subject'] = "Restatements Control (<%=odiRef.getContext("CTX_NAME")%>)"
msg['From'] = mailFrom
msg['To'] = mailSend
 
 
html = """\
<!DOCTYPE html>
<html>
  <head></head>
  <body>
  <TABLE style="float: left;margin-left:10px" BORDER="1"  WIDTH="80%" CELLPADDING="1" CELLSPACING="1">
   <TR>
     <TH COLSPAN="5" ALIGN="CENTER">Legend</TH>
   </TR>
    <TR>
        <TR ALIGN="CENTER">
        <TH WIDTH="20%" bgcolor="#FC6456">Error</TH>
        <TH WIDTH="20%" bgcolor="#FCDB56">Warning</TH>
        <TH WIDTH="20%" bgcolor="#56FC6C">Success</TH>
        <TH WIDTH="20%" bgcolor="#FFFFFF">No Existing Partition</TH>
        <TH WIDTH="20%" bgcolor="#8ABDFF">Not Loaded Yet</TH>
    </TR>
    </TABLE>
<TABLE style="float: left;margin-left:10px" BORDER="1" WIDTH="80%"  CELLPADDING="1" CELLSPACING="1">
   <TR>
     <TH COLSPAN="16" ALIGN="CENTER">Restatements Control - Group ID: #GROUP_ID</TH>
   </TR>
<TR>
      <TH>Session Name</TH>
      <TH>Interface Name</TH>
      <TH>Year</TH>
      <TH>Feb</TH>
      <TH>Mar</TH>
      <TH>Apr</TH>
      <TH>May</TH>
      <TH>Jun</TH>
      <TH>Jul</TH>
      <TH>Aug</TH>
      <TH>Sep</TH>
      <TH>Oct</TH>
      <TH>Nov</TH>
      <TH>Dec</TH>
      <TH>Jan</TH>
      <TH>Adj</TH>
   </TR>

   #SCRIPT

</TABLE>
  </body>
</html>
"""
 
part = MIMEText(html, 'html')
msg.attach(part)
 
s = smtplib.SMTP('#SMTP_SVR')
 
s.sendmail(mailFrom, mailSend.split(','), msg.as_string())
 
s.quit()

ODI will than replace the SCRIPT with all HTML code we created and will turn this into a generic HTML table and send it by email using Jython.

I hope you guys enjoy it and see you soon.

How to use Jython to send a dynamic HTML table email from ODI (part 1/2)

Posted in Uncategorized with tags , , , , , , on April 21, 2020 by RZGiampaoli

Hey guys how are you? Today I’ll talk a little bit how can we create a dynamic HTML table email from ODI using Jython.

First of all, let me give you a little bit of context. I had to build an ODI process to restate the past data in our DW. That means, the business wanted, to a certain point in time, to go back all the way to the first period we have in our DW and restate the data based in a map table that they provided.

That’s all right, the biggest problem is that this table is partitioned by Source System and Period, and the business wanted the process to be flexible enough to let them run 1 period and 1 source system at time or to run an range of period and ALL sources at time (and any combination of these 2).

Also all right, my problem now is how to provide the business with a reliable way to tell them what they already run, what is still pending, if we had an error in a period or if there’s some validation fall outs in a period. In other words, how to track the process during execution.

My answer to that, I decide to send a email with a table that shows the source system and years in the rows and the months in the columns, and based in a color code, I paint the cells based in the status of the execution.

This post will be about the Jython/HTML code we wrote and the next post will be about how to make it dynamic in ODI. Let’s start it with the Jython part:

import smtplib

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

mailFrom = "ODI Services <donotreply@ODI.com>"
mailSend = 'email@here.com'

msg = MIMEMultipart()
msg['Subject'] = "Subject here"
msg['From'] = mailFrom
msg['To'] = mailSend

html = 
"""\

HTML CODE HERE

"""

part = MIMEText(html, 'html')
msg.attach(part)
s = smtplib.SMTP('SMTP SERVER HERE')
s.sendmail(mailFrom, mailSend.split(','), msg.as_string())
s.quit()

This is everything you need to have in your procedure to send a HTML code by email. It’s a very simple code, basically we import “smtplib” lib and that will handle the email sending, after that we just need to inform the user, password and SMTP server and use the “sendmail” to send the email. Pretty straight forward.

Now, in the meddle of the code, we have the HTML part that needs to be included. In our case, it’ll be a table. To test the HTML code, you can google “HTML test runner” that it’ll bring a lot of places in the internet where you can run your HTML code and test to see the results. It’s pretty handy, and I’m using this one here.

To create a simple table in HTML we just need this code here:

This code is also fairly simple and basically we have:

  1. <TABLE> tag, where you define the margins, border size, width of the table, cell padding and cell spacing. There’s more options there but you can easily find in the HTML doc.
  2. <TR> tag, where you define the amount of columns using the COLSPAN property as well the alignment of the text there
  3. <TH> tag, where we define the cells of our table itself. There’re a lot of properties for this but I’m using juts a fix 20% width for each cell, just to size them the same (since I have 5 columns), the Color of the cells and the message I want to send.

This is my legend table that will come above my real table, but the configuration is the same in both cases. We’ll have one <TR> block for each line we want to have and as much <TH> lines we need for each cell we want to have. In the end my final table is like this:

As you can see, I send an email with all periods that needs to be restatement showing if the interface already ran, if that was a success, or it had warnings or errors (with the link straight to the error file, if it was not loaded yet and even if we don’t had the partition created for that period/source.

Now, as I said, we need one <TR> per line and, in this case, 16 <TH>, one per cell. As you can imagine, that’s a lot of code that needs to be write there. thanks god I’m using ODI to do that for me, and we’ll take a look on this in the next post.

Thank you guys and see you soon.

How to “Save As” an Essbase Outline in ODI using Java

Posted in Uncategorized with tags , , , on April 20, 2020 by RZGiampaoli

Hey guys how are you?

Today I’ll going to show you how to “Save As” an essbase outline using ODI and Java. I normally use Maxl and OS commands to do this kind of things but turns out, there’s no Maxl to do that.

In fact, this is very interesting even if you don’t like java, because it’ll show exactly what Essbase does behind the scenes to save a outline. Let’s start.

First of all we’ll going to need some essbase API libraries. In ODI, the Client and the Agent already include some Essbase Jars in the Lib folder (one Lib folder for the Client and one for the Agent).

If you need anything outside what you have there you need to copy from essbase to the Lib folders and restart the agent. In this case we’ll need to import these:

import com.essbase.api.base.EssException;
import com.essbase.api.datasource.IEssCube;
import com.essbase.api.datasource.IEssOlapFileObject;
import com.essbase.api.datasource.IEssOlapServer;
import com.essbase.api.domain.IEssDomain;
import com.essbase.api.metadata.IEssCubeOutline;
import com.essbase.api.session.IEssbase;

After that we need to create a few String variables to help us organize our code:

String s_userName = <%=odiRef.getInfo("SRC_USER_NAME")%>;
String s_password = <%=odiRef.getInfo("SRC_PASS")%>;
String olapSvrName = <%=odiRef.getInfo("SRC_DSERV_NAME")%>;
String s_provider = "Embedded";
String appNameFrom = "Juno_M";
String appNameTo = "Juno";
String database = "Analysis";
IEssbase ess = null;
IEssOlapServer olapSvr = null;

Since I’m using an ODI procedure, I can set in the Command on Source tab the Essbase connection I want and then I can get in the Command on Target the User name, password and the server name as well, using the ODI substitution API, this way I can use what is store in the Topology without to worry about hard-code any password in the code.

In the next step we need to connect in Essbase using:

ess = IEssbase.Home.create(IEssbase.JAPI_VERSION);

IEssDomain dom = ess.signOn(s_userName, s_password, false, null, s_provider);

olapSvr = dom.getOlapServer(olapSvrName);

olapSvr.connect();

Basically what this is doing is to instantiate and essbase server, connection in the domain using the Command on Source information and then connect into a specific Olap server. After this we are ready to start execute some commands. And now it gets interesting. This is exactly what essbase does behind the scenes:

  1. It Locks the Outline we want to copy:
    • olapSvr.lockOlapFileObject(appNameTo, database, IEssOlapFileObject.TYPE_OUTLINE, database);
  2. It does an OS File copy from the source app folder to the target app folder:
    • olapSvr.copyOlapFileObjectToServer(appNameTo, database, IEssOlapFileObject.TYPE_OUTLINE, database,”\\server\D$\path”+”\”+appNameFrom+”\”+database+”\”+database+”.otl”,true);
    • As you can see, the command ask for the name of the app you want to save the outline, the type of the object (that is OUTLINE), the folder path for the source Outline and the last parameter is a true or false to tell if we want to unlock the object or to lock. True is unlock
    • If you look into the target folder during this step, you’ll see that Essbase will copy the source .otl to the target folder as .otn
  3. Then we need to open the target outline using:
    • IEssCubeOutline otl = olapSvr.getApplication(appNameTo).getCube(database).getOutline();
    • otl.open();
  4. Last thing you need to do is to merge the .otn into the .otl files:
    • otl.restructureCube(IEssCube.EEssRestructureOption.KEEP_ALL_DATA);
    • otl.close();
    • We just need to ask for the cube to restructure and pass the option KEEP_ALL_DATA, after that we can close the outline

Interesting thing here is that if you get a outline, rename to .otn, put this file inside a folder and force the cube to restructure (via EAS), it’ll automatically merge the created .otn with the .otl.

Also, this is why oracle recommend to have double the size of the cube in memory, because when we do a restructure, we have 2 outlines open at same time, the .otn and the .otl.

Here’s the entire code:

import com.essbase.api.base.EssException;
import com.essbase.api.datasource.IEssCube;
import com.essbase.api.datasource.IEssOlapFileObject;
import com.essbase.api.datasource.IEssOlapServer;
import com.essbase.api.domain.IEssDomain;
import com.essbase.api.metadata.IEssCubeOutline;
import com.essbase.api.session.IEssbase;

String s_userName = <%=odiRef.getInfo("SRC_USER_NAME")%>;
String s_password = <%=odiRef.getInfo("SRC_PASS")%>;
String olapSvrName = <%=odiRef.getInfo("SRC_DSERV_NAME")%>;
String s_provider = "Embedded";
String appNameFrom = "Juno_M";
String appNameTo = "Juno";
String database = "Analysis";
IEssbase ess = null;
IEssOlapServer olapSvr = null;

try {

    ess = IEssbase.Home.create(IEssbase.JAPI_VERSION);
    IEssDomain dom = ess.signOn(s_userName, s_password, false, null, s_provider);
    olapSvr = dom.getOlapServer(olapSvrName);
		olapSvr.connect();
		olapSvr.lockOlapFileObject(appNameTo, database, IEssOlapFileObject.TYPE_OUTLINE, database);
		olapSvr.copyOlapFileObjectToServer(appNameTo, database, IEssOlapFileObject.TYPE_OUTLINE, database,											"#ESSBEXTRACT_FOLDER"+"\\"+appNameFrom+"\\"+database+"\\"+database+".otl",true);
		IEssCubeOutline otl = olapSvr.getApplication(appNameTo).getCube(database).getOutline();
		otl.open();
		otl.restructureCube(IEssCube.EEssRestructureOption.KEEP_ALL_DATA);
		otl.close();

} 
	catch (EssException e) 
{
	System.err.println("Error: " + e.getMessage());
	throw new Exception("Error: " + e.getMessage());
} 
	finally 
{
	/* clean up the connection to essbase */
	if (olapSvr != null && olapSvr.isConnected() == true)
		olapSvr.disconnect();
	if (ess != null && ess.isSignedOn() == true)
		ess.signOff();
}

I hope you guys enjoy this one and see you soon.

Fragmented and Aggregated tables in OBIEE using ODI Part 4/5: Populating the Aggregated tables

Posted in Hacking, OBIEE, ODI, ODI Architecture, Oracle, Oracle Database, Tips and Tricks, Uncategorized with tags , , , , on February 12, 2020 by RZGiampaoli

Hey guys, how are you?

Continuing the series Fragmented and Aggregated tables in OBIEE using ODI and today we are talking about how to Populating the Aggregated tables using ODI.

Just to make easier for you to navigate in this series, here’s the parts of it:

Creating the tables: I’ll explain the advantages and how you can create the fragmented tables and also how to further improve performance using partitioning and sub partitioning.

Managing the partitions of the Fragmented/Aggregated tables: Here you’ll see how to manage all partitions using ODI.

Populating the Fragmented tables using ODI: Here you’ll find how to create generic components to load all fragmented tables we need to.

Populating the Aggregated tables using ODI: Here you’ll find how to create generic components to load all Aggregated tables we need to.

Setting the OBIEE Repository: Finally, we’ll going to setting up the OBIEE repository to make use of all tables.

Today we are in the final step before we can work in our OBIEE repository to put all these 18 fact tables together. The data load for our Aggregated tables.

The aggregation in fact is a very simple process, we just need to remove all detailed columns we have in the detail fact table and leave just the ID’s columns. After that we just need to reduce the level of the ID of the column we want to aggregate and sum all the data at that new level.

In our case we going to use the PERIOD_ID to do that, because period is the most common choice when we talk about aggregated table and serve well in most of the cases.

And that’s why I design the PERIOD_ID as YYYYQMMDD, because is very simple and easy to manipulate this number to go up or down a period, as well to do range or even transform it back to date. It’s way easier than create a surrogate key or whatever, even if you want to work with SCD.

As you probably already guest by now, we’ll use the command on source and command on target again to do the aggregations, this way we can have only one code to spread the data through out aggregate fact tables.

In the command on source for the monthly level table, we just need a query in the source that return the name of the detailed table plus the name of the monthly table. Since I designed all tables with a specific name pattern, we can easily manipulate the table name to get the month table from the detail table like this:

We don’t need anything fancy here because in the last post we create a proc to call six time the same scenario passing different parameter to it.

One of these parameter is the name of the fact table it needs to be loaded and this information is store in the variable #JAT_TABLE_NAME (already replaced by the value in the picture) and what I have done there is just split the table name using REGEXP to get the forth and the fifth occurrences of ‘_’ and concatenate everything back adding a _M_ in the middle of it, creating my Monthly level fact table name.

With data we have the detail table name that the scenario needs to load and also the monthly level fact table name that we need to use for that loop. We just need to create a query to aggregate the data, what’s very straight forward.

As the query will change depending of the design of the table, this one is just intended for explain what needs to be done since the idea is very simple. First we replace the name of the table that  we wish to insert data and the table that we wish to get the data from for our to variables: #JAT_TABLE_NAME that we are sending when we call this scenario and #JAT_M_TABLE_NAME that we just create in the Command on Source tab.

With this, for each loop, the scenario will get the data from the right source table and will insert in the right aggregated table. Now we need to aggregate the data, also a very simple matter. All we need to do is to join the detailed fact table with the period dimension and, because this is a range partition, we need to get the first date of that month and the last date of that month, that’s why we have that MIN and MAX filtered by the Year and Month.

With the right data filtered, we just need to aggregate the data and use the FISCAL_MONTH_ID instead of the PERIOD_ID, this way the data will be aggregated by month. And we are done here.

By the way, we could instead of using the between to get the right range of the partition filtered the partition itself using explicit partition filtering:

This will make oracle to go straight to that partition. The only thing is that you need to pass the partition name, that in our case is very straight forward, specially because we are creating and managing the partitions ourselves. We could have oracle create the partitions automatically using INTERVAL (for another post) but if we do that oracle will also create the names of partitions like SYS###### and that will make everything harder to filter by partitions. Anyway, this is just another option we can have.

For the quarter level it’s even easier because we don’t need to worry about range partitions anymore. We just need to have our Command on Source return the Monthly level table name and our Quarterly level table name:

As you can see, the query is the same, the only difference is that we insert a ‘_Q_’ in the middle of our string. And for the insert in the Target Command tab we just need, as before, replace the tables using the right variable in each case, join with the DIM_PERIOD_MONTH to have the right level of data, filter the Monthly level table using any method and then use the QUARTER_ID to sum the Monthly level data to the Quarterly level data and that’s it.

That’s all we need to do to populate all aggregated table. And we finally have all data populated in our tables and now we can start to create the OBIEE repository. It’s wort to mentioning that if you resume everything that I said until this point, we basically had to:

  • Create our tables;
  • Create one procedure to:
    • Create the partitions;
    • Truncate the partitions before we load data;
    • Call and loop the scenario that will load data;
    • Drop the old partitions
  • Create another procedure to:
    • Load detail data;
    • Load Monthly level data;
    • Load Quarterly level data;

What this means is that with just two procedure we can maintain and populate a very complex environment in a very clean way with very little points of failures.

This is a very elegant approach and as I always said, if is too complex and/or difficult, you were doing something wrong. It is just a meter of think about the design first, all that needs to be done, and split it in a logical and reusable way. If you do that, everything will be simple to be archived using ODI.

I hope this series has been useful so far and I see you in the last chapter.

ORACLE SQL for EPM tips and tricks S01EP01!

Posted in DEVEPM, ETL, Oracle, Oracle Database, Performance, SQL, Tips and Tricks, Uncategorized, WITH Clause with tags , , , , , , on January 21, 2019 by RZGiampaoli

Hey guys how are you? I decide to start a new series called ORACLE SQL for EPM tips and tricks. The idea here is to show the most useful SQL commands for EPM, how to improve performance, tips, tricks and everything that can be useful from a SQL point of view!

And to start well, I’ll show something very old but very useful that I don’t see too many people using these days. “WITH” clause.

I love to use “WITH” in my code. It helps organize the code, helps to optimize it and more important, to make it more efficient.

When you use “WITH” Oracle treats your query inside it as an inline view or resolved as a temporary table, making it easier and faster for Oracle to access that data if you need it again.

Simply putting, every time you needs to right a query that uses the same table over and over, it’ll probably be way more efficient if you use “WITH”.

The “WITH”clause works a little bit different from a regular SQL. We can say that we split the query in 2, one is the “WITH” declaration (That will behave like a table) and the other is the SQL that will Query the “WITH”.

WITH name_of_temp_table_here AS
(
    YOUR QUERY HERE
),
   name_of_temp_table_here2 AS
(
   SELECT *
   FROM name_of_temp_table_here, another_table...
)
SELECT *
FROM name_of_temp_table_here, name_of_temp_table_here2 

In the “WITH” you can have any kind of query you want. You can do joins, group by, you can also have more than one “WITH”, you can use the result of one “WITH” in the next “WITH”, you can do a lot of things.

But for now, lets take a look in a more real example. Let’s say that you have a table like I do, that contains all metadata from all yours applications:

Let’s say you want to get the Parent of a attribute that is associated with your Entity dimension. You probably will be doing something like this:

In the “FROM” we call the table 2 times and we join and filter everything we need. Since we don’t have attribute association in all levels we do a “Left Join” to make sure all Entities comes in the query. If we run a Explain Plan now we’ll get something like this:

As you can see, Oracle is querying the METADATA_EXTRACT table twice and each time it’s doing a FULL in one Partition (‘ENTITY’ and ‘PHYSICAL_GEOGRAPHY’ partitions).

Now, if we change the query (and we can do it in different ways, this is just one of them) to a “WITH” clause we ‘ll have something like this:

As you can see, we achieved the same results with the code a little bit different. Now I have all my filters in the “WITH” query and in the bottom I just call the “WITH” query 2 times and do what needs to be done.

If we run a Explain Plain now we will have:

As you can see, Oracle now is querying the METADATA_EXTRACT table just once and then his queries the SYS.SYS TEMP table twice. The only problem with this query and the way I did is that since we are creating a temporary table filtering 2 partitions and then later I’m filtering again, it’s basically doing 2 FULL scan in the same TEMP table, and even so, it’s a few seconds faster then the original query.

But this is just an example on how we can reduce the amount of times that Oracle needs to query a table. WITH is not a miracle clause or anything like that, is just another tool that we can use to optimize our code, and its performance needs to be evaluated in a case-by-case basis.

And even if the performance doesn’t change, I believe using “WITH” clause makes any query easier to ready, to test, to update and to right since you can divide your huge query in small bits and then join
everything in the bottom query.

“WITH” is a huge subject and we’ll be talking more about it in the next post, and this time we’ll be improving performance as well using “WITH” with “CONNECT BY”.