Archive for April, 2023

ODI in the hybrid database world – Oracle ADW – OCI/CLI + DBMS_CLOUD

Posted in Autonomous Database, Cloud, ODI on April 14, 2023 by Rodrigo Radtke de Souza

Written on April 14, 2023 by Rodrigo Radtke de Souza

Hi! In all the previous posts about ODI in the hybrid database world, I always showed two versions of the data load for each technology, one using JDBC and the other using text files. In the case of Oracle ADW, I posted just the JDBC example, so let me correct that today. Let's see how you can also use text files to load data into Oracle ADW tables.

The process is very similar to what we have been doing for some time now. We will duplicate “IKM SQL to File Append” and add some extra steps at the end of it to push the file to the cloud and load the target table. In the ADW case, we will need a couple of things to make it work:

  • An Oracle Bucket, which is a container for storing objects in a compartment within an Object Storage namespace in the Oracle cloud. More information about it here.
  • Install OCI CLI, which ODI will call to push the file to the Oracle Bucket. The steps on how to install it can be found here (a quick sanity-check sketch follows this list).
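
If you want to confirm that OCI CLI is installed and can reach your tenancy before wiring it into the KM, a quick check from the ODI agent machine could look like this (just a sketch; the bucket name my_odi_bucket is an illustration):

# show the installed CLI version
oci --version
# return your tenancy's Object Storage namespace
oci os ns get
# list the objects in the bucket that will receive the files
oci os object list --bucket-name my_odi_bucket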

Those two you will need to create and install before proceeding. There is a third component needed, but that one already exists in Oracle ADW: the DBMS_CLOUD package. You may read about it here.

This package will be used to push the data from the Oracle Bucket into the Oracle ADW table. There is one configuration step that you may need to do regarding DBMS_CLOUD credentials, but I'll talk about that later.

This is what the final IKM will look like:

The first five steps are from the original KM. The last three are brand new. Let's take a look at GZIP first:

Here we are just zipping the text file before sending it to the cloud. Notice that Oracle accepts other kinds of compression as well, so you may change it if you want to. I used GZIP because it seems to be a standard that works with all public cloud vendors out there.
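
As a reference, the GZIP step is nothing more than an OS command. A minimal sketch of it as an OdiOSCommand task could look like the lines below (the log and file paths are placeholders; in the real KM they would come from the target file datastore and KM options):

OdiOSCommand "-OUT_FILE=/tmp/odi_adw_gzip.log" "-ERR_FILE=/tmp/odi_adw_gzip.err"
gzip -f /tmp/MY_TARGET_TABLE.txt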

Next one is OCI PUT:

A very straightforward step that takes the GZIP file and uses OCI CLI to PUT it in an Oracle Bucket that belongs to an Oracle Object Storage namespace. You may refer to the OCI PUT documentation here.
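
Again just as a sketch (bucket and file names are placeholders, and I am assuming the OCI CLI profile was already configured during the install), the OCI PUT step could be something like:

OdiOSCommand "-OUT_FILE=/tmp/odi_adw_put.log" "-ERR_FILE=/tmp/odi_adw_put.err"
oci os object put --bucket-name my_odi_bucket --file /tmp/MY_TARGET_TABLE.txt.gz --name MY_TARGET_TABLE.txt.gz --force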

The last one is the DBMS_CLOUD.COPY_DATA step:

COPY_DATA may have other options. A full list can be found here.

All options are self-explanatory. In this case we are loading a text file with columns separated by pipes (“|”). One option that you need to pay attention to is file_uri_list: it indicates the URL of the text file that resides in your Oracle Bucket. It may look like this:

https://swiftobjectstorage.region.oraclecloud.com/v1/namespace-string/bucket/filename

or like this:

https://objectstorage.region.oraclecloud.com/n/namespace-string/b/bucket/o/filename

To make things simpler in this example, I set the bucket as public, so I could access it without credentials. However, in a production environment, that won’t be the case and you will need to create a credential using DBMS_CLOUD.create_credential. You will run this command only once in your Oracle ADW instance:

begin
  DBMS_CLOUD.create_credential (
    credential_name => 'OBJ_STORE_CRED',
    username        => 'XXXXX',
    password        => 'YYYYY'
  );
end;

You can read more about how to create the credential here. After you create the credential, you will use it in the COPY_DATA command as a parameter (credential_name).
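
Putting it all together, the COPY_DATA step could run a block like the one below. This is only a sketch: the table name, credential, file URL and format options are illustrative and, in the KM, would come from ODI substitution methods and KM options.

begin
  -- load the gzipped, pipe-delimited file from the Oracle Bucket into the target table
  DBMS_CLOUD.copy_data (
    table_name      => 'MY_TARGET_TABLE',
    credential_name => 'OBJ_STORE_CRED',
    file_uri_list   => 'https://objectstorage.region.oraclecloud.com/n/namespace-string/b/bucket/o/MY_TARGET_TABLE.txt.gz',
    format          => json_object('delimiter' value '|', 'compression' value 'gzip')
  );
end;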

When we run the mapping, we get something like this:

Very similar to the other posts: we are generating a text file, GZIPing it, calling OCI CLI to send the file to the cloud, and then calling the DBMS_CLOUD.COPY_DATA procedure to load the data from the Oracle Bucket into our target table.

I hope you liked it! See you next time!

Snowflake loader using ODI

Posted in ODI, Snowflake on April 13, 2023 by Rodrigo Radtke de Souza

Written on April 13, 2023 by Rodrigo Radtke de Souza

Hi all. Today's post will talk about how to create a simple but powerful Snowflake loader using ODI. Actually, you may use it with any technology that you want, but I ended up creating this one for Snowflake because I needed to load several Oracle tables into Snowflake in a fast and dynamic manner.

This post combines the techniques described here and here. In the first post we saw how to load text files into Snowflake. In the second post we saw how to create dynamic text dumps from Oracle. What if we combine both posts? We could create a dynamic loader that receives a table name as a parameter, creates a text file and pushes this text file to Snowflake. Isn't that cool?

Our dynamic loader will be an ODI Procedure and it will have the following steps:

You can get the code for “Get File Metadata”, “Create File”, “Truncate File”, “Create Header” and “Insert Rows” from my past post. There is a new step here called “Create target table” that will try to create the table in Snowflake if it does not exist yet. I added this step just as a convenient way to speed up my development process, since this procedure is nothing more than a job that gets some table from Oracle and copies it to Snowflake. The code is below:

CREATE TABLE <%=odiRef.getSchemaName( "D" )%>.<@=table_name@>
(
<@=table_column_type@>
);

Just don't forget to set the target schema to the Snowflake logical schema where you want the table to be created, and also to set “Ignore Errors” on this step, so the process does not stop if the table already exists.

The last three steps are almost the same as in the first post, with one change: instead of <%=odiRef.getTargetTable("TABLE_NAME")%>, now I'm getting the information from <@=folder_path@> and <@=table_name@>. E.g.:

OdiOSCommand "-OUT_FILE=<@=folder_path@>_copy.log" "-ERR_FILE=<@=folder_path@>_copy.err"
snowsql -c #P_CONNECTION_NAME -w #P_SNOW_WAREHOUSE -r #P_SNOW_ROLE -d #P_SNOW_DATABASE -s #P_SNOW_SCHEMA -q "copy into #P_SNOW_DATABASE.#P_SNOW_SCHEMA.<@=table_name@> from  @#P_STAGE_NAME/<@=table_name@>.gz"
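
For completeness, the PUT step that runs right before it follows the same pattern. A sketch of it is below; I am assuming here that <@=folder_path@> resolves to the same path prefix used for the dump file, so adjust the file reference to match how your “Get File Metadata” step builds it. auto_compress is set to false because the file is already gzipped.

OdiOSCommand "-OUT_FILE=<@=folder_path@>_put.log" "-ERR_FILE=<@=folder_path@>_put.err"
snowsql -c #P_CONNECTION_NAME -w #P_SNOW_WAREHOUSE -r #P_SNOW_ROLE -d #P_SNOW_DATABASE -s #P_SNOW_SCHEMA -q "put file://<@=folder_path@>.gz @#P_STAGE_NAME auto_compress=false"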

When we execute the package, we have the following:

The job receives a table name as a parameter. It gets the metadata for that table in Oracle, creates a file with its data, gzips it, sends it to Snowflake using PUT and then runs COPY into the final table (which was created for you if it didn't exist before). Awesome, right? Notice that this process will duplicate the data if you run it twice, since it does not truncate the target table before loading the data into it. I did that on purpose because there were scenarios where I wanted to load several different datasets into a single table, in append mode. But you may enhance this procedure and add a step to truncate the target table if you wish; a quick sketch of that is below.
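
If you do want a truncate-and-load behavior, the extra step could be as simple as the statement below, executed against the same Snowflake logical schema (a sketch that follows the same substitution pattern used in “Create target table”):

TRUNCATE TABLE IF EXISTS <%=odiRef.getSchemaName( "D" )%>.<@=table_name@>;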

Hope you enjoyed! See you next time!