Hey guys, how are you? Continuing our SQL series (S01EP13), today I’ll share a very handy little query that I use very often to check for data duplication. In fact, it’s an upgraded version of ODI’s PK check.
It’s an upgrade because in ODI, if you enable the PK check and it finds a duplication, it eliminates both rows. With the code I’ll show you, you choose whether to keep the most recently created duplicate or the oldest one, and only the extra rows are eliminated.
I have a test table with these values:
If I want to check for duplicate PKs, I can just run this query:
The idea here is that we have 2 queries. The outer one checks whether each row’s ROWID is bigger or smaller (your choice) than the MIN or MAX ROWID (depending on that choice) returned by the subquery, which is joined on whatever columns you want to check.
In this case, we only wanted to check whether the PK column had duplicated values, but we could check any other column by just replacing it in the join. In fact, we could have any number of columns in the join, and that would check for duplications across all the columns you put there.
So you can keep the first row by using > and MIN, or the last one by using < and MAX, and you can choose which columns to check in the where clause.
One important thing to mention is that this query is meant to work as a delete, because it keeps whatever is not in the select. What I mean is, if you have more than one duplication, it’ll bring back, in this case, all the rows that have a ROWID greater than the MIN ROWID for that key:
So if I have multiple duplications, the query returns everything that needs to be deleted, and the only row remaining is the first one inserted (3, Chuck, Giampaoli).
I hope you enjoy this little trick. See you soon.
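If you want to play with the idea without an Oracle database at hand, here is a minimal sketch in Python using SQLite, which also exposes a rowid per row. The table name `test_dup`, the column names and every row except (3, Chuck, Giampaoli) are made up for illustration, and SQLite's rowid is only a stand-in for Oracle's ROWID.

```python
import sqlite3

# Hypothetical test data; only (3, Chuck, Giampaoli) echoes the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_dup (id INTEGER, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO test_dup VALUES (?, ?, ?)",
    [
        (1, "Ana", "Silva"),
        (2, "Bob", "Jones"),
        (3, "Chuck", "Giampaoli"),  # first row inserted for id 3: the one we keep
        (3, "Chuck", "Norris"),
        (3, "Chuck", "Berry"),
    ],
)

# Rows whose rowid is greater than the smallest rowid with the same id.
# Swap ">" and MIN for "<" and MAX to keep the newest row instead.
DUPLICATES_SQL = """
    SELECT a.id, a.first_name, a.last_name
      FROM test_dup a
     WHERE a.rowid > (SELECT MIN(b.rowid)
                        FROM test_dup b
                       WHERE b.id = a.id)
     ORDER BY a.rowid
"""
to_delete = conn.execute(DUPLICATES_SQL).fetchall()

# The same predicate drives the delete, leaving one row per id.
conn.execute("""
    DELETE FROM test_dup
     WHERE rowid > (SELECT MIN(b.rowid)
                      FROM test_dup b
                     WHERE b.id = test_dup.id)
""")
remaining = conn.execute("SELECT * FROM test_dup ORDER BY id").fetchall()

print(to_delete)
print(remaining)
```

To check more columns, add them to the `b.id = a.id` condition in the subquery.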
Hey guys, how are you doing? Today I’ll continue with the Oracle Always Free cloud offering and we’ll finally start provisioning a VM in our environment. If you want to know more about how it works (Part 1) or see the overview of the Dashboard (Part 2), please check my previous posts.
The first thing we need to do is check the best practices and see if everything in our environment is adequate.
IP Addresses Reserved for Use by Oracle:
Certain IP addresses are reserved for Oracle Cloud Infrastructure use and may not be used in your address numbering scheme (169.254.0.0/16).
These addresses are used for iSCSI connections to the boot and block volumes, instance metadata, and other services.
Three IP Addresses in Each Subnet
The first IP address in the CIDR (the network address)
The last IP address in the CIDR (the broadcast address)
The first host address in the CIDR (the subnet default gateway address)
For example, in a subnet with CIDR 192.168.0.0/24, these addresses are reserved:
192.168.0.0 (the network address)
192.168.0.255 (the broadcast address)
192.168.0.1 (the subnet default gateway address)
The remaining addresses in the CIDR (192.168.0.2 to 192.168.0.254) are available for use.
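The three reserved addresses are easy to work out for any CIDR. Here is a small sketch using Python's standard `ipaddress` module; the function names are mine, not part of any Oracle tool.

```python
import ipaddress

# Range reserved for Oracle Cloud Infrastructure use, as stated above.
ORACLE_RESERVED = ipaddress.ip_network("169.254.0.0/16")


def reserved_addresses(cidr):
    """Return the three addresses reserved in a subnet."""
    net = ipaddress.ip_network(cidr)
    return {
        "network": net.network_address,
        "broadcast": net.broadcast_address,
        "gateway": net.network_address + 1,  # first host address
    }


def usable_range(cidr):
    """Return the first and last addresses left for your own use."""
    net = ipaddress.ip_network(cidr)
    return net.network_address + 2, net.broadcast_address - 1


def clashes_with_oracle_range(cidr):
    """True if the subnet overlaps 169.254.0.0/16."""
    return ipaddress.ip_network(cidr).overlaps(ORACLE_RESERVED)


reserved = reserved_addresses("192.168.0.0/24")
first, last = usable_range("192.168.0.0/24")
print(reserved, first, last)
```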
Essential Firewall Rules
All Oracle-provided images include rules that allow only “root” on Linux instances or “Administrators” on Windows Server instances to make outgoing connections to the iSCSI network endpoints (169.254.0.2:3260, 169.254.2.0/24:3260) that serve the instance’s boot and block volumes.
Oracle recommends that you do not reconfigure the firewall on your instance to remove these rules. Removing these rules allows non-root users or non-administrators to access the instance’s boot disk volume.
Oracle recommends that you do not create custom images without these rules unless you understand the security risks.
Running Uncomplicated Firewall (UFW) on Ubuntu images might cause issues with these rules. Because of this, Oracle recommends that you do not enable UFW on your instances.
System Resilience
Oracle Cloud Infrastructure runs on Oracle’s high-quality Sun servers. However, any hardware can experience a failure:
Design your system with redundant compute nodes in different availability domains to support failover capability.
Create a custom image of your system drive each time you change the image.
Back up your data drives, or sync to spare drives, regularly.
If you experience a hardware failure and have followed these practices, you can terminate the failed instance, launch your custom image to create a new instance, and then apply the backup data.
Uninterrupted Access to the Instance
Make sure to keep the DHCP client running so you can always access the instance. If you stop the DHCP client manually or disable NetworkManager (which stops the DHCP client on Linux instances), the instance can’t renew its DHCP lease and will become inaccessible when the lease expires (typically within 24 hours). Do not disable NetworkManager unless you use another method to ensure renewal of the lease.
Stopping the DHCP client might remove the host route table when the lease expires. Also, loss of network connectivity to your iSCSI connections might result in loss of the boot drive.
User Access
If you created your instance using an Oracle-provided Linux image, you can use SSH to access your instance from a remote host as the opc user. After logging in, you can add users on your instance.
If you created your instance using an Oracle-provided Windows image, you can access your instance using a Remote Desktop client as the opc user. After logging in, you can add users on your instance.
NTP Service
Oracle Cloud Infrastructure offers a fully managed, secure, and highly available NTP service that you can use to set the date and time of your Compute and Database instances from within your virtual cloud network (VCN).
We recommend that you configure your instances to use the Oracle Cloud Infrastructure NTP service.
Fault Domains
A fault domain is a grouping of hardware and infrastructure that is distinct from other fault domains in the same availability domain. Each availability domain has three fault domains. By properly leveraging fault domains you can increase the availability of applications running on Oracle Cloud Infrastructure.
Your application’s architecture will determine whether you should separate or group instances using fault domains.
Customer-Managed Virtual Machine (VM) Maintenance
When an underlying infrastructure component needs to undergo maintenance, you are notified before the impact to your VM instances. You can control how and when your applications experience maintenance downtime by proactively rebooting (or stopping and starting) your instances at any time before the scheduled maintenance event.
A maintenance reboot is different from a normal reboot. When you reboot an instance for maintenance, the instance is stopped on the physical VM host that needs maintenance, and then restarted on a healthy VM host.
If you choose not to reboot before the scheduled time, then Oracle Cloud Infrastructure will reboot and migrate your instances before proceeding with the planned infrastructure maintenance.
When you work with Oracle Cloud Infrastructure, one of the first steps is to set up a virtual cloud network (VCN) for your cloud resources. I was thinking of doing a more detailed explanation here, but this topic is very big, so I decided to do a simple step-by-step on how to set up your network so you can access your resources from your computer.
This is not the best way to create a complex network or anything like that; it’s just a way to quickly start using your Always Free components and test your VM and DB.
To start, we’ll click on the “Setup a network with wizard” quick link:
After you click there you have 2 options:
Select VCN with Internet Connectivity, and then click Start VCN Wizard. On the next page, just enter the name of your VCN and leave everything else as it is (unless you have a reason to change it). Click Next.
The next page shows everything that will be created by the wizard. Note that you can create each piece manually, but for simplicity the wizard should be enough. Click Create.
The next screen shows the creation of what was requested:
And that’s it for the network. Now we can start to create our databases and VMs inside our network, and they are all going to “see” each other.
Again, this is a very simple way to set up your network. Every single step above can be set up individually with greater complexity, but I’m fairly sure most of that can’t be done in Always Free, since a lot of the more complex stuff needs to be paid for.
You can get a lot more information in the Jumpstart your Cloud Skills section of Start Exploring. There are a lot of videos there explaining a lot of things. For simplicity, I’ll post here all the links available there, for people who want to see the videos before they subscribe to OCI.
The next thing we can do is create a load balancer. To do that, we just click Create Load Balancer in the Quick Actions and then fill in the new page like this:
The most important thing here is to make sure you select Micro in the Bandwidth selection. This one is free (you can also see the Always Free Eligible logo there). Click Next after this.
On the next page we need to choose the load balancing policy, and which one you select depends on your application. We have 3 options:
Weighted Round Robin: distributes the load sequentially across the servers (one request each, in turn)
IP Hash: guarantees that requests from one specific client always go to the same server
Least Connections: always selects the server with the fewest connections
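To make the difference between the three policies concrete, here is a toy sketch in Python. It is not how OCI implements them: the backend names are hypothetical, the round robin treats all weights as equal, and the hash function is just one I picked.

```python
import hashlib
from itertools import cycle

SERVERS = ["vm-a", "vm-b", "vm-c"]  # hypothetical backend names


def make_round_robin(servers):
    """Round robin with equal weights: hand out servers one after another."""
    ring = cycle(servers)
    return lambda: next(ring)


def ip_hash(client_ip, servers):
    """IP hash: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def least_connections(active):
    """Least connections: pick the server with the fewest open connections."""
    return min(active, key=active.get)


pick = make_round_robin(SERVERS)
print([pick() for _ in range(4)])
print(ip_hash("10.0.0.7", SERVERS))
print(least_connections({"vm-a": 5, "vm-b": 2, "vm-c": 9}))
```

IP hash fits applications that keep session state on the server, while least connections helps when requests vary a lot in duration.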
Next you need to add back ends. We don’t have any created yet, but we can add them later. Finally, we can change the Health Check policy, but for what we are doing we can just leave it as it is. Click Next. On this screen we have to create a listener:
Here we have 3 options for the traffic listener: HTTPS, HTTP and TCP. I’m going to select TCP without SSL for simplicity; if you select HTTPS you’ll need SSL certificate files and private keys. It’s safer, but if you just want to play around it’s better to select HTTP or TCP.
For TCP we just have these options:
If you select USE SSL you also need to provide the digital certificate and private keys.
After you select yours, just finish the process. You’ll be taken to the Load Balancer monitoring page, where you’ll see something like this:
And that’s it for the network. Next time we’ll provision a VM and set up our machine to connect to it.
Hey guys, how are you? Today I’ll continue to talk about the Oracle Always Free cloud offering and I’ll try to summarize what you can do after your account is set up. If you want to know how to set up your account you can find it HERE.
After you receive an email saying everything is set, you can log in to your account and you’ll see a screen like this:
This is the main dashboard. Here’s where you’ll create your databases and VMs, convert your account to paid, manage your account, ask for help, etc. Let’s start with the main dashboard:
(1) Quick Actions: Here you’ll find the most important links as quick actions.
(2) Compute: This is where you can create a VM to be used with your databases. You can use it to install tools and develop whatever you want inside your environment.
(3) Networking: Here’s where you set up your cloud network. This is the first step you must do to ensure your VM and databases are in the same network and can reach each other.
(4) Autonomous Transaction Processing: This is where you create a transaction database.
(5) Autonomous Data Warehouse: This is where you can create your Data Warehouse database.
(6) Search: A quick way to view all your resources.
(7) Account Center: Here’s a quick place to manage your account and see how many credits you have and your billing information.
(8) Main Menu: This is the main menu where you have access to everything that you can do inside your Cloud.
(9) Top Bar: Where you can change regions (in case you have more than one), access the Cloud Shell (for OS commands), see the help, ask for help in the chat, change language and see your profile.
(10) Start Exploring: Here’s a place where you can find articles to help you start setting up your environment.
(11) What’s New: And finally, here’s where you can see news about Oracle Cloud, like releases and things that will be added.
One important thing to add here is that before you add or create anything, look for the “Always Free Eligible” logo or description to be sure you’re not buying anything by mistake. Now about the main menu:
Core Infrastructure: Here’s where you can set up your VMs, networks and storage options.
Database: Here’s where you can set up your database options, backups and servers (VM or bare metal).
Data and AI: Here’s where you can set up your Big Data and AI environment.
Solution and Platform: Here’s where you can set up your Analytics cloud services, integrations, monitoring and marketplace.
More Oracle Cloud Services: Here’s where you have other cloud services.
Governance and Administration: And here is where you can administer your environment, like provisioning security, Account Management, Identity and Governance.
As you can see there’s a lot that can be done, but we’ll concentrate on the “Always Free” content. The following list summarizes the Oracle Cloud Always Free-eligible resources that you can provision in your tenancy:
Compute (up to two instances)
Autonomous Database (up to two database instances)
Load Balancing (one load balancer)
Block Volume (up to 100 GB total storage)
Object Storage (up to 20 GiB)
Vault (up to 20 keys and up to 150 secrets)
In the next post we’ll set up our environment. See you soon, guys.
I decided to do some posts about the Oracle Always Free offering: how it works, how you set things up, a few things we can do with it, and maybe more. I think it’s fair for us to start with what it is and what you need to do to get one.
Basically, Always Free is a service for anyone who wants to try the world’s first self-driving database and Oracle Cloud Infrastructure for an unlimited time. The idea is to let people explore the full functionality of Oracle Autonomous Database and Oracle Cloud Infrastructure, including Compute VMs, Block and Object Storage, and Load Balancer, all of the essentials for developers to build complete applications on Oracle Cloud.
Oracle’s Free Tier program has two components:
Always Free services, which provide access to Oracle Cloud services for an unlimited time
Free Trial, which provides $300 in credits for 30 days to try additional services and larger shapes
The new Always Free program includes the essentials users need to build and test applications in the cloud: Oracle Autonomous Database, Compute VMs, Block Volumes, Object and Archive Storage, and Load Balancer. Specifications include:
2 Autonomous Databases (Autonomous Data Warehouse or Autonomous Transaction Processing), each with 1 OCPU and 20 GB storage
2 Compute VMs, each with 1/8 OCPU and 1 GB memory
2 Block Volumes, 100 GB total, with up to 5 free backups
10 GB Object Storage, 10 GB Archive Storage, and 50,000/month API requests
1 Load Balancer, 10 Mbps bandwidth
10 TB/month Outbound Data Transfer
500 million ingestion Datapoints and 1 billion Datapoints for Monitoring Service
1 million Notification delivery options per month and 1000 emails per month
Well, if you ask me, this is far better than installing Oracle XE on your machine and configuring everything there just to learn or to create some small app. In fact, if you want to learn, it’ll be far better to start learning in a cloud environment, since every day more and more companies are migrating to the cloud.
OK, but what do you need to do to get one? It’s very easy: you just need to access this link and click the Start for Free button. After that you have to fill in a short form where you need to provide:
Your email and user information, like address and cellphone
You need to validate your cellphone by text message (Oracle will send a code to your cell)
You need to choose the region where you’re going to have your OCI (Oracle Cloud Infrastructure)
This needs to be as close as possible to your real region to decrease latency and improve network performance
Some regions are not available for Always Free (it’s written next to the region name whether it’s available or not)
And you need to add a credit card to your account
You’ll not be charged, but you may see 1 Dollar/Euro/… being charged; it’ll be returned
Also, Revolut cards don’t work; you need a proper credit card.
And that’s it, Oracle will create your account (it takes around 15 minutes until you receive an email with further instructions; bear in mind that because of COVID-19, it’s taking several days to create a new account). After you receive your email, you can log in to your dashboard and start to create your network, disks, databases, VMs and more.
We’ll see how to configure a database in my next post. I hope you enjoy this and see you soon.
Hey guys, how are you? I hope you are not going insane after these 2 months of quarantine. Anyway, it’s time for us to finish the send email job. In the previous post HERE I explained the Jython code and the HTML code that we need to create the HTML table in our email. Today we’re going to make it dynamic.
As we saw, for every row we want we need a block of HTML code that draws the table, colors it and writes the content of each cell. We need this to change dynamically if we want it to be useful, and to do that we need to write SQL that generates this HTML code for us.
In my case I generated this code to be my header:
This is saying that I’ll have 16 columns (COLSPAN="16") with center alignment, and the name of the table will be “Restatements Control – Group ID: 1” (where the 1 will be dynamically generated as well).
Now we first need to write a query to get this info for us. Since this is a very project-specific query, I don’t think it would do you any good for me to put it here, but I’ll explain what I was looking for. First I query ALL_TAB_PARTITIONS to get all partitions related to that table. Then I query a control table: every time the jobs run, they insert into this table the period loaded, whether there were errors or not, the log folder path and the interface that ran the job.
After that I do a FULL OUTER JOIN between these 2 tables to see all the partitions I have and how many of them were already executed. Next I PIVOT the information to get table-like data, and the result is similar to this:
I created some simple status codes to make it easy to manipulate later. NP is “No Partition”, N is “Never Run”, Y is “Warning”, R is “Error” and G is “Success”. Also, when it’s Y or R I have the log path associated with that run, so the users can click and go to the log folder of that execution.
In my case this is important because this is for a restatement process where the business wants to restate the entire past, we have millions of rows per partition, and they want the flexibility to run as they see fit. So we need to track the executions over time.
Now, the only thing left is to convert this information into HTML code. This is easy since we just need to concatenate strings all over the place. Let’s see how I have done it:
The result is one big string for each row the query returns. Each column is concatenated with a line break between them, so when this code is used we have proper indentation for readability. This is the query I used to concatenate everything:
Basically it’s a lot of DECODEs to convert my STATUS codes into colors and some REGEXP to split the STATUS from the log path. That’s it for SQL. Now the only problem is that this is a very big string, and the way to hold it is to use PL/SQL, because inside PL/SQL a VARCHAR2 variable (32767 bytes) is bigger than inside SQL (4000 bytes).
We just need to create a simple PL/SQL block to concatenate all these rows into a CLOB, which is a little bit bigger (4 GB), and insert it. To do that we just need to do something like this:
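DECLARE
    -- Cursor over the query that builds one HTML fragment (column SCRIPT) per row
    CURSOR C_HTML_TAG IS
        SQL HERE;
    V_HTML_BODY CLOB;
BEGIN
    -- Append every fragment to a single CLOB
    FOR DADOS IN C_HTML_TAG LOOP
        V_HTML_BODY := V_HTML_BODY || TO_CLOB(DADOS.SCRIPT);
    END LOOP;
    -- Store the assembled HTML body so ODI can read it later
    INSERT INTO FDM_ODI_RUN.TMP_HTML_BODY_DW (HTML_BODY) VALUES (V_HTML_BODY);
END;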
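Since my real query is too project-specific to share, here is a rough sketch in Python of the two steps described above: the DECODE-style conversion of status codes into colored cells, and the loop that appends every row into one big body. The colors for NP and N, the "|" separator between status and log path, and the sample periods and paths are all assumptions made up for the example.

```python
from html import escape

# Color names are assumptions; the post only fixes what each code means.
STATUS_COLORS = {
    "NP": "gray",   # No Partition
    "N": "white",   # Never Run
    "Y": "yellow",  # Warning
    "R": "red",     # Error
    "G": "green",   # Success
}


def split_status(value, sep="|"):
    """Split 'R|/some/log/path' into ('R', '/some/log/path')."""
    status, _, log_path = value.partition(sep)
    return status, (log_path or None)


def cell(value):
    """Build one colored <td>, linking to the log folder when there is one."""
    status, log_path = split_status(value)
    text = escape(status)
    if log_path:
        text = '<a href="{}">{}</a>'.format(escape(log_path, quote=True), text)
    return '<td bgcolor="{}" align="center">{}</td>'.format(STATUS_COLORS[status], text)


def row(label, values):
    """One <tr> per period, with a line break between cells for readability."""
    cells = "\n".join(cell(v) for v in values)
    return "<tr>\n<td>{}</td>\n{}\n</tr>".format(escape(label), cells)


def body(rows):
    """Same job as the PL/SQL loop: append each fragment to one big string."""
    html = ""
    for label, values in rows:
        html += row(label, values) + "\n"
    return html


sample = [
    ("FY20_JAN", ["G", "Y|/logs/fy20_jan"]),
    ("FY20_FEB", ["R|/logs/fy20_feb", "NP"]),
]
html_body = body(sample)
print(html_body)
```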
That’s it. Now for the easy part: using it in ODI. To do so we’ll have a command on the SOURCE querying the TMP_HTML_BODY table, and then we pass the #SCRIPT info to our Jython target code:
Just a quick one today: Oracle is offering free access to online learning content and certifications for Oracle Cloud Infrastructure and Oracle Autonomous Database to a broad array of users, available until May 15, 2020.
This is a great opportunity and if you want to learn more, you can find it here.
Hey guys, how are you? Today let’s look at the opposite of the S01EP12 situation. In fact, we’ll use the same example again to show how we can convert a string into a list of values in an easy and dynamic way, starting with this query:
I’ll turn this query into a WITH clause and use REGEXP to put it back into a list of values. This is very useful when we extract metadata from Essbase, for example, because Essbase exports the UDAs as a list of values. Of course this has many other uses, but let’s keep this one in mind.
Now what we need to do is split the strings, by comma in this case. The idea is to count the number of commas in each row and split the string into that many pieces (commas plus one).
The idea here is to use REGEXP_COUNT to count how many words we have between the commas and then use that to multiply the rows in the CONNECT BY LEVEL. For example, if we have 3 words, the CONNECT BY will create 3 copies of the same row: one with LEVEL = 1, another with LEVEL = 2 and the last one with LEVEL = 3.
With that we just need to use REGEXP_SUBSTR to extract the words based on the LEVEL: REGEXP_SUBSTR(STR, '[^,]+', 1, LEVEL), where LEVEL will be 1 for the first row, 2 for the second and 3 for the third.
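If the CONNECT BY LEVEL trick feels abstract, the sketch below walks through the same logic step by step in Python. It is only an illustration of the reasoning, not Oracle code, and the member and UDA names are invented.

```python
import re


def split_to_rows(key, csv_value):
    """Turn one ('key', 'a,b,c') row into one row per item."""
    # REGEXP_COUNT(str, ',') + 1: how many items the string holds
    n_items = len(re.findall(",", csv_value)) + 1
    # Every run of non-comma characters, in order of appearance
    tokens = re.findall(r"[^,]+", csv_value)
    rows = []
    for level in range(1, n_items + 1):  # CONNECT BY LEVEL <= n_items
        # REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL): the LEVEL-th item, if any
        token = tokens[level - 1] if level <= len(tokens) else None
        rows.append((key, level, token))
    return rows


for r in split_to_rows("Product1", "UDA_A,UDA_B,UDA_C"):
    print(r)
```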
Hey guys, how are you keeping? I hope everybody is healthy and stays that way in these difficult times.
And to make our life less complicated, here’s another tip. Let’s talk about how to concatenate stuff in Oracle.
Imagine a simple case: we want to query the Planning repository to get the list of UDAs a member has. We can easily do that by querying the HSP_OBJECT, HSP_MEMBER_TO_UDA and HSP_UDA tables.
I’m filtering just 3 products to make it easier for us to see. The results show that each product has a different number of UDAs, and we never know how many there will be, so the easiest way to concatenate them is to use the command LISTAGG (or WM_CONCAT if you are on a DB version prior to 11.2).
The command is very simple: LISTAGG(column, separator) WITHIN GROUP (ORDER BY column). As we can see, the command allows us to select the separator we want (a comma or any string, really) as well as to order the results by another column. Let’s take a look at the example above.
As you can see, it easily creates a comma-separated list (as specified) for me, and the nice thing about it is that I don’t need any special string handling if it returns null, or if I have just one string in it, and things like that.
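To show what LISTAGG is doing for us, here is a small sketch of the same grouping logic in plain Python. It is an approximation for illustration only: it orders by the aggregated value itself, and the product and UDA names are invented.

```python
from collections import defaultdict


def listagg(rows, separator=","):
    """Group (key, value) pairs and join each group's values in sorted order."""
    groups = defaultdict(list)
    for key, value in rows:
        group = groups[key]   # the key shows up even if all its values are null
        if value is not None:  # nulls are skipped, as LISTAGG does
            group.append(value)
    # An empty group yields None, mirroring the null result in Oracle
    return {key: (separator.join(sorted(values)) or None) for key, values in groups.items()}


rows = [
    ("Product1", "UDA_B"),
    ("Product1", "UDA_A"),
    ("Product2", "UDA_C"),
    ("Product3", None),
]
print(listagg(rows))
```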
This is an extremely good function and we use it heavily in ODI to generate dynamic code because of its simplicity. For example, we can generate a SQL statement on the fly using the command on source and command on target:
With these results we can easily pass this info to the command on target to generate a dynamic query, where ODI will replace the columns we got in the target as well as the table name, and will also loop for each row we have in the source. This is very handy.
And for those who are not on Oracle 11.2 or later, we can still do that using WM_CONCAT. It’s not as powerful as LISTAGG, but it works pretty well. Let’s try the first example again:
I cannot show you the results since WM_CONCAT was removed in 12c (my version), but it works like this. We don’t have the option to choose the separator, and to make the values unique and ordered we need to add DISTINCT in the command: WM_CONCAT(DISTINCT column).
Today I’ll post something that is very simple but very useful, especially when working with ODI.
When we work with a partitioned table, we know that if we filter the table by the partitioning column, Oracle will use that partition as the source of data. But what if we are doing an Insert, Update or Merge?
There’s another way to explicitly refer to a partition and make sure Oracle works inside that one: defining it in the FROM clause.
For example, if I want to query the partition “DELL_BALANCES_FY20_FEB” I can query:
As we can see, after the table name I specified PARTITION (DELL_BALANCES_FY20_FEB), with the partition name inside the parentheses (don’t quote it as a string). That makes Oracle read only the rows in that partition, and my DISTINCT of the PARTITION_KEY shows only one result, as expected. (This clause needs to come before the table alias.)
If we are doing an Insert, Update or Merge the idea is the same:
This way we can, especially in the MERGE, make sure Oracle works in the right partition of the target table.
And it’s especially useful with ODI, because we always know the partition we want to query or insert into when we use ODI, so we can always bind Oracle to a specific partition and make sure it stays there.
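Because the partition name is usually known at run time, one way to build the clause safely in a Jython or Python step is a tiny helper like the sketch below. The helper and the table name BALANCES are hypothetical; only the partition name comes from the example above.

```python
import re

# Plain (unquoted) Oracle identifier: a letter, then letters, digits, _ $ #
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def with_partition(table, partition, alias=""):
    """Return 'TABLE PARTITION (NAME) ALIAS', with the clause before the alias."""
    if not all(_NAME.fullmatch(part) for part in table.split(".")):
        raise ValueError("not a plain table name: %r" % (table,))
    if not _NAME.fullmatch(partition):
        raise ValueError("not a plain partition name: %r" % (partition,))
    clause = "{} PARTITION ({})".format(table, partition)
    return "{} {}".format(clause, alias) if alias else clause


target = with_partition("BALANCES", "DELL_BALANCES_FY20_FEB", "B")
select_sql = "SELECT DISTINCT B.PARTITION_KEY FROM " + target
merge_head = "MERGE INTO " + with_partition("BALANCES", "DELL_BALANCES_FY20_FEB", "T")
print(select_sql)
print(merge_head)
```

Validating the names before concatenating them keeps a bad variable value from turning into broken or unsafe SQL.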
Today, a quick tip that I think is very useful. From time to time the business asks us to validate whether a table has data before we load it. It’s fair, especially if you use a truncate-and-insert approach.
The problem is that sometimes the table/view they are asking about has millions of rows, and there’s no other safe way to validate whether a table has data than querying it.
I just fixed a case where an interface had a validation that basically counted 3 different tables that together had 40 million rows per period. These validations were taking around 1000 seconds.
The data load that happens before that took 1200 seconds. So the validation process was taking almost as much time as the load process.
After some changes, the query now validates the 3 tables in 0.3 seconds. Way better than before. Basically I just used 3 things:
The hint /*+ FIRST_ROWS(1) */, which makes Oracle prepare the best plan to fetch just one row (since I used 1 as the parameter).
The filter ROWNUM = 1 to make sure Oracle returns just 1 row. If we don’t use that, the hint can make everything very slow, because Oracle will plan for just one row but, without the filter, bring back more (using the best plan for 1 row).
And UNION ALL instead of UNION, because there’s a huge difference between them. When you use UNION, Oracle compares the sets of data to make sure you only have unique rows afterwards. UNION ALL, on the other hand, just returns everything each set returns, without any extra processing. Since it skips the deduplication step, UNION ALL is generally faster than UNION.
In the end I have a query like this:
As you can see, the query is very simple, and for this example only one table name came back, so we know that table is not empty for that period. We can take other approaches, like summing them all together and validating that the result is 3, for example, or implement any other logic we need on top of this query.
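Here is a runnable sketch of the same shape of query, using SQLite from Python so you can try it anywhere. The table names, columns and data are invented, LIMIT 1 stands in for ROWNUM = 1, and SQLite has no equivalent of the FIRST_ROWS hint, so that part is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("sales", "costs", "headcount"):  # hypothetical table names
    conn.execute("CREATE TABLE {} (period TEXT, amount REAL)".format(name))
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("FY20_FEB", 10.0), ("FY20_FEB", 20.0)])
conn.executemany("INSERT INTO costs VALUES (?, ?)", [("FY20_JAN", 5.0)])

# Each branch stops at the first matching row; UNION ALL just stacks the
# branches without the deduplication a plain UNION would do.
CHECK_SQL = """
    SELECT * FROM (SELECT 'sales' AS table_name FROM sales WHERE period = :p LIMIT 1)
    UNION ALL
    SELECT * FROM (SELECT 'costs' FROM costs WHERE period = :p LIMIT 1)
    UNION ALL
    SELECT * FROM (SELECT 'headcount' FROM headcount WHERE period = :p LIMIT 1)
"""
non_empty = [r[0] for r in conn.execute(CHECK_SQL, {"p": "FY20_FEB"})]
all_loaded = len(non_empty) == 3  # the "result is 3" style of validation

print(non_empty, all_loaded)
```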
I hope this is helpful for you guys and see you in the next post.