Impact of Azure-Oracle OCI interconnect on a multi-cloud solution   Recently updated !

Introduction

Oracle and Microsoft announced in June 2019 a cloud interoperability partnership which enables workloads across Microsoft Azure and Oracle Cloud. By creating a first joint multi-cloud solution, the software giants can each continue to provide the best of their services. At the same time customers do not need to decide which vendor they opt-out when moving their on-premises constructs.

Being myself an Oracle Database Administrator, this article aims to check what is the impact of distributing resources in multiple clouds, with databases remaining on Oracle Cloud Infrastructure.

It is not aim of the article to discuss the costs of resources on any of the clouds.

Architecture

A quick overview of the architecture is shown below. On Microsoft side I built a VNet with two subnets. One VM with public IP was the bastion. On Oracle side, three subnets. One public subnet with bastion VM; one private with the application; another private with DBs. When creating the ExpressRoute, a new GatewaySubnet was automatically created on Azure.

Setup

Differences between Oracle Cloud and Azure

Azure portal is amazingly fast and comfortable to navigate. For instance, the last used elements are always shown and there is easy way to start and stop all instances in one go. There is no concept of private subnet in Azure. The public IP is an element per se that you can assign to a compute node at any moment. Spot VMs, which use unused resources at very discounted price, are excellent to perform tests. These resources can be stopped by Azure at any time with very short warning. The troubleshooting tools like ability to see “effective routes” or “effective policies” of a VM, help a lot.

On OCI, the VM Image available with Oracle Database comes directly with the latest RU and one database created. This is very useful for quick tests. The VCN wizard makes the basic creation of a virtual network very easy. Boot disks of a VM have a very good performance.

Interconnect

The lowest bandwidth on FastConnect configuration is 1Gbps. On Azure side, the ExpressRoute starts at 50 Mbps! It is quite incomprehensible this difference of standards when promoting a consolidated solution. The setup was initially done with this configuration. Later tests were also performed using a 100 Mbps ExpressRoute, but it did not change de results.

VMs shapes

On Azure, VMs of type D4as_v4 were used. On Oracle, shape VM.Standard2.2 was used. The table below compares the characteristics of these VMs.

Azure D4as_v4OCI VM.Standard2.2
CPU4 vCPUs2 oCPU
Memory16 GB30 GB
NIC2000 Mbps2 Gbps
OS Disk120 IOPS / 25 MB/s15000 IOPS / 150 MB/s
Extra disk (paid extra)5000 IOPS / 200 MB/s 

Oracle OS disk is 8x bigger (250 vs 30 GB) and much faster. As the OS disk at Azure was having impact on the results, I added there an extra 1TB disk, from where tests were run (and datafiles placed for DB tests).

VMs images

The most recent Azure VM image included Oracle Database software version 19.3. I did patch with 19.7 RU for the tests. Above that I manually create a database, using default values, except filesystem_options (setall) and processes (500). Memory was set to exact same values as on Oracle VMs (4.5G SGA + 1.7G PGA).

OCI image comes already with latest RU, a database created which seems parameter-wise optimized for the Oracle Cloud environment.

Tests

Ping tests were performed with 10 probe counts using the IP of destination VM. The results of several runs of tests were consistent. The values presented below are the ones having the smallest standard deviation.

Application-DB assessments were done using Swingbench 2.6 and its jdbc Order Entry test. This test is the one which does a greater number of calls through the network. To make sure the impact was not on CPU, disks or network bandwidth, the comparison is done between the results of tests with 4 and 8 clients. When using more clients, the waits started to be either on local disk or on the database side. Each of the tests run for couple of minutes and several tests were done on each of the configurations.

The first tests from Azure showed a very bad performance. This was seen to be related to the performance of the OS disk. I decided to add a second disk with higher performance and install Swingbench on the new partition.

The tests against Autonomous Database (ATP) were done using 1 and 2 oCPUs, but this did not change the results. The service used was <db>_tp.

Results

Only a summary of the main results is shown.

Ping tests

The latency between Oracle VMs is two times smaller than between Azure VMs.

Between the clouds, the latency is little big bigger than 2.5 milliseconds. This is 4 to 8 times more than intra-cloud pings.

minavgmaxmdev
Azure-Oracle2.392.512.610.09
Oracle-Azure2.532.652.750.1
Azure-Azure0.570.650.750.066
Oracle-Oracle0.260.290.340.02
Ping results (ms)

Swingbench tests

The latency plays a big role on lights Swingbench tests. We can see that also here, Inter-cloud architecture is at least 2.5 times slower than running on the same cloud. The results with Azure VMs are using the extra faster disk, except when mentioned.

4 threads8 threads
OCI VM – OCI VM440603
Azure VM – Azure VM (1)2438
Azure VM – Azure VM540949
OCI VM – OCI ATP263378
Azure VM – OCI ATP95191
transactions per second
Note: (1) using OS disks

Conclusions

It is with great satisfaction that I see Oracle and Microsoft cooperating.

There is clear a trade-off to consider when going for a multi-cloud solution. While the latency per se is very small, at around 2.5 ms, this value is 4 to 8 times larger than intra-cloud latency.

The swingbench tests also show the performance difference when we try to abstract from all other elements. But it is not possible to skip the fact that on each of the clouds VMs have different specifications as well as the resource management limits, mainly seen on the SSD performance, have an impact on the results.

I was surprised with the results of the ATP tests, where I expected better performance. We should not forget however that ATP brings many other advantages that are a big plus for several applications.

In all the cases, I still believe that for most of the applications the current latency between Oracle Cloud and Azure is good enough and for well behaving applications (forget fetch one row at the time!) it should not be a problem.


OCI / Azure and Oracle DB: firewall and security rules

Many of the problems with VMs on Clouds are related to firewall or security rules. Here some elements that might help.

These examples are based on Oracle Linux 7.7 virtual machines.

Firewall: check if it is running

When is loaded and running (active) is shows like:

[opc@alfama ~]$ sudo systemctl status firewalld 
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-05-15 05:19:12 GMT; 15min ago
Docs: man:firewalld(1)
Main PID: 1640 (firewalld)
Tasks: 2
Memory: 33.2M
CGroup: /system.slice/firewalld.service
└─1640 /usr/bin/python2 -Es /usr/sbin/firewalld --nofork --nopid

When it is loaded but stopped:

[opc@alfama ~]$ sudo systemctl status firewalld -l
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Fri 2020-05-15 05:36:09 GMT; 7s ago
Docs: man:firewalld(1)
Process: 1640 ExecStart=/usr/sbin/firewalld --nofork --nopid $FIREWALLD_ARGS (code=exited, status=0/SUCCESS)
Main PID: 1640 (code=exited, status=0/SUCCESS)

Firewall: check ports open

[opc@boavista ~]$ sudo firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: eth0
  sources:
  services: dhcpv6-client ssh
  ports: 1521/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

Firewall: open Oracle listener port

If we want to open the default listener port, two commands are needed. One for adding the rule and another to enable the new rule:

[opc@mouraria bin]$ sudo firewall-cmd --zone=public --permanent --add-port=1521/tcp
[opc@mouraria bin]$ sudo firewall-cmd --reload

Firewall: allow ping reponses

Ping uses a separate protocol. To allow the VMs to answer pings, this rule needs to be added to the firewall:

[opc@mouraria bin]$ sudo firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -p icmp -s 0.0.0.0/0 -d 0.0.0.0/0 -j ACCEPT
[opc@mouraria bin]$ sudo systemctl restart firewalld.service

On OCI, to be able to ping between hosts, it might be necessary to add an Ingress rule that allows pings. The rule should look like below. The most important is the Protocol ICMP and type 8. The source CIDR should not be larger than the VCN CIDR.

Test if port is open:

To check if the remote port is open, I use the follow commands. This works both with IPs or hostnames and also lets you know if port is open, but listener is not running (on this port):

[opc@alfama ~]$ ### PORT UNREACHEABLE, SERVER DOWN/WRONG? ###
[opc@alfama ~]$ export CHECK_IP="10.1.2.4/1521"
[opc@alfama ~]$ timeout 1 bash -c '</dev/tcp/${CHECK_IP} &amp;&amp; echo Port ${CHECK_IP} is open || echo Port ${CHECK_IP} is closed' || echo Connection timeout
Connection timeout

[opc@alfama ~]$ ### PORT OPEN, USING HOSTNAME ###
[opc@alfama ~]$ export CHECK_IP="luz.subnetpriv2/1521"
[opc@alfama ~]$ timeout 1 bash -c '</dev/tcp/${CHECK_IP} &amp;&amp; echo Port ${CHECK_IP} is open || echo Port ${CHECK_IP} is closed' || echo Connection timeout
Port luz.subnetpriv2/1521 is open

[opc@alfama ~]$ ### PORT OPEN, USING IP ###
[opc@alfama ~]$ timeout 1 bash -c '</dev/tcp/${CHECK_IP} &amp;&amp; echo Port ${CHECK_IP} is open || echo Port ${CHECK_IP} is closed' || echo Connection timeout
Port 10.1.5.3/1521 is open

[opc@alfama ~]$ ### PORT CLOSED ###
[opc@alfama ~]$ export CHECK_IP="luz.subnetpriv2/1522"
[opc@alfama ~]$ timeout 1 bash -c '</dev/tcp/${CHECK_IP} &amp;&amp; echo Port ${CHECK_IP} is open || echo Port ${CHECK_IP} is closed' || echo Connection timeout
bash: connect: No route to host
bash: /dev/tcp/luz.subnetpriv2/1522: No route to host
Port luz.subnetpriv2/1522 is closed

[opc@alfama ~]$ ### PORT OPEN but LISTENER DOWN ###
[opc@alfama ~]$ timeout 1 bash -c '</dev/tcp/${CHECK_IP} &amp;&amp; echo Port ${CHECK_IP} is open || echo Port ${CHECK_IP} is closed' || echo Connection timeout
bash: connect: Connection refused
bash: /dev/tcp/10.1.5.3/1521: Connection refused
Port 10.1.5.3/1521 is closed

Bonus

At Oracle OCI, one can access other VMs of the VCN using the <hostname>.<subnet>

On Azure, the <hostname> is enough to access other host of the VNet.


SSH tips & tricks

These days I’ve been playing with Oracle Cloud and Azure. It is best practice to have a bastion or jumphost server with public IP address and all the rest on private networks, only accessible internally.

Also, there is no use of passwords and only the ssh public key of the user is located on the servers. Usually this is what we try to do:

To start with, I use MobaXterm to access my servers. There I’ve configured my SSH private keys to be loaded:

MobaXterm configuration menu for SSH

Connect to jumphost server and take SSH key

Use the -A option when connecting:

ssh -A opc@bastion-server

Then you can do ssh to the next server in the private subnet without password. Example of not using and using this option:

Connect to private server via jumphost directly

For this we use both -A and -J options:

ssh -A -J opc@bastion-server opc@private-server

SCP directly to private server via JumpHost

Here we have to use a ProxyJump option of scp in this case:

scp -o 'ProxyJump opc@bastion-server' file.zip opc@private-server:~/


Attention OPatch 12.2.0.1.19: ‘opatch util cleanup’ command deletes files from /etc, /bin, /lib 2

On January 28th 2020 I reported to Oracle on a Severy 1 SR that the latest OPatch version tries to delete system files from /etc, /bin, /lib :

I wrote in the SR the use case, provided the logs. On my case, the opatch cleanup command “just” deleted the configuration files of my backup client in /etc (opatch deletes files that are writable by the user who calls the command – ‘oracle’/’grid’ in most of the cases).

The location from where ‘opatch cleanup util’ tries to delete files depends from where you call it! So, if you call from / or from $ORACLE_BASE or from $ORACLE_HOME/OPatch you’ll get different behaviours!

It took two months until I got an answer to that SR, saying a bug was created and it will be fixed on the next version of OPatch:

Today there was a release of April 2020 Security patches, the latest Oracle Patch Bundles. To my surprise, for Oracle versions 12.2, 18c and 19c, there is the requirement of using OPatch 12.2.0.1.19 version, the exact same version that has this dangerous bug!

The bug reported end of January was not yet fixed! At least, on OPatch readme there is now a warning:

So, be careful until then and do not use ‘opatch util cleanup’. Anyway, this command is not anymore needed, as a cleanup (without deleting any wrong file) of the $ORACLE_HOME/.patch_storage directory is automatically done at the end of the patching.


Oracle 19c – TFA and Oracle Restart (SIHA)

It took Oracle two months and some persuasion effort to make the analyst to read and believe what I’ve written on the Service Request, but the answer finally come from development:

TFA is disabled in all 19.* SIHA versions. Will let you know once enabled.
You can download TFA from MOS and install it.

My original request in February was:

I’m installing Oracle Restart in silent mode.

I see that TFA is provided both in GI and RDBMS homes under $ORACLE_HOME/suptools/tfa/release/tfa_home

How can I make it to install THIS VERSION OF TFA (and not downloaded from Oracle Support) automatically part of a silent mode installation?

At the moment, when it runs the root.sh from RDBMS installation, it calls both
. /u00/app/oracle/product/19.6.0/install/utl/rootmacro.sh "$@"
and
/u00/app/oracle/product/19.6.0/suptools/tfa/release/tfa_home/install/roottfa.sh

The rootmacro.sh sets:
INSTALL_TFA=0

On the output of the root.sh script there is:

Oracle Trace File Analyzer (TFA) is available at : /u00/app/oracle/product/19.6.0/bin/tfactl

but this does not exist:
-bash: /u00/app/oracle/product/19.6.0/bin/tfactl: No such file or directory

So, I repeat the question:
How can I make it to install the TFA provided part of GI/RDBMS home (and not downloaded from Oracle Support) AUTOMATICALLY and PART OF SILENT MODE INSTALLATION?


Oracle 19c – install JVM

When you did not install Java / JVM directly during database creation, you can do it in a later phase. This operation unfortunately is not so well documented. Here how I do it:

set echo on
alter pluggable database all open;
alter system set "_system_trig_enabled" = false scope=memory;
 
host $ORACLE_HOME/perl/bin/perl ${ORACLE_HOME}/rdbms/admin/catcon.pl -n 1 -l /tmp/jvm -b initjvm ${ORACLE_HOME}/javavm/install/initjvm.sql;
host $ORACLE_HOME/perl/bin/perl ${ORACLE_HOME}/rdbms/admin/catcon.pl -n 1 -l /tmp/jvm -b initxml ${ORACLE_HOME}/xdk/admin/initxml.sql;
host $ORACLE_HOME/perl/bin/perl ${ORACLE_HOME}/rdbms/admin/catcon.pl -n 1 -l /tmp/jvm -b catjava ${ORACLE_HOME}/rdbms/admin/catjava.sql;
select count(*), object_type from all_objects where object_type like '%JAVA%' group by object_type;
 
shutdown immediate
startup
host $ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -n 1 -l /tmp/jvm -e -b utlrp $ORACLE_HOME/rdbms/admin/utlrp.sql
SELECT comp_name, version, status FROM dba_registry ORDER BY comp_name;
SELECT dbms_java.get_jdk_version JDK_Version FROM dual;


Oracle 19c – Install Spatial

Oracle Spatial is now free to users of all Oracle versions. To install, just run on your CDB:

host $ORACLE_HOME/perl/bin/perl $ORACLE_HOME/rdbms/admin/catcon.pl -n 1 -l /tmp/jvm -b install_spatial $ORACLE_HOME/md/admin/mdinst.sql
set lines 300
col comp_name for a50
SELECT comp_name, version, status FROM dba_registry ORDER BY comp_name;

Tales of Oracle 19c upgrade – ORA-24964: ALTER SESSION SET CONTAINER error

At my client we have a nice automation tool that can run scripts on several DBs based on their group on Enterprise Manager.

Last week I’ve migrated+upgraded OEM repository from 12.2 to 19c, using Refreshable PDB to copy the PDB from a 12.2 CDB to a 19c CDB + running dbupgrade script on the PDB (p_oem_t below). Everything worked nice.

My colleague yesterday tells me the automation tool is not working, giving “ORA-0942 Table or view does exist error” for a query on the OEM repository. This means to me, it can connect to the new repository, as the old one is shutdown. But where can the problem be?

I look at the code, find the query, run it on SQL Developer and all if fine, no errors.

This until I activate the debug modus of the automation tool and I see that it connects to the CDB as SYSDBA and then changes container to the right PDB. Just above the ORA-0942 error, there was a

ORA-24964: ALTER SESSION SET CONTAINER error

Looking around I could find this Metalink note:

ORA-24964 – Alter Session To 12.2 PDB From 12.1 Client (Doc ID 2494623.1)

The keyword on the “Symptoms” is “Upgraded DB”. The code description states:

Cause: An attempt to switch to a PDB with different settings such as character set, time zone or time zone file version on an Oracle 12c Release 1 (12.1) or earlier client failed. (see Docs)

In fact, our automation tool uses an old 11.2 client. To test it, I’ve created on the same 19c CDB also new fresh PDB (PDB01) and tried to connect remotely:

oracle@l-master $ sqlplus sys@\"l-oem19c_t/c_oem_t\" as sysdba

 SQL*Plus: Release 11.2.0.4.0 Production on Thu Jan 23 17:06:27 2020
 Enter password:
  
 Connected to:
 Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
  
 SQL> select pdb_name from dba_pdbs;
 PDB_NAME
 ------------------------------
 PDB$SEED
 P_OEM_T 
 PDB01
  
 SQL> alter session set container=PDB01;
 Session altered.
  
 SQL>  alter session set container=CDB$ROOT;
 Session altered.
  
 SQL> alter session set container=P_OEM_T;
 ERROR:
 ORA-24964: ALTER SESSION SET CONTAINER error 

When I tried the same but using a 19c client, it works without problem:

oracle@l-master-19c $ sqlplus sys@\"l-oem19c_t/c_oem_t\" as sysdba
  
 SQL*Plus: Release 19.0.0.0.0 - Production on Thu Jan 23 17:08:55 2020
 Version 19.3.0.0.0
  
 Copyright (c) 1982, 2019, Oracle.  All rights reserved.
  Enter password:
  
 Connected to:
 Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
 Version 19.4.0.0.0
  
 SQL> select pdb_name from dba_pdbs;
  
 PDB_NAME
 ---------------------------------
 PDB$SEED
 P_OEM_T
 PDB01
  
 SQL> alter session set container=PDB01;
 Session altered.
  
 SQL>  alter session set container=CDB$ROOT;
 Session altered.
  
 SQL> alter session set container=P_OEM_T;
 Session altered.


Install Oracle Locator in Oracle 19c 1

The documentation lacks a lot concerning this free feature. Only it says that was part of Oracle Multimedia which is now desupported, but Oracle Locator continues to be supported.

https://docs.oracle.com/en/database/oracle/oracle-database/19/spatl/oracle-locator.html#GUID-EC6DEA23-8FD7-4109-A0C1-93C0CE3D6FF2
https://docs.oracle.com/en/database/oracle/oracle-database/19/upgrd/behavior-changes-deprecated-desupport-oracle-database.html#GUID-BABC1C60-EA07-4EBE-8C67-B69B59E4F742

Well, after some try and error, checking scripts and so on, here is the simple installation of Oracle Locator in Oracle 19c, container architecture:

-- Connect sys to the Root container (CDB$ROOT) and run:
SQL> @?/md/admin/catmdloc.sql

-- Then connect to the PDB where you need Oracle Locator and run again:
SQL> alter session set container=MY_PDB 
SQL> @?/md/admin/catmdloc.sql

And that’s it. The script creates the MDSYS user without authentication, it uses the SYSAUX tablespace and creates a directory pointing to $ORACLE_HOME/md/admin and then tables, including external tables with sharing=object.

So, it is easier than before, no need to first create tablespace or user as it was the case in previous releases.

If you don’t run the script first on the root container, you’ll get errors regarding the external tables with sharing=object.


Check if Unified Audit is enabled without database running

I wanted to check if the binaries had Unified Audit enabled without using a database (because it was not yet created).

This is part of libknlopt.a library. Depending on which files are inside, it enables or disables options.

For Unified Auditing your can use:

case $(ar t ${ORACLE_HOME}/rdbms/lib/libknlopt.a | grep 'kza.ang.o') in "kzaiang.o") R="enabled" ;; "kzanang.o") R="disabled" ;; *) R="unknown" ;; esac && echo "Unified audit => $R"

For many other options (but not yet Unified Audit) you can check MOS note: How to Check and Enable/Disable Oracle Binary Options (Doc ID 948061.1)