Oakies Blog Aggregator

IM_DOMAIN$

A few months ago Franck Pachot wrote about a recursive SQL statement that kept appearing in the library cache. I discovered the note today because I had just found a client site where the following statement suddenly appeared near the top of the “SQL ordered by Executions” section of their AWR reports after they had upgraded to 18c.


select domain# from sys.im_domain$ where objn = :1 and col# = :2

I found Franck’s article by the simple expedient of typing the entire query into a Google search – his note was the first hit on the list, and he had a convenient example (based on the SCOTT schema) to demonstrate the effect, so I built the tables from the schema and ran a simple test with extended SQL tracing (event 10046) enabled.
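
For anyone who wants to reproduce the test, the tracing itself is just the standard 10046 mechanism. A minimal sketch follows (the tracefile identifier is an arbitrary label chosen for illustration; level 12 simply adds bind and wait details to the basic SQL trace):

alter session set tracefile_identifier = 'im_domain_test';
alter session set events '10046 trace name context forever, level 12';

select /*+ leading(EMP DEPT) USE_HASH(DEPT) USE_HASH(BONUS) */ *
from DEPT natural join EMP natural join BONUS;

alter session set events '10046 trace name context off';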

Here’s an extract (with no deletions) from the resulting trace file:

PARSING IN CURSOR #139819795591784 len=110 dep=0 uid=104 oct=3 lid=104 tim=31306461773 hv=3572295767 ad='6bf8b8a0' sqlid='8n2bcc3aftu2r'
select /*+ leading(EMP DEPT) USE_HASH(DEPT) USE_HASH(BONUS) */ * from DEPT natural join EMP natural join BONUS
END OF STMT
PARSE #139819795591784:c=59,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306461772

PARSING IN CURSOR #139819795585328 len=64 dep=1 uid=0 oct=3 lid=0 tim=31306461966 hv=1240924087 ad='69a8b760' sqlid='0b639nx4zdzxr'
select domain# from sys.im_domain$ where objn = :1 and col# = :2
END OF STMT
PARSE #139819795585328:c=37,e=37,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306461965
EXEC #139819795585328:c=32,e=31,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306462058
FETCH #139819795585328:c=17,e=17,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306462098
STAT #139819795585328 id=1 cnt=0 pid=0 pos=1 obj=10422 op='TABLE ACCESS BY INDEX ROWID IM_DOMAIN$ (cr=1 pr=0 pw=0 str=1 time=21 us cost=0 size=39 card=1)'
STAT #139819795585328 id=2 cnt=0 pid=1 pos=1 obj=10423 op='INDEX UNIQUE SCAN IM_DOMAIN_UK (cr=1 pr=0 pw=0 str=1 time=18 us cost=0 size=0 card=1)'
CLOSE #139819795585328:c=5,e=5,dep=1,type=1,tim=31306462287

EXEC #139819795591784:c=484,e=484,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306462316
FETCH #139819795591784:c=0,e=804,p=0,cr=44,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306463191
STAT #139819795591784 id=1 cnt=0 pid=0 pos=1 obj=0 op='HASH JOIN  (cr=45 pr=0 pw=0 str=1 time=1222 us cost=72 size=97 card=1)'
STAT #139819795591784 id=2 cnt=4 pid=1 pos=1 obj=0 op='HASH JOIN  (cr=45 pr=0 pw=0 str=1 time=1001 us cost=70 size=232 card=4)'
STAT #139819795591784 id=3 cnt=4 pid=2 pos=1 obj=117764 op='TABLE ACCESS FULL EMP (cr=22 pr=0 pw=0 str=1 time=259 us cost=35 size=152 card=4)'
STAT #139819795591784 id=4 cnt=4 pid=2 pos=2 obj=117765 op='TABLE ACCESS FULL DEPT (cr=22 pr=0 pw=0 str=1 time=81 us cost=35 size=80 card=4)'
STAT #139819795591784 id=5 cnt=0 pid=1 pos=2 obj=117766 op='TABLE ACCESS FULL BONUS (cr=0 pr=0 pw=0 str=1 time=4 us cost=2 size=39 card=1)'
CLOSE #139819795591784:c=24,e=24,dep=0,type=1,tim=31306508552

PARSE #139819795591784:c=41,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306508798
PARSE #139819795585328:c=21,e=22,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509010
EXEC #139819795585328:c=132,e=132,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509220
FETCH #139819795585328:c=20,e=19,p=0,cr=1,cu=0,mis=0,r=0,dep=1,og=4,plh=2321277860,tim=31306509415
CLOSE #139819795585328:c=8,e=8,dep=1,type=3,tim=31306509494
EXEC #139819795591784:c=682,e=704,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306509558
FETCH #139819795591784:c=588,e=1246,p=0,cr=44,cu=0,mis=0,r=0,dep=0,og=1,plh=4262704031,tim=31306511014
CLOSE #139819795591784:c=23,e=22,dep=0,type=3,ti

As you can see, every time I do a parse call for the query against the SCOTT schema (PARSE #139819795591784), Oracle does a parse/exec/fetch/close for the query against im_domain$ (PARSE #139819795585328) – and this happens even when the SCOTT query is in the session cursor cache!

As Franck points out, this looks like something to do with the In Memory option even though the option wasn’t enabled in his database, and wasn’t enabled in my client’s database. Once you’ve got a reproducible example of a problem, though, you can start fiddling to see if you can bypass it. In this case I decided to check all the parameters to do with the in-memory option – which is a bit like hard work because there are 208 parameters that include the word “inmemory”. After checking the descriptions of the first twenty or so I decided there was an easier option – if Oracle is asking about “domains” for columns possibly it’s something to do with the relatively new “join group” feature for in-memory columnar compression, so I ran a query to produce the list of parameter names and descriptions for parameters with the words “join” and “group” in their names – there are two:
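
The list itself came from the x$ structure underneath v$parameter, queried as SYS. Something like the following sketch (not necessarily the exact statement I used):

select  ksppinm         parameter_name,
        ksppdesc        description
from    x$ksppi
where   ksppinm like '%join%group%'
order by ksppinm
;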


_sqlexec_join_group_aware_hj_enabled              enable/disable join group aware hash join
_sqlexec_join_group_aware_hj_unencoded_rowset     minimum number of unencoded rowsets processed before adaptation 

The first one looks rather promising – it has a default value of TRUE, and it can be changed by “alter session” or “alter system”. So I executed:


alter session set "_sqlexec_join_group_aware_hj_enabled" = false;
alter system flush shared_pool;

Then I ran my test again and voila! there it wasn’t. No more queries against im_domain$.

Problem solved (until the client decides they want to use the in-memory option, perhaps).

There may be other reasons why this recursive query appears which aren’t to do with hash joins, of course, but join groups exist specifically to allow efficient hash joins with the in-memory option (it’s a trick to do with common encoding for compression that allows Bloom filtering to eliminate CUs without decoding), so I’m hoping I won’t have to track down and eliminate other sources of the query.

 

Oracle Database 19c Automatic Indexing: Predicted Back In 2008 (A Better Future)

I’ve recently received a number of correspondences regarding one of my most popular blog posts, dating back to February 2008: Index Rebuild vs. Coalesce vs. Shrink Space (Pigs – 3 Different Ones). In the comments section, there’s an interesting discussion where I mention: “If Oracle19 does everything for you and all the various indexes structures get […]

High CPU usage in docker-proxy with chatty database application? Disable userland-proxy!

Or just keep database and application co-located :)

It is well known from the get-go, but very often overlooked out of ignorance or laziness: the database application must be co-located with the database server. Row-by-row roundtrips between the application and the database are expensive, not only because of the network latency, but also because of the many CPU cycles wasted switching context between the two engines, the two processes, and maybe the two servers.

In modern architectures, with microservices and containers, this means that a business service must be implemented in one microservice containing the business logic and the business data. Separating the application and the database into two microservices is a poor design: inefficient, non-scalable, and also non-green because of the unnecessary CPU usage.

Docker

I was building a new demo for this, as in the previous post, where I compare running the procedural code on the client side or the server side of the database. When I ran my database in a Docker container, I saw that the bad performance I wanted to demonstrate was even worse than expected:

  • the symptom was high CPU usage in the “docker-proxy” process
  • the cause was that I was using the default Docker userland proxy

Here is the related Twitter thread. Thanks to @G_Ceresa, @ochoa_marcelo, and @ofirm for the quick replies about the cause and solution:

SQL, PL/SQL and JavaScript running in the Database Server (Oracle MLE)

In a previous post I measured the CPU usage when running a database transaction in the same engine (SQL), or two engines in the same process (PL/SQL + SQL or JavaScript + SQL) or two processes (Javascript client + server SQL):

ODC Appreciation Day : Reduce CPU usage by running the business logic in the Oracle Database

For the JavaScript + SQL case running in the same process, I used the Oracle Multi-Lingual Engine beta 0.2.7, but there is now a new beta 0.3.0, and this post runs the same (or a similar) test with it.

I’ve installed this MLE in a previous post:

Oracle Multi-Lingual Engine

And here is the demo where I run 400000 amount transfers between accounts. Here are the tables:

SQLcl: Release 18.4 Production on Sun Mar 17 15:42:34 2019
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Last Successful login time: Sun Mar 17 2019 15:42:36 +00:00
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0_MLE - 64bit Beta
15:42:39 SQL> create table CUSTOMERS (
CUSTOMER_ID number generated always as identity
constraint CUSTOMER_PK primary key,
CUSTOMER_NAME varchar2(42)
);
Table created.
15:42:42 SQL> create table ACCOUNTS (
ACCOUNT_ID varchar2(10) constraint ACCOUNT_PK primary key,
CUSTOMER_ID number,
AMOUNT number default 0
);
Table created.
15:42:46 SQL> insert /*+ append */ into CUSTOMERS (CUSTOMER_NAME)
select x from (
select to_char( date'-4712-01-01'+rownum-1,'Jsp') x
from xmltable('1 to 1000000')
) where length(x)=42 and rownum<=4000;
4000 rows created.
15:42:49 SQL> commit;
Commit complete.
15:42:51 SQL> select * from CUSTOMERS 
order by CUSTOMER_ID fetch first 10 rows only;
  CUSTOMER_ID CUSTOMER_NAME
----------- ------------------------------------------
1 Three Thousand Three Hundred Seventy-Three
2 Three Thousand Three Hundred Seventy-Seven
3 Three Thousand Three Hundred Seventy-Eight
4 Three Thousand Seven Hundred Seventy-Three
5 Three Thousand Seven Hundred Seventy-Seven
6 Three Thousand Seven Hundred Seventy-Eight
7 Three Thousand Eight Hundred Seventy-Three
8 Three Thousand Eight Hundred Seventy-Seven
9 Three Thousand Eight Hundred Seventy-Eight
10 Seven Thousand Three Hundred Seventy-Three
10 rows selected.
15:42:54 SQL> insert /*+ append */ into ACCOUNTS                   
(ACCOUNT_ID,CUSTOMER_ID,AMOUNT)
select 'X'||to_char(rownum,'FM0999999'),CUSTOMER_ID,10000
from CUSTOMERS cross join xmltable('1 to 100')
;
400000 rows created.
15:42:57 SQL> commit;
Commit complete.
15:42:58 SQL> commit;
Commit complete.
15:43:15 SQL> select * from ACCOUNTS 
order by ACCOUNT_ID fetch first 10 rows only;
ACCOUNT_ID     CUSTOMER_ID   AMOUNT
---------- ----------- ------
X0000001 1150 10000
X0000002 1151 10000
X0000003 1152 10000
X0000004 1153 10000
X0000005 1154 10000
X0000006 1155 10000
X0000007 1156 10000
X0000008 1157 10000
X0000009 1158 10000
X0000010 1159 10000
10 rows selected.
15:43:16 SQL> select /*+ full(ACCOUNTS) cache(ACCOUNTS) */ 
count(*),avg(amount) from ACCOUNTS;
  COUNT(*)   AVG(AMOUNT)
-------- -----------
400000 10000

I have a ‘show-cpu-seconds-from-ps.sh’ script that displays the cputime delta from ps output.

Here I run all in one SQL statement: 5 seconds of CPU

15:44:23 SQL> set timing on
15:44:28 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh init to reset cputime counter
15:44:30 SQL> update ACCOUNTS set AMOUNT=
2 case
3 when ACCOUNT_ID='X0000001' then AMOUNT+(select 1*count(*) from ACCOUNTS where ACCOUNT_ID<>'X0000001')
4 else AMOUNT-1
5 end
6 /
400000 rows updated.
Elapsed: 00:00:04.451
15:44:43 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh diff to show the delta cputime
5 cpu seconds in pid= 19971 oracleCDB1 (LOCAL=NO)
15:44:43 SQL> select * from ACCOUNTS order by ACCOUNT_ID fetch first 10 rows only;
ACCOUNT_ID CUSTOMER_ID AMOUNT
X0000001 1150 409999
X0000002 1151 9999
X0000003 1152 9999
X0000004 1153 9999
X0000005 1154 9999
X0000006 1155 9999
X0000007 1156 9999
X0000008 1157 9999
X0000009 1158 9999
X0000010 1159 9999
10 rows selected.
Elapsed: 00:00:00.019
15:44:43 SQL> rollback;
Rollback complete.
Elapsed: 00:00:04.158

That is the actual CPU cost of updating those 400000 account amounts: 5 seconds. And the rollback is about the same.

Now with a PL/SQL procedure: 30 seconds of CPU (because of the context switches between the PL/SQL and SQL engines)

15:44:47 SQL> create or replace procedure transfer(acc1 varchar2, acc2 varchar2, amount number) as
2 begin
3 -- debit acc1
4 update ACCOUNTS set ACCOUNTS.AMOUNT = ACCOUNTS.AMOUNT - transfer.amount where ACCOUNT_ID=acc1;
5 if sql%rowcount <> 1 then raise_application_error(-20000,'Account '''||acc1||''' unknown'); end if;
6 -- credit acc2
7 update ACCOUNTS set ACCOUNTS.AMOUNT = ACCOUNTS.AMOUNT + transfer.amount where ACCOUNT_ID=acc2;
8 if sql%rowcount <> 1 then raise_application_error(-20000,'Account '''||acc2||''' unknown'); end if;
9 end;
10 /
Procedure created.
Elapsed: 00:00:00.113
15:46:11 SQL> desc transfer
PROCEDURE transfer
Argument Name Type In/Out Default?
ACC1 VARCHAR2 IN
ACC2 VARCHAR2 IN
AMOUNT NUMBER IN
15:46:38 SQL> set timing on
15:46:41 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh init to reset cputime counter
15:46:43 SQL> exec for c in (select * from ACCOUNTS where ACCOUNT_ID<>'X0000001') loop transfer(c.ACCOUNT_ID,'X0000001',1); end
loop;
PL/SQL procedure successfully completed.
Elapsed: 00:00:30.283
15:47:15 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh diff to show the delta cputime
30 cpu seconds in pid= 19971 oracleCDB1 (LOCAL=NO)
15:47:38 SQL> select * from ACCOUNTS order by ACCOUNT_ID fetch first 10 rows only;
ACCOUNT_ID CUSTOMER_ID AMOUNT
X0000001 1150 409999
X0000002 1151 9999
X0000003 1152 9999
X0000004 1153 9999
X0000005 1154 9999
X0000006 1155 9999
X0000007 1156 9999
X0000008 1157 9999
X0000009 1158 9999
X0000010 1159 9999
10 rows selected.
Elapsed: 00:00:00.015
15:47:43 SQL> rollback;
Rollback complete.
Elapsed: 00:00:04.266

Now with a JavaScript client: 157 seconds of CPU (still two engines, but now in two processes: the JDBC client and the database server process).

15:48:38 SQL> disconnect
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.1.0_MLE - 64bit Beta
15:48:54 SQL> script
2 var DriverManager = Java.type("java.sql.DriverManager");
3 if ( ! con === undefined ) { con.rollback(); con.close(); }
4 var con = DriverManager.getConnection("jdbc:oracle:thin:@//localhost/PDB1","demo","demo");
5 con.setAutoCommit(false);
6 var sql = con.createStatement();
7 .
15:49:10 SQL> save script01-init.js replace
Wrote file script01-init.js
15:49:16 SQL> @ script01-init.js
Elapsed: 00:00:01.019
15:49:18 SQL> script
2 print("First 10 accounts:");
3 var res=sql.executeQuery(" select ACCOUNT_ID,AMOUNT from ACCOUNTS order by 1 fetch first 10 rows only");
4 while(res.next()){print(" ACCOUNT_ID: "+res.getString(1)+" "+"AMOUNT: "+res.getString(2)); }
5 .
15:49:33 SQL> save script02-query.js replace
Wrote file script02-query.js
15:49:35 SQL> @ script02-query.js
First 10 accounts:
ACCOUNT_ID: X0000001 AMOUNT: 10000
ACCOUNT_ID: X0000002 AMOUNT: 10000
ACCOUNT_ID: X0000003 AMOUNT: 10000
ACCOUNT_ID: X0000004 AMOUNT: 10000
ACCOUNT_ID: X0000005 AMOUNT: 10000
ACCOUNT_ID: X0000006 AMOUNT: 10000
ACCOUNT_ID: X0000007 AMOUNT: 10000
ACCOUNT_ID: X0000008 AMOUNT: 10000
ACCOUNT_ID: X0000009 AMOUNT: 10000
ACCOUNT_ID: X0000010 AMOUNT: 10000
Elapsed: 00:00:00.181
15:49:37 SQL> script
2 var pre1=con.prepareStatement(" update ACCOUNTS set AMOUNT=AMOUNT-? where ACCOUNT_ID=?");
3 var pre2=con.prepareStatement(" update ACCOUNTS set AMOUNT=AMOUNT+? where ACCOUNT_ID=?");
4 function transfer (acc1,acc2,amount) {
5 pre1.setInt(1,amount); pre1.setString(2,acc1); pre1.execute();
6 pre2.setInt(1,amount); pre2.setString(2,acc2); pre2.execute();
7 }
8 var res=sql.executeQuery(" select ACCOUNT_ID from ACCOUNTS where ACCOUNT_ID<>'X0000001'");
9 print("Calling transaction for each account...");var t0=new Date();var cnt=0;
10 while(res.next()){ transfer(res.getString(1),'X0000001',1); cnt++ }
11 print(cnt+" transactions executed in "+(new Date() - t0)/1000+" seconds");
12 .
15:50:17 SQL> save script02-run.js replace
Wrote file script02-run.js
15:50:18 SQL> set timing on
15:50:22 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh init to reset cputime counter
15:50:25 SQL> @    script02-run.js
Calling transaction for each account...
399999 transactions executed in 138.016 seconds
Elapsed: 00:02:18.082
15:52:45 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh diff to show the delta cputime
52 cpu seconds in pid= 19945 /opt/oracle/product/12.2.0.1/dbhome_1/jdk/jre/bin/java -Djava.awt.headless=true -Dappl
e.awt.UIElement=true -Xss10M -client
105 cpu seconds in pid= 20426 oracleCDB1 (LOCAL=NO)
15:52:56 SQL> @    script02-query.js
First 10 accounts:
ACCOUNT_ID: X0000001 AMOUNT: 409999
ACCOUNT_ID: X0000002 AMOUNT: 9999
ACCOUNT_ID: X0000003 AMOUNT: 9999
ACCOUNT_ID: X0000004 AMOUNT: 9999
ACCOUNT_ID: X0000005 AMOUNT: 9999
ACCOUNT_ID: X0000006 AMOUNT: 9999
ACCOUNT_ID: X0000007 AMOUNT: 9999
ACCOUNT_ID: X0000008 AMOUNT: 9999
ACCOUNT_ID: X0000009 AMOUNT: 9999
ACCOUNT_ID: X0000010 AMOUNT: 9999
Elapsed: 00:00:00.015
15:53:13 SQL> script
2 con.rollback();
3 con.close();
4 .
15:53:20 SQL> save script02-close.js replace
Wrote file script02-close.js
15:53:22 SQL> @ script02-close.js
Elapsed: 00:00:06.198

And finally, running JavaScript in the MLE engine: 223 seconds of CPU. This MLE is in beta and may not be fully optimized, so the absolute time is not very relevant. The point is that the whole thing runs 100% in the same process.

15:53:31 SQL> connect demo/demo@//localhost/pdb1
Connected.
15:55:34 SQL> create or replace javascript source named "demo.js" as
2 function transfer (acc1,acc2,amount) {
3 var sql = _dbRequire('@oracle/sql');
4 sql.execute(" update ACCOUNTS set AMOUNT=AMOUNT-:amount where ACCOUNT_ID=:acc1",[amount,acc1]);
5 sql.execute(" update ACCOUNTS set AMOUNT=AMOUNT+:amount where ACCOUNT_ID=:acc2",[amount,acc2]);
6 }
7 module.exports.run = function () {
8 var sql = _dbRequire('@oracle/sql');
9 var res=sql.execute(" select ACCOUNT_ID from ACCOUNTS where ACCOUNT_ID<>'X0000001'");
10 for (var row of res.rows) {
11 transfer(row[0],'X0000001',1);
12 }
13 }
14 /
Function created.
Elapsed: 00:00:00.013
15:56:02 SQL> create or replace procedure run as language javascript
2 name 'demo\.js.run()';
3 /
Procedure created.
Elapsed: 00:00:00.032
15:56:14 SQL> select * FROM user_libraries;
no rows selected
Elapsed: 00:00:00.122
15:56:19 SQL> select object_name,procedure_name,object_type from user_procedures;
OBJECT_NAME PROCEDURE_NAME OBJECT_TYPE
RUN PROCEDURE
TRANSFER PROCEDURE
Elapsed: 00:00:00.291
15:56:21 SQL> select object_name,procedure_name,object_type from user_procedures;
OBJECT_NAME PROCEDURE_NAME OBJECT_TYPE
RUN PROCEDURE
TRANSFER PROCEDURE
Elapsed: 00:00:00.012
15:56:36 SQL> set timing on
15:56:51 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh init to reset cputime counter
15:56:53 SQL> call demo.run();
Call completed.
Elapsed: 00:03:32.463
16:00:28 SQL> host sh /tmp/sqlcl/show-cpu-seconds-from-ps.sh diff to show the delta cputime
223 cpu seconds in pid= 20761 oracleCDB1 (LOCAL=NO)

The important point with this MLE engine is that you can write your code without being tied to a platform. You write code (or use libraries of code, copy/paste from StackOverflow…) in the latest trendy languages (JavaScript and Python for the moment, whatever comes in the future). And it can run on the client, on the application server, or in the database. Then the best co-location of the code can be achieved without duplicating the logic in different languages. In summary, the developer thinks “serverless” for simplicity and agility, and operations run it “full server” for efficiency and scalability.

But that’s for the future. Follow the MLE and be involved in the community:

Space: Multilingual Engine | Oracle Community

For your curiosity, here is the perf-top output for the last run in MLE, showing the work in the oracle and libmle.so engines:

Oracle Multi-Lingual Engine

PI-Day example

Here is a very quick and easy test of the Oracle MLE, the engine that lets you run JavaScript or Python stored procedures in the Oracle Database (currently in beta).

The MLE beta is provided as a docker image containing the Oracle Database 12.2 with the additional MLE libraries. I have created a VM on the Oracle Cloud in order to test it and show an end-to-end demo on Oracle Linux 7.

Get software

Here is where to download the database server with MLE beta:

Oracle Database MLE Download

and the SQLcl client

SQLcl Downloads

I’ll download them from my VM, where I have no browser, but downloading Oracle software requires human intervention to accept the license terms. So what I do is start the download from my laptop (where I can sign in with my Oracle account and accept the license), suspend the download, and copy the link, which contains an ‘AuthParam’ value. Then I can use this URL to download the files with wget on my VM, into the /tmp directory:

Once I have this in the /tmp directory, I install and start docker, unzip SQLcl and create the MLE container:

sudo su
# install docker and java from OL7 Add-Ons
yum-config-manager --enable ol7_addons
yum install -y docker-engine docker-compose java
#start docker
systemctl start docker
docker info | grep -A2 ^Storage
# install SQLcl, the client to connect to the database
unzip -d /tmp /tmp/sqlcl-18.4.0.007.1818.zip*
# start the MLE beta
docker load --input /tmp/mle-docker-0.3.0.tar.gz*
cat > /tmp/mle.yml <<'CAT'
version: '3.1'
services:
  ora122MLE:
    image: mle-docker:0.3.0
    restart: always
    environment:
      ORACLE_SID: CDB1
      ORACLE_PDB: PDB1
      ORACLE_PWD: oracle
    ports:
      - 1521:1521
    volumes:
      - "/var/tmp/oradata:/opt/oracle/oradata"
      - "/tmp:/home/oracle/myproject"
CAT
mkdir -p /var/tmp/oradata
docker-compose -f /tmp/mle.yml up

I could have built a docker image containing SQLcl but I’m not a container junkie. My OS is a virtual machine, SQLcl runs in a Java VM, do I really need to containerize this in between two VMs? Only the database runs in a container because this is how the beta is shipped.

[Update 17-MAR-2019] This is OK only with few round-trips between client and server. The docker-proxy implementation used to connect from the host to the container is not efficient at all, so it is better to install SQLcl in the container itself, or in another linked container.

The creation of the container, at docker-compose up, is long. Be patient. This is how database docker images work: they have to create the database when creating the container.

The reason is that a database stores persistent data, and can therefore be created only once the container has started. And an Oracle database is a big beast to create. Here, I put the database in /var/tmp and the ‘myproject’ directory is mapped to /tmp.

The MLE beta is based on the official Docker image build scripts provided by Oracle. Everything is fine when you see “DATABASE IS READY TO USE”.

Let’s check with the password I defined in the docker-compose file, and create a DEMO user in the pluggable database:

/tmp/sqlcl/bin/sql sys/oracle@//localhost/pdb1 as sysdba
connect sys/oracle@//localhost/pdb1 as sysdba
grant dba to demo identified by demo;
connect demo/demo@//localhost/pdb1

The documentation of the MLE beta is at: https://oracle.github.io/oracle-db-mle/docker. As I’m doing this on the 14th of March, which is known as PI-Day in the countries which use the MM/DD format, I’ll create a PI function which can round π to a given number of decimals:

connect demo/demo@//localhost/pdb1
create javascript source named "test1.js" as
module.exports.PI=function (d) {
return Math.round(10**d*Math.PI)/10**d;
}
/
create function PI(decimals in number)
return number as language javascript
name 'test1\.js.PI(decimals number) return number';
/
select PI(2) from dual;

In the previous beta, we had to use an external utility (dbjs) to load the JavaScript code. It is now much easier with the CREATE SOURCE statement.

I’m now ready with an MLE environment for further tests…

Migrating DB2 Databases to Azure

Yep, still doing a lot of database migrations. Just too many people wanting to migrate their other database platforms over to Azure…

I have two customers that have DB2 databases and I know how overwhelming it can be to take on a project like this, so I thought I would go over the high-level steps of the project to demonstrate that it’s a lot easier than many may first believe. The SQL Server Migration Assistant is your friend and can take a lot of the hardship out of migration projects.

The overall steps to the migration are as follows:


Questions

The first step is to ask the right questions coming into a project to determine the complexity of the migration. DB2 has a number of unique scenarios that you won’t find in other database platforms, such as PL/SQL integration to attract Oracle customers and a large set of supported operating systems. These are an essential part of the initial questions that must be asked to prepare for the project.

The following is a guideline for the questions I would start with (but not be limited to) in the first round of DB2 migration questions:

  1. Is the desire to migrate the databases to Azure VMs, retaining the existing databases and simply bringing over the licenses?

If not…

  1. What version of DB2, (or Db2, post-2017 branding) are the databases that will be migrated to Azure?
  2. What OS does DB2 run on, including Mainframe, LUW, (Linux/Unix/Windows) or i/z?
  3. If the database is post 9.7, is there PL/SQL that’s been introduced to the code base?
  4. How much data will be migrated, (as in size and # of schemas)?
  5. Any unique, advanced data types?
  6. Is there any SQL or database code built into the application tier, (or outside of the database, all outside what the migration assistant can access)?

 

Translations

As with any database platform migration, there are terms that must be translated, and you also need to be aware of any terms that may have different definitions between the platforms. The term “instance”, for example, has a different meaning in DB2 than it does in SQL Server/Azure. Where a SQL Server instance is the installation of the SQL Server “bin” files, the DB2 instance is the default parent database engine, named DB2 by default. Identifying any naming conventions that may have been used differently or confused when migrating is important to prevent mistakes in communication and in the tasks assigned.

Some important terms to know in DB2 are:

Routines: collection of stored procedures and functions that run inside a DB2 database.

PL/SQL Collection: Specific SQL language elements built into DB2 to attract Oracle customers. Originally built into Oracle as its procedural language extension of SQL.

Connection Concentrator: An agent-based connector, similar to a listener, but able to scale and build out thousands of web/client connections from applications to the database while only requiring a few physical threads to the actual database.

Materialized Query Tables: This is a reporting table that is based on a query in its creation statement. They are often used in data mart and warehouse databases.

Multi-dimensional Clustering Tables: The ability to cluster a table along multiple dimensions, creating better performance upon querying, especially those queries that commonly use the same columns in the where clause.

Preparation

You’ll need the following software to perform the migration in most scenarios, (outside of a unique connector if you want to perform a pull from the SQL Server, etc.):

  • SQL Server Migration Assistant, (SSMA)
  • DB2 Extension Pack for SSMA
  • OLEDB driver for SQL Server

 

High Level Steps:

The next steps are what happens during the migration. It’s a two-step process: first migrate the metadata and schemas and verify that all data type conversions are done successfully; once you approve, proceed to the second step of migrating the data.

DDL, Schema and Object Migration

  • Install the OLEDB driver, SSMA and DB2 Extension Pack on the destination server.
  • Connect the SSMA to the DB2 server and database.
  • Create a new migration project in SSMA
  • Run an assessment report on the migration of the DDL and schemas.
  • Identify any failures in the DDL, data type conversions or object migration estimates.
  • Identify each schema and map to a new user database, (or if SQL schema design desired, follow steps for creation.)
  • Once resolved, rerun the report until no errors are present or a manual workaround is documented, (actually a rare requirement, see best practices section of post).
  • Run SSMA migration of DDL and objects from DB2 to Azure.
  • Review log to verify all successful.

 

Data Migration

  • In same project as you ran the DDL migration, click on migrate data.
  • The SSMA will identify each schema and map the data to each user database or to unique schemas, depending on configuration.
  • The SSMA will perform another assessment report of the data migration for review.
  • If satisfied, (as in all data will migrate successfully with the data type format, etc.) then run the data migration.
  • You have the option to remove any tables not deemed as good candidates for the migration assistant to move over, (see Tips below).
  • The SSMA will migrate the data, table by table into the new database(s).
  • Review the migration log after completion.


Tips for DB2 Database Migrations to Azure

  • Review the assessment report fully for any issues.
  • Correct issues before bringing data over.
  • The migration assistant will re-attempt a table migration three times before it fails. If it does have to re-attempt, you should see more than one table in the new database. The naming convention is to follow the original table name with a ‘$’ and a number, so you will note those that had to be re-attempted by seeing a $1, $2, etc.
  • If you have a very large table and don’t want the migration assistant to move the data, you can remove it from the list to be migrated and, post-migration, use a DB2 export to dump the data to a flat file and then use the Bulk Copy Program, (BCP) to load the data in. BCP is a fast, effective command-line process to load data into a SQL Server database, (other processes to do so are available, too.)




Oracle stored procedure compilation errors displayed for humans

Here is a script I use a lot, especially when importing a schema with Data Pump and checking for invalid objects. I usually don’t care about compilation errors at compile time but just run UTL_RECOMP.RECOMP_PARALLEL at the end and check for errors on invalid objects. Here is an example.
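
The recompile-and-check step is nothing fancy. Roughly the following, where the schema name is just a placeholder:

exec utl_recomp.recomp_parallel(4, 'MY_SCHEMA')

select owner, object_type, object_name
from   dba_objects
where  owner = 'MY_SCHEMA'
and    status = 'INVALID'
order by owner, object_type, object_name;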

I have imported a schema with Data Pump and got some compilation errors:

I want to resolve them, or at least to understand them.

If I query DBA_ERRORS, I get the following:
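
(The query behind that is basically a straight select from DBA_ERRORS, something like the sketch below; the raw output is exactly what the rest of this post improves on.)

select name, type, line, position, text
from   dba_errors
where  owner like '%'
order by owner, name, type, sequence;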

This is a small example, but the list can be huge, and it is not very helpful:

  • I have some error messages like “PL/SQL: ORA-00942: table or view does not exist” but with no mention of which table. I have to go to the source, with the line and position, to find out.
  • I have some other errors which mention an object that is invalid, but the reason may be a consequence of the previous error, so I want to see only the previous one.

This is why I use the following query, which filters only the top-level errors, looking at DBA_DEPENDENCIES to get the first ones, and shows the offending line from DBA_SOURCE in order to give a better idea.

Here is my output for the same errors:

I can see clearly that I have only one object in error, which has a SELECT statement ignored because the table ELEMENT_INFO is not visible. No need to dig into the code; I just have to check why this table is not there.

Here is the script, just mention the schema name instead of ‘%’:

with
        schemas as (
                select username owner
                from   dba_users
                where  username like '%'
        ),
        errors as (
                select * from dba_errors natural join schemas
        ),
        top_errors as (
                -- only top errors in dependency
                select * from errors
                where (owner,name,type) not in (
                        select owner,name,type
                        from   dba_dependencies
                        where  (referenced_owner,referenced_name,referenced_type)
                               in (select owner,name,type from errors)
                )
        )
-- here is the select to join with dba_source
select /*+ first_rows(1) */ decode(n,-1,'* ',' ')||text text from (
        -- name/type of the object in error
        select distinct -1 n,owner,name,type,line
               ,type||' '||owner||'.'||name||' line '||line text
        from   top_errors
        union all
        -- line with error (from dba source)
        select 0 n,owner,name,type,line,text
        from   dba_source
        where  (owner,name,type,line) in (select owner,name,type,line from top_errors)
        union all
        -- error message with indication of the position in the line
        select sequence n,owner,name,type,line
               ,lpad(' ',position-1,' ')||'^'||text text
        from   top_errors
) order by owner,name,type,line,n;

I’ll be happy to have your feedback on Twitter: https://twitter.com/FranckPachot/status/1106246663096713222

Announcement: “Oracle Performance Diagnostics and Tuning” Seminar – Australia/NZ Winter Dates

I’m very excited to announce the first public running of my new “Oracle Performance Diagnostics and Tuning” Seminar throughout Australia and New Zealand this coming winter 2019. (See my Oracle Performance Diagnostics and Tuning Seminar page for all the seminar content and details). This is a must attend seminar aimed at Oracle professionals (both DBAs […]

Hash Partitions

Here’s an important thought if you’ve got any large tables which are purely hash partitioned. As a general guideline you should not need partition level stats on those tables. The principle of hash partitioned tables is that the rows are distributed uniformly and randomly based on the hash key so, with the assumption that the number of different hash keys is “large” compared to the number of partitions, any one partition should look the same as any other partition.

Consider, as a thought experiment (and as a warning), a table of product_deliveries which is hash partitioned by product_id with ca. 65,000 distinct products that have been hashed across 64 partitions. (Hash partitioning should always use a power of 2 for the partition count if you want the number of rows per partition to be roughly the same across all partitions – if you don’t pick a power of two then some of the partitions will be roughly twice the size of others.)
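
For concreteness, here is a minimal sketch of the sort of table I have in mind; the column list is invented purely for illustration:

create table product_deliveries (
        product_id              number          not null,
        delivery_id             number          not null,
        delivery_location       varchar2(40)    not null,
        delivery_date           date            not null
)
partition by hash (product_id)
partitions 64
;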

Consider a query for “deliveries to Basingstoke” – in the absence of a histogram on the delivery location the optimizer will produce a cardinality estimate that is:

  • total rows in table / number of distinct delivery locations in table

Now consider a query for: “deliveries of product X to Basingstoke” – again in the absence of histograms. The optimizer could have two ways of calculating this cardinality:

  • total rows in table / (number of distinct products in table * number of distinct delivery locations in table)
  • total rows in relevant partition / (number of distinct products in relevant partition * number of distinct delivery locations in relevant partition)

But given the intent of hash partitioning to distribute data evenly we can make three further observations:

  1. the number of rows in any one partition should be very similar to the number of rows in the table divided by the number of partitions
  2. the number of distinct products in any one partition should be very similar to the number of products in the table divided by the number of partitions
  3. the number of distinct locations in any one partition should be very similar to the number of distinct locations in the whole table.

The second observation holds because product is the partition key, the third holds because location is not the partition key.

So we can rewrite the second, partition-oriented, formula as:

  • (total rows in table / number of partitions) / ((number of distinct products in table / number of partitions) * number of distinct locations in table)

which, re-arranging parentheses and cancelling common factors, reduces to:

  • total rows in table / (number of distinct products in table * number of distinct locations in table)

which matches the first formula. (Q.E.D.) In the absence of any statistics on hash partitions the optimizer can (ought to be able to) produce reasonable cardinality estimates based purely on table-level stats.
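
To put some purely invented numbers on that: with 64,000,000 rows, 65,000 distinct products, 200 distinct delivery locations and 64 partitions, the table-level arithmetic gives 64,000,000 / (65,000 * 200) ≈ 4.9 rows, while the partition-level arithmetic gives 1,000,000 / ((65,000 / 64) * 200) ≈ 4.9 rows: the same estimate.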

In fact, if you look back into the history of partitioning, this observation is implicit in the early days of composite partitioning when the only option was range/hash composite partitioning – the optimizer never used sub-partition stats to calculate costs or cardinality; it used only partition-level statistics. (And it was several years before the optimizer caught up to the fact that (e.g.) range/list composite partitioning might actually need to do arithmetic based on subpartition stats.)

I did say that the example was also a warning. Hash partitioning is “expected” to have a large number of distinct key values compared to the number of partitions. (If you don’t meet this requirement then possibly you should be using list partitioning.) There’s also a “uniformity” assumption built into the arithmetic (both the basic arithmetic and the hand-waving discussion I produced above). Just imagine that your company supplies a handful of products that for some strange reason are incredibly popular in Basingstoke. If this is the case then the assumption that “all partitions look alike” is weakened, and you would have to consider the possibility that the variation would require a workaround to address the poor cardinality estimates it might produce.

A pattern of this type has two generic effects on the optimizer, of course. First is the simple skew in the data – to have a significant impact the number of rows for the problem products would have to be much larger than average, which suggests the need for a suitably crafted histogram; secondly there’s an implied correlation between a few products and Basingstoke, so you might even end up creating a column group and manually coding a histogram on it to capture the correlation.

 

dbca now makes it easy to configure OMF on file systems

Up until – and including – Oracle 12.1 I always found it not-quite-so-straightforward to create a new database using the Database Configuration Assistant (dbca) and configure it with Oracle Managed Files (OMF) on a file system in silent mode. I really like to use OMF in my lab databases as it saves me a lot of typing. I have also seen Oracle databases deployed in the cloud on file systems without ASM. So I was quite happy to see that the syntax for dbca -silent -createDatabase was extended.

This post has been written using Oracle 18.4.0 on Linux.

Why might you want to do that?

When writing code to automatically provision databases you can’t have any interactive parts if you want an end-to-end provisioning workflow. Oracle thankfully provides options to call many GUI tools on the command line in so-called silent mode as well. This includes, but is not limited to, dbca. You can invoke the tool with a help flag, and it’ll tell you about its many options. They are so numerous that I refrain from showing them here, but you can get them from the Administration Guide.

And how does it work?

Using the minimum number of relevant options I can easily create a database using dbca and tell it to use OMF, pointing to a file system destination instead of ASM. Your choice of flags is most likely quite different from mine. This one worked for me in the lab:

[oracle@server1 ~]$ dbca -silent -createDatabase -gdbName NCDB -templateName lab_database.dbc \
-enableArchive true -archiveLogMode auto -memoryMgmtType auto_sga \
-createAsContainerDatabase false -recoveryAreaDestination /u01/fast_recovery_area \
-useOMF true -datafileDestination /u01/oradata -totalMemory 2048
[…]
Prepare for db operation
10% complete
Copying database files
40% complete
Creating and starting Oracle instance
42% complete
46% complete
50% complete
54% complete
60% complete
Completing Database Creation
66% complete
69% complete
70% complete
Executing Post Configuration Actions
100% complete
Database creation complete. For details check the logfiles at:
/u01/app/oracle/cfgtoollogs/dbca/NCDB.
Database Information:
Global Database Name:NCDB
System Identifier(SID):NCDB
Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/NCDB/NCDB.log" for further details.
[oracle@server1 ~]$

You might have noticed the options to enable archivelog-mode as well. I’ll write about these in the next post :) Oh and the database is really created using OMF:

SQL> select name from v$tempfile union all
2 select name from v$datafile union all
3 select member from v$logfile union all
4 select value from v$parameter2 where name = 'control_files';

NAME
------------------------------------------------------------
/u01/oradata/NCDB/datafile/o1_mf_temp_g3gt1t7w_.tmp
/u01/oradata/NCDB/datafile/o1_mf_system_g3gsyght_.dbf
/u01/oradata/NCDB/datafile/o1_mf_sysaux_g3gszkmp_.dbf
/u01/oradata/NCDB/datafile/o1_mf_undotbs1_g3gt0bq9_.dbf
/u01/oradata/NCDB/datafile/o1_mf_users_g3gt0ct3_.dbf
/u01/oradata/NCDB/onlinelog/o1_mf_3_g3gt1lpw_.log
/u01/fast_recovery_area/NCDB/onlinelog/o1_mf_3_g3gt1mp8_.log
/u01/oradata/NCDB/onlinelog/o1_mf_2_g3gt1lp4_.log
/u01/fast_recovery_area/NCDB/onlinelog/o1_mf_2_g3gt1mqt_.log
/u01/oradata/NCDB/onlinelog/o1_mf_1_g3gt1los_.log
/u01/fast_recovery_area/NCDB/onlinelog/o1_mf_1_g3gt1mq0_.log
/u01/oradata/NCDB/controlfile/o1_mf_g3gt1j3w_.ctl
/u01/fast_recovery_area/NCDB/controlfile/o1_mf_g3gt1j5o_.ctl

Final thoughts

Until now it’s been quite difficult (at least for me) to enable OMF on a file system when using dbca in CLI mode. This is no longer the case.