Top 60 Oracle Blogs


September 2011

Interview with Marco Gralike

As a placeholder, here is a small post about my interview on Oracle XMLDB functionality and its use cases, published in the September issue of the Dutch Oracle magazine “Optimize”.

Source: AMIS newsletter (Interested? Sign up here)

AMIS in the press – Marco Gralike

Upgrade Argh

Time for another of those little surprises that catch you out after the upgrade.
Take a look at this “Top N” from a standard AWR report, from an instance running

Top 5 Timed Foreground Events
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
direct path read                  3,464,056       6,593      2   33.5 User I/O
DB CPU                                            3,503          17.8
db flash cache single block ph    2,293,604       3,008      1   15.3 User I/O
db file sequential read             200,779       2,294     11   11.6 User I/O
enq: TC - contention                     82       1,571  19158    8.0 Other

presentation October 27 2011

For all of those who aren’t tired of listening to me yet there is good news: I am presenting a webinar on October 27th 2011. It will most likely be around 17:00 UK time, as the meetings start at 09:00 PST. I agreed with the committee that we have covered a lot of nitty-gritty detail down to the very low level, and should probably do a more high-level overview presentation as well. As it happens, I am starting my seminars with exactly that!

For your convenience the abstract and summary are shown below. I hope to see you online.

An Introduction to Oracle High Availability

This introductory level session aims at providing an overview of Oracle High Availability options to users of traditional single instance Oracle deployments who are interested in ways to make their database more highly available.

I’m No Longer An Oracle ACE But Even I Know This: SPARC SuperCluster Will “Redefine Information Technology.” Forget Best Of Breed (Intel, EMC, VMware, Etc).

Before Oracle recruited me in 2007, to be the Performance Architect in the Exadata development organization, I was an Oracle ACE. As soon as I got my Oracle employee badge I was surprised to find out that I was removed from the rolls of the Oracle ACE program. As it turned out, Oracle employees could not hold Oracle ACE status. Shortly thereafter, the ACE program folks created the Oracle Employee ACE designation and I was put into that status. In March 2011 I resigned from my role in Exadata development to take on the challenge of Performance Architect in the Data Computing Division of EMC focusing on the Data Computing Appliance and Greenplum Database.

Oracle Expertise Within EMC
Knowing a bit about Oracle means that I’m involved in Oracle-related matters in EMC. That should not come as a surprise since there are more blocks of Oracle data stored on EMC storage devices than any other enterprise-class storage. So, while I no longer focus on Exadata I remain very involved in Oracle Database matters in EMC—in at least an oblique fashion. So you say, “Remind me what this has to do with SPARC SuperCluster.” Please, read on.

So, my status in the Oracle ACE program has gone from non-ACE to ACE to non-ACE to ACE to non-ACE. It turns out that readers of this blog have noticed that fact. Just two weeks ago I received an email from a reader with the following quote:

Kevin, I read your blog for many years. I really like learning about system and storage topics and Oracle. You are not an Oracle ACE so I want you to remove the logo from you (sic) front page

I responded in agreement to the reader and am about to remove the Oracle ACE logo from the front page. She is right and I certainly don’t want to misrepresent myself.

Ace Director
Some of my fellow OakTable Network members started the paperwork to refer me into ACE Director status. They needed me to supply some information for the form but before I filled it out I read the ACE Director requirements. As ACE Director I would be required to speak at a certain number of conferences, or other public forums, covering material that helps Oracle customers be more successful with Oracle products. I gave that some thought. I certainly have no problems doing that—and indeed, I have done that and continue to do that. But, Oracle has acquired so many companies that no matter where I decided to go after leaving Oracle I couldn’t avoid working for a company that Oracle views as competition. To put it another way, Oracle views everyone in the enterprise technology sector as competition and everyone in return views Oracle as co-opetition or competition.

In my assessment, Oracle’s acquisitions have moved the company into a co-opetitive posture where companies like EMC are concerned. EMC and Oracle share customers. Conversely, EMC shares customers with all of Oracle’s software competitors as well. That’s the nature of industry consolidation. What does this have to do with the ACE program? Well, my current role in EMC will not be lending itself to many public speaking opportunities, at least not in the foreseeable future. For that, and a couple of other reasons, I decided not to move forward with the ACE Director nomination put in motion by my fellow OakTable cadre. And, no, I haven’t forgotten that this post is about SPARC SuperCluster goodness.

Oracle dominates the database market today. That is a fact. Oracle got to that position because choosing Oracle meant no risk of hardware lock-in. Remember “Open Systems?” Oracle was ported and optimized for a mind-boggling number of hardware/operating system platforms. I was a part of that for 10 years in my role within Sequent Computer Systems’ database engineering group.

This is all fresh in mind because I had dinner with one of the Vice Presidents in Oracle Server Technology just three nights ago. We’ve been friends for many years–about 15 or so if I recall. When we get together we enjoy discussing what’s going on in the IT industry today while wincing over the fact that the industry in general seems to enjoy “re-discovery” of how to solve problems that we already solved at least once over the period of our relationship. That’s just called getting old in a fast-paced industry.

So, while I’m no longer in the Oracle ACE program I can still enjoy putting aside my day job as co-opetitor-at-large (my role at EMC) and enjoy the company of friends—especially with those of us who, in one way or another, helped Oracle become the dominant force in open systems database technology.

Your Best Interest In Mind: SPARC?
With the topics from my dinner three nights ago in mind, and my clean-slate feeling regarding my status in the Oracle ACE program, I sit here scratching my head and pondering current IT industry events. Consider the meltdown of Hewlett-Packard (I could have wiped out 50% of HP’s market cap for less than 25 million dollars and I speak a good bit of Deutsch to boot), Larry-versus-Larry, Oracle’s confusion over the fact that Exadata is in fact commodity x86 servers, and how, on September 26 2011, we get the privilege of hearing how a has-been processor architecture (SPARC) in the latest SuperCluster offering is going to “redefine the IT industry.”

Redefine the IT industry? Really? Sounds more like open systems lock-in to me.

I personally think cloud computing is more likely to redefine the IT industry than some SPARC-flavored goodies. That point of view, as it turns out, is just another case where a non-Oracle ACE co-opetitor like me disagrees with Oracle executives. Indeed, could the pro-cloud viewpoint I share with EMC and VMware be any different from that of Oracle corporation’s leadership? Does anyone remember the following quote regarding Oracle Corporation’s view of the cloud?

What is it? It’s complete gibberish. It’s insane. When is this idiocy going to stop?

We’ll make cloud computing announcements. I’m not going to fight this thing. But I don’t understand what we would do differently in the light of cloud.

Don’t understand what to do in light of cloud computing? Is that a mystery? No, it’s called DBaaS and vFabric Data Director is most certainly not just one of those me-too “cloud computing announcements” alluded to in the quote above.

Life Is A Series Of Choices
You (IT shops) can choose to pursue cloud computing. You can choose x86 or SPARC stuff. You can choose to fulfill your x86 server sourcing requirements from a vendor committed to x86 or not.  You can fulfill your block and file storage requirements with products from a best of breed neutral storage vendor or not.  And, finally, you can choose to read this blog whether or not I hold Oracle ACE program status. I’d prefer you choose the former rather than the latter.

By the way, Oracle announced the SuperCluster about 9 months ago:

I lost my Oracle ACE designation because I became an Oracle employee, SPARC Supercluster isn’t going to redefine anything and I still remember the real definition of “Open Systems.” I also know, all too well, what the term co-opetition means.


Critical Skills for Performance Work

I was just watching John Rauser’s keynote “What is a Career in Big Data?” from last week’s Strata Conference New York and I have to say it’s an amazing talk. I would highly recommend it to anyone who does any type of data analysis, including any type of performance analysis.

I found many of the “critical skill” points John made to have a strong correlation to performance analysis work as well. Some quotations that really stand out to me:

On writing:

“[writing]…it’s the first major difference between mediocrity and greatness.” [10:39]

“If it isn’t written down, it never happened…if your writing is so opaque that people cannot understand your work, then you may as well never have done it.” [10:50]

On skepticism:

Rebuilding Indexes and the Clustering Factor Solution (Move On)

Excellent !! This quiz created quite a bit of debate and it was nice to sit back and read some interesting discussions. The Clustering Factor is basically the measurement of how well aligned the data in the underlying table is in relation to the index and is the number of table related I/Os required to read the entire table via a [...]

CanWIT Panel — CIO or CTO? The Path to Next Generation Technology Leadership

Last Thursday I was invited to the panel organized by the Ottawa Chapter of Canadian Women In Technology (CanWIT). I wanted to mention it here as CanWIT sets up very interesting events for women in IT, so if you are interested in progressing your IT career, definitely consider their events. The panel was designed to share [...]

Ksplice in action

On July 21, 2011 Oracle announced that it has acquired Ksplice. With Ksplice, users can update the Linux kernel while it is running, so without a reboot or any other disruption. As of September 15, 2011 Ksplice is available, at no additional charge, to new and existing Oracle Premier Support customers on the Unbreakable Linux Network […]

Installing Grid Infrastructure on Oracle Linux 6.1 with kernel UEK

Installing Grid Infrastructure on Oracle Linux 6.1

Yesterday was the big day, or the day Oracle release for Linux x86 and x86-64. Time to download and experiment! The following assumes you have already configured RAC 11g Release 2 before; it is not a step-by-step guide on how to do this. I expect those to shoot out of the grass like mushrooms in the next few days, especially since the weekend allows people to do the same as I did!

The Operating System

I have prepared a xen domU for, using the latest Oracle Linux 6.1 build I could find. In summary, I am using the following settings:

  • Oracle Linux 6.1 64-bit
  • Oracle Linux Server-uek (2.6.32-100.34.1.el6uek.x86_64)
  • Initially installed to use the “database server” package group
  • 3 NICs – 2 for the HAIP resource and the private interconnect with IP addresses in the ranges of and The public NIC is on
    • Node 1 uses 192.168.(99|100|101).129 for eth0, eth1 and eth2. The VIP uses
    • Node 2 uses 192.168.(99|100|101).131 for eth0, eth1 and eth2. The VIP uses
    • The SCAN is on 192.168.99.(133|134|135)
    • All naming resolution is done via my dom0 bind9 server
  • I am using an 8GB virtual disk for the operating system, and a 20G LUN for the Oracle Grid and RDBMS homes. The 20G are subdivided into 2 logical volumes of 10G each, mounted to /u01/app/oracle and /u01/crs/ (note that you now seem to need 7.5G for GRID_HOME)
  • All software is owned by the oracle user
  • Shared storage is provided by the xen blktap driver
    • 3 x 1G LUNs for +OCR containing OCR and voting disks
    • 1 x 10G for +DATA
    • 1 x 10G for +RECO

Configuring Oracle Linux 6.1

Installation of the operating environment is beyond the scope of this article, and it hasn’t really changed much since 5.x. All I did was to install the database server package group. I wrote this article for fans of xen-based para-virtualisation. Although initially for 6.0, it applies equally for 6.1. Here’s the xen native domU description (you can easily convert that to xenstore format using libvirt):

# cat node1.cfg
extra=" "
bootloader = "pygrub"

Use the “xm create node1.cfg” command to start the domU. After the OS was ready I installed the following additional software to satisfy the installation requirements:

  • compat-libcap1
  • compat-libstdc++-33
  • libstdc++-devel
  • gcc-c++
  • ksh
  • libaio-devel

This is easiest done via yum and the public YUM server Oracle provides. It also has instructions on how to set your repository up.

# yum install compat-libcap1 compat-libstdc++-33 libstdc++-devel gcc-c++ ksh libaio-devel

On the first node only I wanted a VNC-like interface for a graphical installation. The older package vnc-server I loved from 5.x isn’t available anymore; the package you need is now called tigervnc-server. It also requires a new viewer to be downloaded from sourceforge. On the first node you might want to install these, unless you are brave enough to use a silent installation:

  • xorg-x11-utils
  • xorg-x11-server-utils
  • twm
  • tigervnc-server
  • xterm

Ensure that SELinux and the IPTables packages are turned off. SELinux is still configured in /etc/sysconfig/selinux, where the setting has to be permissive at least. You can use “chkconfig iptables off” to disable the firewall service at boot. Check that there are no filter rules using “iptables -L”.
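The SELinux part of this check can be scripted. What follows is only a minimal sketch and not part of my original setup: the function names selinux_mode and selinux_ok are hypothetical, and all they do is read the SELINUX= line from a config file such as /etc/sysconfig/selinux.

```shell
#!/bin/sh
# Hypothetical helper: print the SELINUX= value from a config file
# (for example /etc/sysconfig/selinux). Comment lines and the
# SELINUXTYPE= line do not match the pattern and are skipped.
selinux_mode() {
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$1" | tail -1
}

# Hypothetical helper: succeed only if the configured mode is
# permissive or disabled, which is what the installation needs.
selinux_ok() {
    case "$(selinux_mode "$1")" in
        permissive|disabled) return 0 ;;
        *) return 1 ;;
    esac
}
```

A call such as selinux_ok /etc/sysconfig/selinux || echo "SELinux is still enforcing" would flag a node that still needs the change. The firewall side is already covered by the chkconfig and iptables -L commands mentioned above.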

I created the oracle account using these usual steps; this hasn’t changed since

A few changes to /etc/sysctl.conf were needed; you can copy and paste the below example and append it to your existing settings. Be sure to raise the limits where you have more resources!

kernel.shmall = 4294967296
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.ipv4.conf.eth1.rp_filter = 0
net.ipv4.conf.eth2.rp_filter = 0

Also ensure that you change the rp_filter for your private interconnect to 0 (or 2); my devices are eth1 and eth2. This is a new requirement for reverse path filtering introduced with
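To double check that the settings made it into the file, something along these lines could be used. It is a sketch under assumptions: the helper names sysctl_file_value and rp_filter_ok are made up for illustration, and the functions only parse a sysctl-style file, they do not query the running kernel.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key from a sysctl-style file,
# with surrounding blanks trimmed and runs of blanks collapsed.
sysctl_file_value() {
    # $1 = file, $2 = key (for example net.ipv4.conf.eth1.rp_filter)
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                        # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)            # key without blanks
            if (k == key) {
                v = $2
                gsub(/[ \t]+/, " ", v)              # collapse runs of blanks
                sub(/^ /, "", v); sub(/ $/, "", v)  # trim both ends
                last = v                            # the last occurrence wins
            }
        }
        END { if (last != "") print last }
    ' "$1"
}

# Hypothetical helper: succeed when the interface has rp_filter set to 0 or 2.
rp_filter_ok() {
    # $1 = file, $2 = interface name
    case "$(sysctl_file_value "$1" "net.ipv4.conf.$2.rp_filter")" in
        0|2) return 0 ;;
        *) return 1 ;;
    esac
}
```

For my devices that would be rp_filter_ok /etc/sysctl.conf eth1 and rp_filter_ok /etc/sysctl.conf eth2. An interface that is missing from the file counts as a failure, which seemed the safer default.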

ASM “disks” must be owned by the GRID owner. The easiest way to change the permissions of the ASM disks is to create a new set of udev rules, such as the following:

# cat 61-asm.rules
KERNEL=="xvd[cdefg]1", OWNER="oracle", GROUP="asmdba", MODE="0660"

After a quick “start_udev” as root these were applied.
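Whether the rule really took effect can be verified per device. The snippet below is a sketch only: has_owner_mode is a hypothetical helper name, and it relies on GNU stat with the -c option as found on Oracle Linux.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a path has the expected owner, group and mode.
# Relies on GNU stat (-c); %U is the owner name, %G the group name and
# %a the access rights in octal.
has_owner_mode() {
    # $1 = path, $2 = owner, $3 = group, $4 = octal mode such as 660
    [ "$(stat -c '%U:%G:%a' "$1" 2>/dev/null)" = "$2:$3:$4" ]
}
```

With the rule above, a loop over /dev/xvd[cdefg]1 calling has_owner_mode "$d" oracle asmdba 660 should succeed for every partition. Re-running it after kfod has executed is a cheap way to spot ownership being reset.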

Note that as per my domU config file I actually know the device names are persistent, so it was easy to come up with this solution. In real life you would use the dm-multipath package, which now allows setting the owner, group and permissions in /etc/multipath.conf for every ASM LUN.

There was an interesting problem initially in that kfod seemed to trigger a change of permissions back to root:disk whenever it ran. Changing the ownership back to oracle only lasted until the next execution of kfod. The only fix I could come up with involved the udev rules.

Good news for those who suffered from the multicast problem introduced in now knows about it and checks during the post hwos stage (I had already installed cvuqdisk):

[oracle@rac11203node1 grid]$ ./ stage -post hwos -n rac11203node1

Performing post-checks for hardware and operating system setup

Checking node reachability...
Node reachability check passed from node "rac11203node1"

Checking user equivalence...
User equivalence check passed for user "oracle"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Node connectivity passed for subnet "" with node(s) rac11203node1
TCP connectivity check passed for subnet ""

Node connectivity passed for subnet "" with node(s) rac11203node1
TCP connectivity check passed for subnet ""

Node connectivity passed for subnet "" with node(s) rac11203node1
TCP connectivity check passed for subnet ""

Interfaces found on subnet "" that are likely candidates for VIP are:
rac11203node1 eth0:

Interfaces found on subnet "" that are likely candidates for a private interconnect are:
rac11203node1 eth1:

Interfaces found on subnet "" that are likely candidates for a private interconnect are:
rac11203node1 eth2:

Node connectivity check passed

Checking multicast communication...

Checking subnet "" for multicast communication with multicast group ""...
Check of subnet "" for multicast communication with multicast group "" passed.

Checking subnet "" for multicast communication with multicast group ""...
Check of subnet "" for multicast communication with multicast group "" passed.

Checking subnet "" for multicast communication with multicast group ""...
Check of subnet "" for multicast communication with multicast group "" passed.

Check of multicast communication passed.
Check for multiple users with UID value 0 passed
Time zone consistency check passed

Checking shared storage accessibility...

Disk                                  Sharing Nodes (1 in count)
------------------------------------  ------------------------
/dev/xvda                             rac11203node1
/dev/xvdb                             rac11203node1
/dev/xvdc                             rac11203node1
/dev/xvdd                             rac11203node1
/dev/xvde                             rac11203node1
/dev/xvdf                             rac11203node1
/dev/xvdg                             rac11203node1

Shared storage check was successful on nodes "rac11203node1"

Post-check for hardware and operating system setup was successful.

As always, I tried to fix as many problems before invoking runInstaller as possible. The “-fixup” option to runcluvfy is again very useful. I strongly recommend running the fixup script prior to executing the OUI binary.

The old trick of removing /etc/ntp.conf causes the NTP check to complete OK, in which case you get the ctssd service for time synchronisation. You should not do this in production: consistent times in the cluster are paramount!

I encountered an issue with the check for free space later in the installation during my first attempts. OUI wants 7.5G for GRID_HOME, even though the installation “only” took around 3G in the end. I exported TMP and TEMP to point to my 10G mount point to avoid this warning:

$ export TEMP=/u01/crs/temp
$ export TMP=/u01/crs/temp
$ ./runInstaller
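The free space behind TMP can also be checked up front, before the installer complains. A minimal sketch, assuming a POSIX df: the function name has_free_kb is hypothetical, and I am reading the 7.5G as 7.5 x 1024 x 1024 KB = 7864320 KB.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file system holding a directory has at
# least the requested number of kilobytes available.
has_free_kb() {
    # $1 = directory, $2 = required space in KB
    # df -P gives the portable one-line-per-file-system layout, -k reports KB;
    # column 4 of the second line is the available space.
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```

For example, has_free_kb /u01/crs/temp 7864320 || echo "not enough room for OUI" tells you whether the warning is to be expected. The function only looks at the second line of the df output, so a file system name containing spaces would confuse it.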

The installation procedure for Grid Infrastructure is almost exactly the same as for, except for the option to change the AU size for the initial disk group you create:

Once you have completed the wizard, it’s time to hit the “install” button. The magic again happens in the file, or if you are upgrading. I included the output so you have something to compare against:

Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_HOME=  /u01/crs/

Enter the full pathname of the local bin directory: [/usr/local/bin]: Creating y directory...
Copying dbhome to y ...
Copying oraenv to y ...
Copying coraenv to y ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/crs/
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding Clusterware entries to upstart
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac11203node1'
CRS-2676: Start of 'ora.mdnsd' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac11203node1'
CRS-2676: Start of 'ora.gpnpd' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac11203node1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac11203node1'
CRS-2676: Start of 'ora.gipcd' on 'rac11203node1' succeeded
CRS-2676: Start of 'ora.cssdmonitor' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac11203node1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac11203node1'
CRS-2676: Start of 'ora.diskmon' on 'rac11203node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac11203node1' succeeded

ASM created and started successfully.

Disk Group OCR created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4256: Updating the profile
Successful addition of voting disk 1621f2201ab94f32bf613b17f62982b0.
Successful addition of voting disk 337a3f0b8a2d4f7ebff85594e4a8d3cd.
Successful addition of voting disk 3ae328cce2b94f3bbfe37b0948362993.
Successfully replaced voting disk group with +OCR.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   1621f2201ab94f32bf613b17f62982b0 (/dev/xvdc1) [OCR]
2. ONLINE   337a3f0b8a2d4f7ebff85594e4a8d3cd (/dev/xvdd1) [OCR]
3. ONLINE   3ae328cce2b94f3bbfe37b0948362993 (/dev/xvde1) [OCR]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'rac11203node1'
CRS-2676: Start of 'ora.asm' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'rac11203node1'
CRS-2676: Start of 'ora.OCR.dg' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.registry.acfs' on 'rac11203node1'
CRS-2676: Start of 'ora.registry.acfs' on 'rac11203node1' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

That’s it! After returning to the OUI screen you run the remaining assistants and finally are rewarded with the success message:

Better still, I could now log in to SQL*Plus and was rewarded with the new version:

$ sqlplus / as sysasm

SQL*Plus: Release Production on Sat Sep 24 22:29:45 2011

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select * from v$version;

Oracle Database 11g Enterprise Edition Release - 64bit Production
PL/SQL Release - Production
CORE      Production
TNS for Linux: Version - Production
NLSRTL Version - Production



You might remark that in the output there has only ever been one node referenced. That is correct: my lab box has limited resources and I’d like to test the script for each new release, so please be patient! I’m planning an article about upgrading to soon, as well as the addition of a node. One thing I noticed was the abnormally high CPU usage for the CSSD processes (ocssd.bin, cssdagent and cssdmonitor), something I find alarming at the moment.

top - 22:53:19 up  1:57,  5 users,  load average: 5.41, 4.03, 3.77
Tasks: 192 total,   1 running, 191 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.2%sy,  0.0%ni, 99.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4102536k total,  3500784k used,   601752k free,    59792k buffers
Swap:  1048568k total,     4336k used,  1044232k free,  2273908k cached

27646 oracle    RT   0 1607m 119m  53m S 152.0  3.0  48:57.35 /u01/crs/
27634 root      RT   0  954m  93m  55m S 146.0  2.3  31:45.50 /u01/crs/
27613 root      RT   0  888m  91m  55m S 96.6  2.3  5124095h /u01/crs/
28110 oracle    -2   0  485m  14m  12m S  1.3  0.4   0:34.65 asm_vktm_+ASM1
28126 oracle    -2   0  499m  28m  15m S  0.3  0.7   0:04.52 asm_lms0_+ASM1
28411 root      RT   0  500m 144m  59m S  0.3  3.6  5124095h /u01/crs/ -M -d /u01/crs/
32394 oracle    20   0 15020 1300  932 R  0.3  0.0  5124095h top
1 root      20   0 19336 1476 1212 S  0.0  0.0   0:00.41 /sbin/init

... certainly didn’t use that much CPU across 4 cores…

Update: I have just repeated the same installation on VirtualBox 4.1.2 with less potent hardware, and funnily enough the CPU problem has disappeared. How is that possible? I need to understand more, and maybe update the Xen host to something more recent.

Direct I/O for Solaris benchmarking

ZFS doesn’t have direct I/O.
Solaris dd doesn’t have an iflag=direct option.

Thus, I/O benchmarking requires clearing the cache between tests: for UFS by unmounting and remounting the file system, and for ZFS by exporting and re-importing the pools.

But there is a trick. Reading from /dev/rdsk will bypass the cache.

Here is a simple piece of code that will benchmark the disks. The code was put together by George Wilson and Jeff Bonwick (I believe).

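# List all disks (names of the form cXtXdX) known to format.
disks=`format < /dev/null | grep c.t.d | nawk '{print $2}'`
# Read 64 MB (1024 x 64k) from the raw device and print the throughput in MB/sec.
getspeed1() {
       ptime dd if=/dev/rdsk/${1}s0 of=/dev/null bs=64k count=1024 2>&1 |
           nawk '$1 == "real" { printf("%.0f\n", 67.108864 / $2) }'
}
# Run the read three times and report the median result.
getspeed() {
       for iter in 1 2 3; do
               getspeed1 $1
       done | sort -n | tail -2 | head -1
}
for disk in $disks; do
       echo $disk `getspeed $disk` MB/sec
done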
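The arithmetic in the script is worth spelling out: 1024 reads of 64k are 67,108,864 bytes, or 67.108864 MB, so dividing that by the elapsed seconds gives MB/sec, and sort -n | tail -2 | head -1 picks the middle of three runs. The following sketch isolates both steps so they can be tried without touching a disk. The function names mb_per_sec and median3 are mine, and I use plain awk instead of nawk so that it also runs outside Solaris.

```shell
#!/bin/sh
# Hypothetical helper: convert the elapsed time (in seconds) of a 64 MB read
# into MB/sec, rounded to a whole number. 67.108864 is 1024 x 64k in MB.
mb_per_sec() {
    awk -v secs="$1" 'BEGIN { printf("%.0f\n", 67.108864 / secs) }'
}

# Hypothetical helper: print the median of exactly three numbers read from
# stdin, one per line: sort them, keep the last two, take the first of those.
median3() {
    sort -n | tail -2 | head -1
}
```

A read that takes half a second therefore reports 134 MB/sec. Taking the median rather than the mean keeps a single slow or cached outlier from skewing the figure.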