Top 60 Oracle Blogs

Recent comments

February 2011

True Grit…

I watched the original True Grit (1969) film a few times as a kid, so I wasn’t that bothered about True Grit (2010). Then I saw a bunch of reviews saying it was a reworking from the book, rather than a remake of the original film blah, blah, blah, Coen Brothers, blah, blah, blah…

I’m sure if I watched both films side by side I would say there are a lot of differences and the latest version was a better film, but as it is I came away feeling rather underwhelmed. For the new audience I think it will be pretty cool. For those like me who watched the old version multiple times over the years I just don’t see the appeal. As for Oscar nominations, the girl was good, but Jeff Bridges seemed decidedly average to me.

Must remember, remakes are for a new audience, not miserable old gits…



ASMLib and OCFS2 for RHEL6? I don’t think so…

I was just scanning through some stuff on MOS when I came across a couple of RHEL6 tidbits.

  • Doc ID 1089399.1: ”For RHEL6 Oracle will only provide ASMLib software and updates via Unbreakable Linux Network(ULN). Oracle will no longer provide ASMLib packages for Red Hat kernels.”
  • Doc ID 1253272.1: “Starting with RHEL6, Oracle will provide OCFS2 software via ULN only. ULN requires an Oracle Linux support subscription, even for those customers using OCFS2 just to store database files.”

The OCFS2 thing doesn’t phase me. I only need a cluster file system for a few shared directories when I’m doing RAC and using OCFS2 and RAC together is a disaster, so I never use OCFS2 these days.

The ASMLib issue is a bit more interesting because it is still the recommended approach in the documentation. A recent thread on the OakTable mailing list about ASMLib resulted in most replies saying to avoid ASMLib completely and use udev instead. I don’t mind ASMLib myself, but I guess this is another nail in the ASMLib coffin. I can’t see me bothering to use ASMLib again now.



DB Optimizer diagrams 100 tables

People ask me what the limit is to the number of tables  in a SQL statement  that DB Optimizer can handle when tuning.  Above is a tiny thumbnail for a 100 table SQL statemement. The SQL actually only has a select from one view  but that view expands to 100 different tables. Pretty awesome.

Scientific Method

Troubleshooting on an 8 node cluster

It seems I am doing a lot of fixing broken stuff recently. So this time I have been asked to repair a broken 8 node RAC cluster on OEL 5.5 with Oracle RAC The system has been moved into a different, more secure network, and its firewalls prevented all access to the machines except for ILO. Another way of “security through obscurity”. The new network didn’t allow any clients to connect to any of the 8 node RAC which means that it is actually quite expensive kit to sit idle. The cluster is not in production, it’s still being build to specification but this accessibility problem has been a holdup to the project for a little while now. Yesterday has been a breakthrough-the netops team found an error to their configuration and for the first time the hosts could be accessed via ssh. Unfortunately for me that access is possible via audited gateways using PowerBroker to which I don’t have access.An alternative was the ILO interface which has not yet been hardened to production standards. So after some discussion internally I was given the ILO access credentials. This is good and bad: good, because it was a thoroughly broken system, and bad because there is no copy and paste with a java based console. And if that wasn’t bad enough, I had to contend myself with 80×24 characters on the console (however in very big letters). I pretty much needed all of my 24″ screen to display it. But I digress.

When logging on, I found the following situation:

  • Only 1 out of 8 nodes had OHAS/CRSD started. The others were still down, a kernel upgrade has taken place, but the asmlib kernel module hasn’t been upgraded at the same time. The first node had the correct RPM installed and ASMLib has done its magic on this node
  • Clusterware’s lower stack was up. However the and all resources depending on it (listener, scan, scan listener, etc) were down. Not a single byte went over the public interconnect. That was strange.

Running /sbin/ifconfig has been a dream on this machine – I saw all 3 SCAN IPs on it, and all 8 node virtual IP addresses. Plus it has 6 NICs for Oracle, bonded into pairs of 2. And this is exactly where the confusion starts. I found the following bonded interfaces defined:

  • bond0
  • bond1.251
  • bond0.212

It took a while to figure out why these interfaces were named as they were, but apparently the suffix is a VLAN name. It also filtered through that one of my colleagues has tried to replace the previously used bond0.212 with bond0 as the public interconnect. He was however not successful in doing so, leaving the cluster in the state it was in.

He used the following commands to update the public interface:

$ oifcfg getif
bond1.251  global  cluster_interconnect
bond0  10.2xxx8.0  global  public

He also changed the vip configuration, with the result shown here:

srvctl config vip -n node11
VIP exists: /node1-vip/10.2xx8.13/10.2xx8.0/, hosting node node11


The VIP however remained unimpressed:

srvctl start vip -n node1
PRCR-1079 : Failed to start resource
CRS-2674: Start of '' on 'node1' failed
CRS-2632: There are no more servers to try to place resource '' on that would satisfy its placement policy

That’s where I have been asked to cast a keen eye over the installation.

The Investigation

First of all I could find nothing wrong with what has been done so far. So starting my investigation I first thought there was something wrong with the public network so I decided to shut it down:

# ifdown bond0

I then checked the network configuration of /etc/sysconfig/network-scripts. The setting is shown here:


bonding_opts="use_carrier=0 miimon=0 mode=1 arp_interval=10000 primary=eth0"





The MAC addresses of ifcfg-eth* matched the output from the ifconfig command. In the lab I occasionally have the problem that my configurartion files don’t match the real MAC addresses and therefore my NICs don’t come up. But this wasn’t the case here.

I then checked if the kernel module is loaded correctly. Usually you’d find that in /etc/modprobe.conf but there was not entry. I added these lines as per the documentation:

alias bond0 bonding
alias bond1 bonding
alias bond1.251 bonding

With that all done I brought the bond0 interface back up (don’t ever try to bring down the private interconnect-it will cause a node eviction!). Still nothing. The output of crsctl status resource -t remained “OFFLINE” for resource BTW, you cannot manually start that a network resource using srvctl (it’s an ora.* resource so don’t even think about trying crsctl start resource :). All you can do with a network resource is to get its configuration (srvctl config network -k 1…) and modify it (srvctl modify network -k 1…)

ORAROOTAGENT is responsible for starting the network, and it will try to do so every second or so. That’s CRSD’s ORAROOTAGENT by the way, the log file is in $GRID_HOME/log/`hostname -s`/agent/crsd/orarootagent_root/orarootagent_root.log.

After the modification to bond0 I could now ping the IP associated with bond0 so at least that was a success. One thing I learned that day is that the MAC address of the bonded NIC matches the primary eth* interface’s NIC, in my case it was that of eth0, i.e. f4:ce:46:87:fa:d0. If one of the enslaved NICs failed it would probably assume the failback NIC’s MAC address. So in summary:

  • the network bonding was correctly configured
  • I could ping bond0

At this point I could see no reason why starting of the network failed. Maybe a typo in the configuration? The network configuration can be queried with 2 commands: oifcfg and servctl config network. So I tried oifcfg first.oifcfg getif returns:

bond0 10.xx.x2.0           "good"
bond0 10.xx.x8.0           "old/bad"
bind1.251 172.xx.xx.160    interconnect

Hmmm, where’s that second bond0 interface from? The bond1.251 interface is in use and working, the IP matches the IP address assigned in ifcfg-bon1.251. The second entry for bind1.251 is created by the HAIP resource and has to do with the high available cluster interconnect which uses multicasting for communication (to the frustration of many users who upgraded to only to find out that the lower stack doesn’t start on the second and other nodes).

So to be sure that I was seeing something unusual I compared the output with another node on the cluster. There I found I only have 3 interfaces …. bond0 and bond1 + the UDP multicast address. I initially tried to remove the bad network with oifcfg delif but that didn’t work. I then verified the output of srvctl config network to see if it matched what I expected to. And here was a surprise: the output of the network listed a wrong subnet mask. Instead of (note the “254″!) i found That was easy to fix and while I was back again trying to delete the old network using oifcfg I suddenly realised that the cluster has sprung back into life. Small typo-big consequences! Finally all the resources depending on were started, including SCAN VIPs, SCAN listeners, listeners, VIPs…

References for NIC bonding on RHEL5

Where’s my money gone? Update…

Followers of the blog will know I’ve had a little trouble with Oracle Norway randomly taking money off me for no reason.

Today I got the money refunded, but there was a snag. I was refunded less than the amount that was taken. I’m guessing this has something to do with exchange rates etc. So as it stands I am about £40 out of pocket, which is significantly better than the several thousand pounds I was before.

As you can image, I sent an email off this morning asking for the missing money. Let’s see how quickly that is dealt with.



On the Topic of Copyright

February 11, 2011 (Updated February 25, 2011) On August 26, 2010, just before cancelling my first three month old order for the “Oracle Tuning: The Definitive Reference Second Edition” book, I wrote the following in a blog article (side note: the second order for the book is now more than three months old, and I have [...]

Nokia signs its own death warrant?

I’m really not sure what to think about the collaboration between Nokia and Microsoft. Prior to my recent switch to HTC, I’ve always used Nokia phones, so I have a soft spot for the company, but this recent announcement has me in two minds.

My first reaction was this move is a complete disaster for Nokia and a big bonus for Microsoft. Nokia ships a serious amount of phones, so Microsoft will quickly get some impressive numbers, which is great for them, but what do Nokia get out of it? Currently it seems they get a mobile platform that nobody really wants or cares about.

My second reaction was maybe this is the right move for Nokia. Symbian has a lot of the market share at the moment, but it is going down hill very quickly. They need to make a move, but where to go? If they go Android they will be just another manufacturer in the mix. They would be better off than they are now, but could they dominate this market? If they go Windows they could mark themselves out as the dominant force in this market. The other offerings in this space look rather weak. As Windows Phone develops, with Nokia’s help, maybe this could be a very attractive market.

Of course, only time will tell, but I know one thing. As the mobile OS market currently stands, I won’t be buying a Nokia phone running Windows Phone.



Direct NFS (DNFS) Clonedb…

A bit before Christmas I got an email from Kevin Closson asking me to take a look at a new undocumented Direct NFS (DNFS) feature in the patch set. I think he wanted to see what a regular DBA would think of it. What with Christmas and some family issues, I didn’t get too much done. As soon as I hit the first hurdle I kinda caved in and left if for the new year.

Well, January came and went, then I finally got round to looking at it again. I like to think my constant questions and dumb mistakes has helped to prepare Oracle for the sort of thing that will happen when other idiots like me are let loose on it. Anyway, the result of that little journey is documented here.

As I’ve said in the article, things are still in a state of flux and I will no doubt have to do some alterations once the My Oracle Support (MOS) Note 1210656.1 is released that will properly document it.

So as a regular DBA what do I think? I think it is awesome!

Old-style cloning of databases isn’t hard, but it’s boring and can take ages depending on the size of the database and storage being used. It’s one of those tasks that always makes me sigh, before I get off my ass and start it. Clonedb turns that on its head because it is really quick and simple. There is a bit of setup, but that is really going to be a one-time thing on most servers. You are doing your backups anyway, so there is no big deal there. Now you can just run a script and bang, you have a running clone.

I think this is going to please a lot of DBAs out there!




Warning – make sure you read to the end of this post.

Someone sent me an email this morning asking how Oracle calculates the index cardinality of an index range scan. Now, as I’ve often said, I don’t do private email about Oracle – unless it just happens to catch my attention and looks like something that is sufficiently generic to be worth publishing.

Today’s emailer was a lucky winner – he’d sent me a very short email that took about 30 seconds to read, contained a significant error, and (at first sight) probably had the right information in it for me to work with. Here’s the problem, as a cut-n-paste from an SQL*Plus session:

SQL> explain plan for select * from admlxa.QRT_BENCH where QRT_BENCH_DATE < :a3;

Elapsed: 00:00:00.01 

SQL> select * from table(dbms_xplan.display); 

Plan hash value: 2896103184 

| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT            |                 |   424K|    81M|  3170   (4)| 00:00:05 |
|   1 |  TABLE ACCESS BY INDEX ROWID| QRT_BENCH       |   424K|    81M|  3170   (4)| 00:00:05 |
|*  2 |   INDEX RANGE SCAN          | IDX_QRT_BENCH_1 | 78876 |       |   303   (5)| 00:00:01 |

Predicate Information (identified by operation id):
   2 - access("QRT_BENCH_DATE"<:A3) 

Look closely at the “Rows” column – there’s clearly a logic error appearing here. If you select 78,876 rowids from an index you can’t possibly acquire 424,000 rows from the table – so where have those two numbers come from ?

The supporting information looked like this:

num_rows for QRT_BENCH table = 8480798
num_distinct for QRT_BENCH_DATE column = 458 

num_rows for IDX_QRT_BENCH_1 = 8763975
distinct_keys for IDX_QRT_BENCH_1 = 537

Of course, I really needed to know whether this was a single-column or multi-column index if I wanted to model the problem correctly and do further checks on when the error appeared, but this was good enough to get started. We note, first, that the 424K for the table cardinality is just the standard “5% due to range predicate with unknown value”: round(0.05 * 8480798) = 424040.

    Step 1: since I don’t know where the number 78876 comes from, let’s try to work backwards – use it in a bit if arithmetic and see what drops out. Let’s try dividing it into the table cardinality: 424040 / 78876 = 5.3760

    Step 2: Does 5.376 look familiar — it may be a coincidence, but isn’t that really close to 1% of the number of distinct keys in the index ?

At this point I don’t have any time to investigate in detail, but a scratch hypothesis is that Oracle is calculating something like: 5 * (number of rows in table / (number of distinct keys in index)); and maybe that magic five appears through a piece of code that takes 5%, but for some reason then divides by the 1% associated with the selectivity normally associated with function(col).

If I had the time (and the data set) I’d start playing with dbms_stats.set_index_stats() to see the effect of changing num_distinct and num_rows to see if my initial guess was somewhere in the right ballpark. As it is I’ve just emailed this note back to the source.


Before I published this note I got a reply from the original correspondent, with the following comment:

Sure enough – five years ago I had published some details about exactly this “feature” – and that 537 vs. 5.376 really was just a coincidence.

I decided to publish the note anyway for three reasons –

    one: to make sure you realise that I do make mistakes
    two: to show you that simple games with numbers may give you a working hypothesis
    three: to remind you that once you’ve got a working hypothesis it’s often easy to think of ways to demonstrate that your hypothesis is wrong. (A couple of hacks of the statistics would have shown me a constant 0.009 appearing, rather than anything like 5% divided by 1%.)