Top 60 Oracle Blogs

Recent comments

September 2011

Right Practice (via James Morle’s Blog)

Great post from James Morle. Consider it required reading (and thinking).

Wow, it’s been a while since I wrote a post, sorry about that! I thought that I would take a brief break from the technical postings and espouse some opinion on something that has been bothering me for a while – ‘Best Practices.’ Best Practices have been around a long time, and started with very good intentions. In fact, one could easily claim that they are still produced with good intentions: To communicate methods that the hardware and software … Read More

via James Morle’s Blog

Friday Philosophy – Why doesn’t Agile work?

Why doesn’t Agile Development Methodology seem to work?

I’m going say right here at the start that I like much of what is in Agile, for many, many years I’ve used aspects of Rapid Application Development {which Agile seems to have borrowed extensively from} to great success. However, after my post last week on database design, many of the comments were quite negative about Agile – and I had not even mentioned it in my post!

To nail my flag to the post though, I have not seen an Agile-managed project yet that gave me confidence that Agile itself was really helping to produce a better product, a product more quickly and most certainly not a final system that was going to be easy to maintain. Bring up the topic of Agile with other experienced IT people and I would estimate 90% of the feedback is negative.

That last point about ongoing maintenance of the system is the killer one for me. On the last few projects I have been on where the culture was Agile-fixated I just constantly had this little voice in my head going:

“How is anyone going to know why you did that in six months? You’ve just bolted that onto the side of the design like a kludge and it really is a kludge. When you just said in the standup meeting that we will address that issue ‘later’, is that the same “later” that accounts for the other half-dozen issues that seem to have been forgotten?”.

From what I can determine after the fact, that voice turns out to be reason screaming out against insanity. A major reason Agile fails is that it is implemented in a way that has no consideration for post-implementation.

Agile, as it is often implemented, is all about a headlong rush to get the job done super-quick. Ignore all distractions, work harder, be completely focused and be smarter. It really does seem to be the attitude by those who impose Agile that by being Agile your staff will magically come up with more innovative solutions and will adapt to any change in requirements just because they work under an agile methodology. Go Agile, increase their IQ by 10 points and their work capacity by 25%. Well, it doesn’t work like that. Some people can in fact think on their feet and pull solutions out of thin air, but they can do that irrespective of the methodology. People who are more completer-finishers, who need a while to change direction but boy do they produce good stuff, have you just demoralized and hamstrung them?Agile does not suit the way all people work and to succeed those people it does not suit need to be considered.

The other thing that seems to be a constant theme under Agile is utterly knackered {sorry, UK slang, knackered means tired, worn out and a bit broken} staff. Every scrum is a mad panic to shove it all out of the door and people stop doing other things to cope. Like helping outside the group or keeping an eye on that dodgy process they just adopted as it needed doing. Agile fails when it is used to beat up team. Also, I believe Agile fails when those ‘distractions’ are ignored by everyone and work that does not fall neatly into a scrum is not done.

I suppose it does not help that my role has usually been one that is more Production Support than development and Agile is incompatible with production support. Take the idea of the scrum, where you have x days to analyse, plan, design, unit test and integrate the 6 things you will do in this round. On average I only spend 50% of my time dealing with urgent production issues, so I get allocated several tasks. Guess what, if I end up spending 75% of my time that week on urgent production issues, and urgent production issues have to take priority, I can screw up the scrum all on my own. No, I can’t pass my tasks onto others in the team as (a) they are all fully assigned to their tasks and (b) passing over a task takes extra time. Agile fails when it is used for the wrong teams and work type.

I’ve come to the conclusion that on most projects Agile has some beneficial impact in getting tasks done, as it forces people to justify what they have done each and every day, encourages communication and gives the developers a reason to ignore anything else that could be distracting them as it is not in the scrum. Probably any methodology would help with all of that.

My final issue with Agile is the idiot fanatics. At one customer site I spent a while at, they had an Agile Coach come around to help the team to become more agile. I thought this was a little odd as this team was actually doing a reasonable job with Agile, they had increased productivity and had managed to avoid the worst of the potential negative impacts. This man came along and patronisingly told us we were doing OK, but it was hard for us to raise our game like this, we just needed help to see the core values of Agile and, once we did, once we really believed in it, productivity would go up 500% {That is a direct quote, he actually said “productivity will go up by 500%”}. He was sparkly-eyed and animated and full of the granite confidence of the seriously self-deluded. I think he managed to put back the benefits of Agile by 50%, such was the level of “inspiration” he gave us. Agile fails when it is implemented like a religion. It’s just a methodolgy guys.

I find it all quite depressing as I strongly suspect that, if you had a good team in a positive environment, doing a focused job, Agile could reap great rewards. I’m assured by some of my friends that this is the case. {update – it took my good friend Mike less than an hour to chime in with a comment. I think I hit a nerve}.

Right Practice

Wow, it’s been a while since I wrote a post, sorry about that! I thought that I would take a brief break from the technical postings and espouse some opinion on something that has been bothering me for a while – ‘Best Practices.’

Best Practices have been around a long time, and started with very good intentions. In fact, one could easily claim that they are still produced with good intentions: To communicate methods that the hardware and software vendors recommend. However, the application of Best Practices has become increasingly abused in the field to the point where they have become more like prescriptions of how systems should be built. This has gone too far, and needs to be challenged.

10046 Extended SQL Tracing for a Session – What is Wrong with this Quote?

September 16, 2011 While reading the Oracle 11g Performance Tuning Recipes book, I noticed that some of the recipes in chapter 10 seemed to have an odd familiarity, although as of yet I have not quite pinpointed where I had previously seen some of the concepts in chapter 10.  Recipes 10-2, 10-11, and 10-16 seem to share a common ancestry with [...]

Collecting and analysing Exadata cell metrics

Recently I have been asked to write a paper about Exadata Flash Cache and its impact on performance. This was a challenge to my liking! I won’t reproduce the paper I wrote, but I’d like to demonstrate the methods I used to get more information about what’s happening in the software.


The Exadata Flash Cache is provided by four F20 PCIe cards in each cell. Currently the PCI Express bus is the most potent way to realise the potential of the flash disk in terms of latency and bandwidth. SSDs attached to a standard storage array will be slowed by fibre channel as the transport medium.

Each of the F20 cards holds 96G of raw space, totalling in 384GB of capacity per storage cell. The usable capacity is slightly less. The F20 card is subdivided into 4 so called FMODs, or solid state flash modules visible to the operating system using the standard SCSI SD driver.

Cellcli can also be used to view the FMODs using the “LIST PHYSICALDISK” command. The output is slightly different from the spinning disks as they are reported SCSI drivee’s [host:bus:target:lun] notation.

Now please don’t be tempted to take the FMODs and transform them into celldisks!

How it works

The flash cache is a write-through cache, and therefore won’t speed up write operations to disk. If an internal, undocumented algorithm decides that the data just written to disk is suitable for caching, it will put the data into the ESFC. It is quite clever in this respect and performs optimisations “ordinary” cache cannot perform.

Smart scans are going to ignore the flash cache by default and read directly from disk, unless the segment’s storage clause specifically instructs Oracle to read from flash cache as well. Meddling with the storage clause is also referred to as object pinning, a practice which is not recommended. Pinning objects to the flash cache immediately reminded me of the bad practice in the old 8i days of pinning objects in the shared pool. It didn’t necessarily made things go faster or help (except in some special cases).

The main statistics counter to query in relation to the smart cache is called “cell flash cache read hits”. The remainder of this paper has to do with the cell metrics which cannot be queried directly from the database. They are more granular though making them well suited for analysis.

The test case

The swingbench order entry benchmark has provided the data for the following text. The table ORDER_ITEMS contains most rows in the schema, and it is partitioned to reflect a realistic environment. Indexes have been made invisible as they may have impacted the performance statistics. Additionally cellcli’s WHERE clause isn’t very flexible and querying a number of objectIDs can be time consuming.

SOE.ORDER_ITEMS is divided into partitions with object IDs 74506 to 74570. These object IDs are required for mapping the output of cellcli to the data dictionary later.

A little bit of setup has been performed to capture the cell metrics. For pre and post comparison it was necessary to capture statistics in a statspack-like way and to analyse them. Rather than using Excel I decided to store the gathered information in the database in a dedicated schema.

I gathered two different sets of information from the cells: firstly the metrics for the objectType “FLASHCACHE”, and secondly the contents of the flash cache using the fabulous “dcli” command. The distributed command line interface can be used to execute arbitrary commands on a number of hosts simultaneously and reports the results back to the caller. I am redirecting the output from STDOUT to a CSV file in my home directory and use external tables to copy this information into the so-called “RESULT” table.

The table DDL is as follows:

create table cacheContents_csv (
cellname varchar2(50),
cachedKeepSize     number,
cachedSize         number,
hitCount           number,
missCount          number,
objectNumber             number
organization external (
type oracle_loader
default directory testdir
access parameters (
records delimited  by newline
fields  terminated by ';'
missing field values are null
location ('flashcontents.csv')
reject limit unlimited;

FROM cacheContents_csv A;

TYPE oracle_loader
"NAME" (14:36) CHAR,

"NAME"        VARCHAR2(50 BYTE),

I additionally added a table called “metricdefinition” which allows me to match the short metric name from metriccurrent with a more human readable description.

"NAME"        VARCHAR2(100 BYTE),
"UNIT"        VARCHAR2(20 BYTE)

The definitions, too, were loaded into the database using sqlldr.

First the metrics were captured:

# dcli -l root -g /tmp/martin_group 'cellcli -e "list METRICCURRENT where objecttype = \"FLASHCACHE\""' | sed -e 's/://' > /home/oracle/martin/metrics.csv

I admit it’s probably not the most elegant way to do so but it worked for me and I didn’t have weeks to prepare (as always…)

Now what does this command do? It will execute a call to cellcli on all nodes defined in /tmp/martin_group to list the current metrics for the Flash Cache. The result is then piped into sed which strips out any colons after the cell names and then saves the result in my TESTDIR directory on the first RDBMS node.

Similarly I am using the following command to get the contents of the flash cache to see if anything has changed after executing the query.

 # dcli -l root -g /tmp/martin_group 'cellcli -e "list flashcachecontent where dbUniqueName like 'TEST' and objectNumber > 74505 and objectNumber < 74571 attributes cachedKeepSize,cachedSize,hitCount,missCount,objectNumber"' | sed -e 's/://' | awk '{print $1";"$2";"$3";"$4";"$5";"$6}' > /home/oracle/martin/flashcontents.csv

This looks a little more complex, but it isn’t. The apparent complexity is required to limit the result set to the contents of the flash cache for database TEST and the object_ids of the table partitions. There is a little more transformation involved with AWK to convert the output in a “;” separated line.

With this done, it was possible to transfer the information in the staging tables into the RESULT tables. This is a simple “insert into table select testrun#, * from …CSV”.

The analysis I did was very simple once the data has been loaded into the RESULT tables. For the flashcache contents I used the following queries:

WITH prestats AS
( SELECT * FROM metriccurrent_result WHERE testrun# = &1
poststats AS
( SELECT * FROM metriccurrent_result WHERE testrun# = &2
SELECT pre.cellname,,
pre.metricvalue pre_metricvalue,
post.metricvalue post_metricvalue,
to_number(REGEXP_SUBSTR(post.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') -
to_number(REGEXP_SUBSTR( pre.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') delta
FROM prestats pre,
poststats post,
metricdefinition defn
WHERE pre.cellname = post.cellname
and =
and pre.objecttype = post.objecttype
and =
and = 'FC_IO_BYKEEP_R'
and (to_number(REGEXP_SUBSTR(post.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') -
to_number(REGEXP_SUBSTR( pre.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') ) <> 0
order by 1,2;

WITH prestats AS
( SELECT * FROM cachecontents_result WHERE testrun# = &1
poststats AS
( SELECT * FROM cachecontents_result WHERE testrun# = &2
SELECT pre.cellname,
pre.cachedkeepsize pre_cachedkeepsize,
pre.cachedsize pre_cachedsize,
pre.hitcount pre_hitcount,
pre.misscount pre_misscount,
post.cachedkeepsize post_cachedkeepsize,
post.cachedsize post_cachedsize,
post.hitcount post_hitcount,
post.misscount post_misscount,
(post.cachedkeepsize-pre.cachedkeepsize) delta_cachedkeepsize,
(post.cachedsize-pre.cachedsize) delta_cachedsize,
(post.hitcount-pre.hitcount) delta_hitcount,
(post.misscount-pre.misscount) delta_misscount
FROM prestats pre,
poststats post
WHERE pre.cellname   = post.cellname
AND pre.objectnumber = post.objectnumber

This revealed quite a lot about the underlying workings of the flash cache. The above queries are best executed with SQL Developer and the result is then copied and pasted into a spreadsheet application to be saved from accidental overwriting.


The above certainly isn’t bullet proof or very elegant, but a good starting point for your own experiments. Time permitting I would like to automate the generation of the CSV files via a scheduled job, and also the population of the result tables. There really should be some sort of unique key on the results tables to avoid duplicates. Maybe one day I’d like to write a statspack like version of the lot…

Windows 8.. I just don’t get it…

First the caveats:

  • Remember I said Apple iPad… I just don’t get it… Then promptly went out an bought one. I now use it most days for surfing and checking my emails from bed. :)
  • Windows 8 is pre-beta, so hopefully a lot will change between now and then.

I totally understand the concept of the new front screen and the whole Metro thing. Trying to keep a consistent experience between a Windows phone and a Windows touchpad is sensible. Just like the iPhone and iPad. What I don’t like is the fact the tiles are massive and take up loads of space. It just seems a bit silly to me. Why make me sideways scroll when all the initial options could be seen on my 24-inch monitor anyway? From a desktop computing perspective, it is so much worse than the Apple Launchpad (which I also despise) or the GNOME3 Activities screen.

Since I’m running it on a desktop machine, my biggest concern is getting a regular desktop to work with. I can do this by clicking the “Desktop” tile. The resulting desktop is basically Windows 7, which is fine, *except* there is no regular start menu. Clicking the Start button takes you back to the crappy tiled front screen, or hovering in the bottom-left corner presents you with the new menu. What is on this new menu? Bugger all of any use! The search screen is like a really bad GNOME3 “Activities” screen. It requires so many clicks and mouse moves to get where you want to go. It’s wretched. If I were a regular user I think I would probably pin a whole bunch of apps to the taskbar and maybe define a few folders on desktop containing useful shortcuts. Surely the ability to run the old Windows 7 menu would be a welcome addition for the vast majority of users!

Every dialog now has a ribbon instead of a toolbar or menu. This may prove useful for the newbies as it displays functionality that may have been hidden in sub-menus, but for me it is a disaster. The top inch of very window is filled with a bunch of crap that I don’t care about most of the time.

Typically the early releases have lots of tracing code enabled, so I don’t expect the production release will be as slow as this developer release.

So what is the future of the desktop computer? The rumors are that the next iteration of Macs will be essentially running iOS. It looks like the next generation of PCs will be running Windows 8. Although both these OSes seem fine for phones and touchpads, neither of them seem appropriate for a desktop computer. Now I realize that I am by no means a typical PC users, so maybe the vast majority of the PC users of the world will be happy with these changes, but I for one think it is a massive step backwards. It is starting to look like the future of desktop computing is Linux. :) Luckily, I’m already there.

Let’s hope a little sanity returns between now and the production release of Windows 8. If nothing else, just give us a proper menu, or fix that God awful search screen.

Update: Check out these hacks to restore the Windows 7 style menu.



PS. Let’s see if I end up contradicting everything I just said in a few months time. :)

Oracle Database 11g Direct NFS + Real Application Clusters + VMware vSphere + Fully Automated Storage Tiering? Yes! Of Course!

This is just a quick blog entry to draw attention to a freshly-released EMC white paper.  I don’t aim to turn my blog into an announcement board for such things, but this one is worth it. When I posted my announcement that I’d left Oracle Server Technologies (Exadata development) to join EMC I should have also made the point of how much interest I have in virtualization.

Virtualization (Done Right) and Oracle Database 11g Direct NFS
My convictions regarding the importance of virtualization in modern data center architecture are quite strong. That’s a significant portion of the motivation behind why I left Oracle to join EMC and one of the reasons I really like this paper.  But that’s not all. The paper also centers on Oracle Direct NFS technology. Regular readers know my strong backing of dNFS as well as my long history with the technology.  New readers can visit the past on that matter by reading my many posts on the topic.

The following is a short list of items covered in the paper:

  1. FAST (Fully Automated Storage Tiering). I cannot say enough about the importance of taking care to cache both reads and writes! Imagine that! Not all caching schemes possess that critical attribute! If you try really hard you can probably think of a few dynamic caching solutions that are really good at caching clean data and offer no benefit for writes.
  2. Oracle Real Application Clusters OLTP performance with the Oracle Database 11g Direct NFS feature with and without FAST (Fully Automated Storage Tiering) on physical servers.
  3. Oracle Real Application Clusters OLTP performance with the Oracle Database 11g Direct NFS feature with and without FAST (Fully Automated Storage Tiering) on a VMware vShere virtual servers.
  4. Loads of configuration how-to’s
  5. Coverage of live migration from physical to virtualized Oracle Real Application Clusters. This is my personal favorite in this paper!

Here is a link to the paper: EMC Performance for Oracle – EMC VNX, Enterprise Flash Drives, FAST Cache, VMware vShere

I wanted folks to get this paper as soon as possible so they’d have something good to read during their flight to Oracle OpenWorld 2011. If you read it you might come up with some difficult questions to pose to the EMC folks in Booth 901 :-)   Go ahead, give it a go. Tell them I said, “Hi” because this will be the first OOW I’ve missed since 1997.

Filed under: oracle

VirtualBox Cloning: A minor warning…

The cloning feature in VirtualBox was a welcome addition, but there are a couple of fringe issues to be aware of:

  • If you use the cloning feature to clone a VM that has shared disks (like in a VM RAC setup), the shared disks are also cloned, so you end up with a new VM that is not accessing the original shared disks, but has a new set. I’ve put a note about this in my VirtualBox RAC articles and suggested you still use the old method of cloning the virtual hard disk manually. I guess for most people this is not a big deal.
  • The virtual disks of a clone get placed in the default location. Once again, not a big deal unless you try to spread your virtual disks onto different spindles to get better performance.

Like I said, these are very edge-case issues and not a reason for most people to avoid the cloning feature.



Brain Teaser: Why is this Query Performing a Full Table Scan

September 14, 2011 While taking a look at the OTN threads this morning, I found an interesting test case, where the OP was asking why a query used a full table scan and not an index range scan.  The OP would like for the query to use the index without using a hint in the query.  [...]

Delphix ROI

People are frequently asking me what I’m doing at Delphix after all the years I spent at Oracle, Quest and Embarcadero working on performance tools.  Though I love performance tools , the impact of performance tools is leveling off where as the technology Delphix brings to the market  is making a massive difference to our customers.  If you are curious to learn more the check out Delphix’s live webcast on what we do and let me know what you think:

Enterprises create up to 10 copies of each production database for development, testing, support and other purposes. Simply provisioning or refreshing a single copy can take days to weeks. Across applications this introduces massive storage costs and productivity inefficiencies.
Delphix uniquely addresses those challenges through patent-pending data virtualization software. Organizations such as Procter & Gamble, Stryker, Comcast, Sanofi Aventis, Qualcomm, Macy’s and many others are leveraging Delphix to realize massive returns from:
We would like to invite you to attend our upcoming ROI webcast to learn how Delphix can do the same for your organization.

Join us on Sep 29 @2PM ET to learn more. Click Here to Register