Handling Human Errors

An interesting question about human mistakes was posted in the DBA Managers Forum discussions today.

As human beings, we sometimes make mistakes. How do you make sure that your employees won’t make mistakes and cause downtime/data loss/etc on your critical production systems?

I don’t think we can avoid this technically; proper working procedures are probably the solution.
I’d like to hear your thoughts.

I typed up my thoughts, and as I was finishing, I realized it makes sense to post them on the blog too, so here we go…

The keys to preventing mistakes are low stress levels, clear communication, and established processes. That’s not a complete list, but I think these are the top factors in reducing the number of mistakes we make managing data infrastructure — or, for that matter, working in any critical environment, be it IT administration, aviation engineering, or surgery. It’s also a matter of personality fit: depending on the balance you need between mistake tolerance and agility, you will favor hiring one individual over another.

Regardless of how hard you try, there will still be human errors, and you have to account for them in your infrastructure design and processes. The real disasters happen when many things align: several failures combined with a few human mistakes. The challenge is to find the right balance between the effort invested in avoiding mistakes and the effort invested in making your environment error-proof, to the point where the risk of a human mistake is acceptable to the business.

Those are the general ideas.

Just a few examples of practical solutions that help an Oracle DBA prevent mistakes:

  • test production actions on a test system before applying them in production
  • have a policy that every production change is reviewed by another senior member of the team
  • a watch-over-my-shoulder policy when working on production environments — i.e. a second pair of eyes at all times
  • employee training, such as database recovery boot camps
  • the discipline of performing routine work under non-privileged accounts
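The last point, running routine work under non-privileged accounts, can be enforced in tooling rather than left to discipline alone. A minimal sketch (the wrapper and its name are my own illustration, not something from the post):

```shell
# Hypothetical guard for routine maintenance scripts: refuse to run as root.
run_unprivileged() {
  uid="$1"; shift                      # numeric uid, normally "$(id -u)"
  if [ "$uid" -eq 0 ]; then
    echo "refused: routine work should not run as root" >&2
    return 1
  fi
  "$@"                                 # run the routine command unprivileged
}

run_unprivileged "$(id -u)" echo "routine task ran" || echo "blocked (running as root)"
```

Wrapping routine jobs this way turns the policy into a hard stop instead of a guideline.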

Some of the items that limit the impact of mistakes:

  • multiple database controlfiles for an Oracle database (in case a DBA manually does something bad to one of them — I’ve seen this happen)
  • a standby database with delayed recovery, or flashback database (for Oracle)
  • no-SPOF architecture
  • Oracle RAC, MySQL high-availability setups (like sharding or replication), SQL Server clusters — architecture examples that limit the impact of a human mistake affecting a single hardware component
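On the first of those points, multiplexing controlfiles is a small init-parameter change plus an OS-level copy. The following prints a sketch of the SQL involved; the paths and XE layout are examples of mine, not from the post, and the statements would be run in a sqlplus session as SYSDBA:

```shell
# Sketch only: print the SQL for multiplexing controlfiles.
# Paths below are examples; run the statements via sqlplus as SYSDBA.
SQL_TEXT=$(cat <<'SQL'
SELECT name FROM v$controlfile;        -- list the copies you have now
ALTER SYSTEM SET control_files=
  '/u01/oradata/XE/control01.ctl',
  '/u02/oradata/XE/control02.ctl'      -- second copy on a different volume
  SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
-- copy control01.ctl to /u02/oradata/XE/ at the OS level, then:
STARTUP
SQL
)
echo "$SQL_TEXT"
```

Keeping the copies on separate volumes is the part that actually limits the blast radius of a single bad edit.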

Both lists can go on for a very long time. An old article authored by Paul Vallee is very relevant to this topic — The Seven Deadly Habits of a DBA…and how to cure them.

Feel free to post your thoughts and examples. How do you approach human mistakes in managing production data infrastructure?

Consolidation Is All About Costs

Back in February, Jonathan Gennick asked me if I would be interested in writing a bit of content for an Apress brochure to distribute at RMOUG Training Days. I thought it was a cool idea and chose the topic of database consolidation. I only needed 10 short tips, but once I started writing, it was difficult to stop — clearly, expressing ideas concisely is not my strength.

Jonathan did some heavy editing and turned my draft into 10 brief tips and, of course, quite a few details had to go as we shrank the text to a quarter of its size. Since I’d already put the effort into writing it, I figured I could share it on my blog as well. Thus, welcome to the first post in a series of database consolidation tips. Let’s get down to business…

While a consolidation project often has multiple goals, the main purpose of consolidation is to optimize costs, which usually means minimizing the Total Cost of Ownership (TCO) of the data infrastructure. Your current hardware might be past its end of life, you might lack capacity for growth, your change management might be too slow, and so on.

These non-core goals (for lack of a better term for the side effects of consolidation projects) can act as timing triggers, but the success of a consolidation project is defined by cost metrics. In real life there are very few pure consolidation projects, as the success criteria usually contain conditions other than cost cutting.

Tip: Keep costs at the heart of the consolidation project, but don’t be blinded by cost alone! It’s a total failure if a consolidation project delivers a platform with a much lower TCO that is unable to support the required availability and performance SLAs.

It’s also not very popular to run a purely cost-cutting project in a company — people are not overly motivated, especially if it endangers their jobs. Luckily, most healthy businesses have quickly growing IT requirements, and consolidation projects very quickly burst out of the scope of pure cost savings.

Tip: Get your success criteria right and keep cost optimization as the core goal. If required, reduce the scope and split the project into stages, where each stage has its own core goal. This way, you can classify some stages as pure consolidation. It’s so much easier to achieve success when there are only a few criteria. You can also check off success boxes as you go, instead of chasing a light at the end of a tunnel that could take years to reach.

If you have anything to share on the scope of consolidation projects — whether past experience or current challenges — please comment away.

How to Get Started with Amazon EC2 (Oracle 11g XE example)

I’ve just published an Oracle Database 11g Express Edition Amazon EC2 image (AMI), but most of you have never used Amazon EC2… not until now! This guide walks you through the process of getting your very first EC2 instance up and running. Buckle up — it’s going to be awesome!

  1. Go to Amazon Web Services and open an account. You can use the same one that you buy your books with.
  2. Go to the AWS Management Console for EC2 and sign up for Amazon EC2. You will need your credit card for this. You will not be charged anything unless you start using EC2 instances or allocate EBS storage and other related items. The sign-up page shows you all the pricing. You will especially like the “Free tier for new AWS customers” section, which gives you 750 hours of Micro instance uptime, 10 GB of EBS storage, some bandwidth, and a few small goodies. This means you will not be charged anything at the beginning of your experiments. They will also do phone verification — I can’t recall seeing it last time, so it must be reasonably new. It works for cell phones too. Activation usually takes just a few minutes; you’ll get an email confirmation and access to EC2, VPC, S3 and SNS. Direct link to the AWS Management Console for EC2.
  3. Now you can launch your first instance. Let’s start the Oracle 11g XE beta image that I published just recently. Click “Launch Instance”, then select the “Community AMIs” tab. It will start loading the AMI list, which takes ages, so don’t wait for it to finish — search for “pythian” and you will find the pythian-oel-5.6-64bit-Oracle11gXE-beta image with AMI ID ami-e231cc8b, the latest at the time of this writing.

    Select that image.
  4. On the next tab, choose the instance size. A Micro instance is enough to start playing with Oracle 11g XE, but be prepared: a Micro instance doesn’t guarantee any CPU capacity, so it might be “bursty”. But hey — it’s free, or costs peanuts if you run out of free time. You can also choose an availability zone closer to you.
  5. On the next screen, leave everything at the defaults. You can select what happens when you shut down the instance from inside the instance. “Stop” keeps your instance and EBS storage allocated, so you can start it again and all your changes will persist. However, you will be charged for the allocated EBS storage (if you go beyond the free 10 GB), but it’s very little. “Terminate” actually releases the EBS storage when you shut down your instance. Note that you can always stop or terminate instances from the AWS Management Console. I usually leave the option on “Stop” to avoid accidental data loss. You can skip defining tags — this is optional metadata to help you navigate your instances. I recommend you at least specify a descriptive name to make sure you can clearly distinguish multiple running instances later.
  6. If you didn’t create a Key Pair in the past, you will do that at the next step. This is basically a public/private key pair, and you get to download the private part — save it, keep it safe, and don’t share this .pem file with anybody. Someone with access to it can gain root access to your instances! You can always create more than one Key Pair, by the way.
  7. Next, you will need to either select an existing Security Group or create a new one. The default security group doesn’t fit here because you want to open additional ports to access your 11g XE database. You can keep the default group and access via SSH only if local access from the SQL*Plus command prompt is all you need. That’s also the safest way, but for your playground you might want more flexibility. For an 11g XE instance you will probably want SSH access (port 22), SQL*Net access (port 1521) and APEX access (port 8080). I also like to open ICMP for ping. Be sure you understand what you are doing if you will be placing any sensitive data there. I open access to the world (source 0.0.0.0/0), so anybody who knows the passwords, or has the correct shared keys set up, can get onto your instance. You can limit access to your current IP only (and you can change the policy online if your IP changes later — use the AWS Management Console). There are a bunch of sites that will tell you your public IP (provided you don’t use a proxy coming from another IP), like this one. To limit access to that IP only, enter it in the source as xxx.xxx.xxx.xxx/32. Of course, you can enter subnets too, if you know what I’m talking about.
  8. That’s it — all that’s left is to click the “Launch” button.
  9. You will then see your instance as “pending” in the console, and usually just seconds later it switches into the “running” state. Note that it will take a minute or so to boot and launch the sshd daemon before you can connect via SSH. You can also check the console log by choosing “Get System Log” from the context menu (it usually takes a few minutes, so it will come back empty until then). The easiest way to connect is to choose “Connect” from the context menu — it will present instructions to connect as root using the .pem key file you downloaded when creating your Key Pair earlier. Note that if you are on Unix, you will need to set proper permissions on your key to keep it safe — chmod 600 AlexG.pem.

    You can also get the public IP alias from the instance details as “Public DNS” — just select an instance and scroll through the details in the bottom pane. For this particular image, I also enabled public key authentication, so you can simply add your public key to oracle’s ~/.ssh/authorized_keys file — it’s already there with correct permissions. This way you don’t have to go via root every time.
    If you are a Windows user using Putty, you can convert your .pem file into Putty Private Key (.ppk) file following Marcin’s comment.
  10. The database and listener will auto-start. You can open the 11g XE web interface. In my example it’s http://ec2-50-17-156-24.compute-1.amazonaws.com:8080/apex/apex_admin for administration and http://ec2-50-17-156-24.compute-1.amazonaws.com:8080/apex for the APEX web user interface. Note that it’s not an SSL connection, so you don’t want to use it for any sensitive data unless you reconfigure it to https. This is also the time to change the passwords from the default ones.
  11. You can access your database over SQL*Net via sqlplus, SQL Developer or any other tool.
  12. You will see the instance and the attached EBS volume in your AWS Management Console. If you stop the instance, you will see that the EBS volume is still attached, so your data is still there when you start it again. If you terminate the instance, all your changes and data will be gone, since the EBS volume will be detached and deleted. You can, however, launch another instance from the same AMI as many times as you want. Just make sure you change the passwords after the launch!

That’s all — you can now start playing with Oracle 11g XE without paying a penny (or paying very little), without consuming any resources on your own laptop/desktop, and you can have as many instances running as you want. And you can always start from scratch if you screw something up.
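The local-side pieces of the walkthrough — securing the downloaded key (step 9) and limiting the Security Group to your own IP (step 7) — can be sketched in shell. The key name and IP below are placeholders:

```shell
# Secure the downloaded private key; ssh refuses keys readable by others.
KEY=AlexG.pem
touch "$KEY"                 # stands in here for the key downloaded from AWS
chmod 600 "$KEY"
stat -c '%a' "$KEY"          # prints 600 on Linux

# Build the /32 CIDR for a Security Group rule limited to your own IP.
MY_IP="203.0.113.42"         # example: your public IP from a "what is my IP" site
CIDR="${MY_IP}/32"
echo "$CIDR"                 # prints 203.0.113.42/32

# Then connect (hostname is the example from the post):
# ssh -i "$KEY" root@ec2-50-17-156-24.compute-1.amazonaws.com
```

The /32 suffix means “exactly this one address”, which is the tightest source rule you can set without knowing your subnet.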

Oracle Database 11g XE Beta — Amazon EC2 Image

That’s right, folks! Playing with the latest beta of the free Oracle Database 11g Express Edition couldn’t be any easier than this. If you are using Amazon EC2, you can have a fully working image with 64-bit Oracle Linux and an Oracle 11g XE database running in a matter of a few clicks, plus a minute for the instance to boot.

Image — ami-ae37c8c7
Name — pythian-oel-5.6-64bit-Oracle11gXE-beta-v4
Source — 040959880140/pythian-oel-5.6-64bit-Oracle11gXE-beta-v4

You can find it in the public images, and at this point it’s only in the US East region.

If you have never used Amazon EC2 before, see my detailed step-by-step guide on how to get started with EC2, using this 11g XE image as the example.

This image works great with the Amazon EC2 Micro instance, and I configured it specifically for the Micro instance. A Micro instance costs only 2 cents per hour to run, or even less than 1 cent if you use spot instance requests (and there is a free offer for new AWS users, as Niall mentioned in the comments).

So what’s there?

  • Oracle Enterprise Linux 5.6 64 bit (I started with 5.5 and updated to the latest)
  • Oracle Database 11g XE Beta (oracle-xe-11.2.0-0.5.x86_64)
  • Database created and configured to start on boot
  • APEX (bundled with 11g XE) configured on port 8080, with remote access enabled
  • 10 GB root volume on EBS with 5+ GB free for user data. You can store up to 11 GB of data in 11g XE, and there is a way to grow volumes if you need to, but for anything more critical than a playground I’d allocate separate EBS volumes anyway.


A few things worth mentioning:

  • I enabled public key authentication (“PubkeyAuthentication yes” in /etc/ssh/sshd_config) so you can set up a shared key to log in directly as the oracle OS user – just copy your public key to /home/oracle/.ssh/authorized_keys.
  • The SYS and SYSTEM password is “pythian”. Change it!
  • The ADMIN password in APEX is “pythian” — change it on first login.
  • A Micro instance has 613 MB of RAM and no swap — there is no instance (ephemeral) storage.
  • The Oracle database and listener autostart on boot. You can also use /etc/init.d/oracle-xe stop/start as root.
  • listener.ora has been modified to include (HOST=) so that the listener starts on any hostname/IP.
  • APEX remote access is enabled! (DBMS_XDB.SETLISTENERLOCALACCESS(FALSE))
  • Ports 1521 and 8080 are open to the world in the local iptables firewall. You still need to configure a proper Security Group to be able to access those ports.
  • Access APEX at http://{public-ec2-ip}:8080/apex and the admin interface at http://{public-ec2-ip}:8080/apex/apex_admin. There is currently an issue where APEX stops working after a few minutes of run time, returning a 404 code. It might be a bug in the beta or an installation issue (for example, I run it with no swap on a Micro instance).

I will be keeping the AMI up to date as things develop, so the AMI ID could change — check back here or just search the public AMIs for the latest image. I set up a short URL for this page — http://bit.ly/Oracle11gXE.

If you don’t know how to use Amazon EC2, I recommend reading the second chapter of Expert Oracle Practices: Oracle Database Administration from the Oak Table. That chapter was written by Jeremiah Wilton, who had been playing with Amazon EC2 for Oracle long before any of us even thought of it.

Once a few folks confirm that it works, I’ll submit the image via http://aws.amazon.com/amis/submit.


Update 4-Apr-2011: Created v3 image — fixed a typo in the database passwords, fixed retrieval of the public key for ssh login as root, and changed the startup sequence so that ssh keys and public key retrieval are initialized earlier.
Update 4-May-2011: Created v4 image — increased the SGA size to 212 MB. Set large_pool to 32 MB (automatic SGA management doesn’t do its job properly — this is why APEX was not working: not enough large pool memory allocated). Enabled direct I/O and async I/O for the filesystem — buffered I/O slowed things down a lot. Now APEX is actually pretty usable on a Micro instance. Remember that you can run it on a Large instance for comfort, but you are overpaying, since a Large instance has 2 CPUs and 7.5 GB of RAM while you can’t use more than 1 GB. Of course, you could disable direct I/O and use OS buffering to take advantage of more RAM, but you can’t leverage both cores with APEX (it limits capacity to a single core).
Update 23-Jul-2011: If you need to use networking services from APEX (like web services, sending emails, etc.), then you need to configure network ACLs for the APEX_040000 user.
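The v4 changes boil down to a few init parameters. Here is my reconstruction of roughly what they look like in SQL — the values come from the update notes above, but the exact statements are an assumption, not a dump from the image:

```shell
# Print a sketch of the SQL matching the v4 tuning notes.
# These would be run in sqlplus as SYSDBA on the image, followed by a restart.
TUNING_SQL=$(cat <<'SQL'
ALTER SYSTEM SET sga_target=212M SCOPE=SPFILE;
ALTER SYSTEM SET large_pool_size=32M SCOPE=SPFILE;          -- APEX needs large pool
ALTER SYSTEM SET filesystemio_options=SETALL SCOPE=SPFILE;  -- direct + async I/O
SQL
)
echo "$TUNING_SQL"
```

Pinning large_pool_size explicitly sidesteps the automatic SGA management behavior described above; FILESYSTEMIO_OPTIONS=SETALL is the standard parameter that enables both direct and asynchronous I/O.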

Congrats to Fahd Mirza on becoming an Oracle ACE

Last week brought great news to Pythian — one of our DBAs in Pakistan, Fahd Mirza, has become an Oracle ACE. Fahd joined Pythian in September 2010 as the very first Pythian employee in Pakistan and, thanks to his skills and ambition, ended up on the team supporting Exadata environments. Fahd is a long-standing, active community member, frequent blogger, and passionate Oracle technologist evangelizing Oracle technology in Pakistan. No wonder he was nominated as an Oracle ACE and accepted.

I should also mention that another Oracle ACE DBA joined us recently: Jared Still. Jared is a well-respected member of the Oracle community, a member of the OakTable Network, and a veteran of the Oracle-L mailing list. Jared is a top-notch Oracle DBA and a huge fan of the most popular programming language at Pythian — Perl — and even wrote the book “Perl for Oracle DBAs“.

With all this, we now have 5 Oracle ACEs & ACE Directors at Pythian, including Gwen Shapira, Christo Kutrovsky and myself. But that’s not all: Pythian is known as an incubator for Oracle ACEs (I think we were called the Oracle ACE Factory in one of the Oracle ACE newsletters), and it’s been a pleasure to have worked side by side with other Oracle ACEs and ACE Directors — Riyaj Shamsudeen, Doug Burns, Sheeri Cabral and Dmitri Volkov. Some of them became Oracle ACEs at Pythian, some before or after, and even though they are not working at Pythian now, they are still our good friends and help us out on many occasions with training or by collaborating on exciting projects.

It’s a great initiative by Oracle through the Oracle ACE program to recognize active community contributors and passionate Oracle professionals around the globe! Well done Oracle!

Oracle Database Consolidation — What’s Your Story? (book prize)

Dear blog readers,

I’m working on a small story about database consolidation and am interested in the successes and failures others have gone through. While we have our own experience at Pythian, I find it interesting to learn what others are going through. If you have enough details, it would be nice to see your feedback along these lines.

1. Why was the consolidation project started — what were the targets?
2. What were the expectations / success criteria, and how were they set?
3. What was the scope of the consolidation project?
4. What was the expected time frame, and are you done by now?
5. Was the project considered successful? Were the goals met (see item 2)?
6. What were the measurements before and after? Were there any?
7. What issues did you face, and how were they solved or worked around?
….
Interesting facts like platform, number of databases, versions, consolidation strategy, etc.
….

Sharing your experience here would be beneficial for the community at large. Besides, don’t you want to win a book?

Expert Oracle Practices

I contributed the first chapter to this book myself, but the rest of the authors are really awesome! ;-)

To win the book, you need to share your experience and provide details. Addressing the items I mentioned would be great, but if you don’t have the whole picture, just tell us your story. Of course, just a few sentences won’t qualify you as a storyteller, so I’ll use 1,000 characters as a guideline threshold to qualify for the draw, but I won’t follow it blindly — insight into your consolidation project is what counts!

Don’t forget to enter a proper email address (remember that it’s not shared) when posting a comment so that I can follow up in case you win the prize.

Oh… and there is a deadline! You have until tomorrow (at the time of writing) — 11:59pm EST 18-Jan-2011. Feel free to share even later, but the prize will be gone by then!

Thanks in advance for all your comments!

Alex

News from UKOUG 2010 Conference

Right now I’m sitting in the speaker lounge with Jeremy Schneider after hacking on some RAC ASM stuff as a follow-up to my last presentation. We were testing some failure scenarios, but that’s a topic for another blog post.

Dan Fink cheated with his tiny blog post, which was more like a tweet (and so did Christo), so I thought I’d write something proper.

Monday started early for me — 6am. A quick run through my demos again and an early breakfast. I registered before 8am, while it was still empty, and then joined Tom Kyte in the speaker lounge. We both had sessions starting at 9am, but Tom is a pro when it comes to presenting — while I was taking the last minutes to go through my slides and make minor adjustments, Tom was calmly replying to AskTom questions. Oh well, such is life.

My two-hour presentation was a little slow, and I wish the audience had been a little more engaged, but maybe it was just because all the locals hit hibernate mode following the “extreme” cold weather and didn’t quite wake up after the weekend (of course, there is no chance that it was bad presentation material or a bad speaker… no, no!). This was the same presentation I’d done at OpenWorld, but I included demonstrations of 11gR2 Grid Infrastructure, and that was the tricky bit. In the end, everything pretty much worked, with one small surprise. My last demo was troubleshooting of startup, and I decided that I would break 3 things and troubleshoot online for the *first* time, i.e. I deliberately decided not to practice it. That wasn’t very smart, as I had less than 10 minutes left. After a few minutes of shame I had to move this demo to the homework list. :) The good news is that I did go through my last slides briefly, and I wanted to be brief there, as Frits Hoogland was covering this area in more detail later that day in his own session.

Exhausted after my session (and slightly disappointed by the inactivity of the audience), I was very hungry but decided to wait until the official lunch time, so I ended up in Graham Wood’s session on some hidden free gems, along with Christo and Jeremy. We had a plan to switch to the Exadata round table and did just that. I got a little disappointed with the round-table format (as I’m writing this, I discussed it with the round-table moderator, Joel Goodman, here in the speaker lounge, and we agreed on this) — it was more like a presentation without slides, an introduction to Exadata for folks without any background knowledge. It should have been a presentation, while the round table should have been left for folks with experience or knowledge of Exadata technology.

Having run late at the Exadata round table, I was late to lunch. This means I was late to the next session and managed to sneak in for the last 10 minutes of Frits Hoogland‘s Oracle Clusterware 11gR2 In-depth. It looked like the audience wasn’t very active during his session either. When I spoke to Cary Millsap later, he also mentioned that it was somewhat of a struggle to get the audience to laugh, so he had to leverage special jokes from his reserve list. I might borrow some of them in the future. :)

Next I went to Tanel Poder‘s presentation on Exadata migrations and related performance tuning. Very insightful, as you can expect from Tanel. He also confirms that Exadata performance rocks, but it can be tricky to run it stably. I think our experience with stability was somewhat better, except in the early months when lots of issues were not yet fixed.

Afterward, I went to Cary Millsap‘s presentation on reading 10046 trace files. My intention was not to learn the subject, which I was already familiar with, but to learn how Cary presents this topic in his new style (with very few words on the slides). It turned out that he did put trace content on the slides, but it was interesting to see how he emphasizes what he really wants to talk about within the 20 lines of code on the page. I will borrow this for my future presentations.

The final session of the day was Julian Dyke’s replication internals. I had been wanting to dig deeper into replication for a while, so it was a good move to go there. However, after such an active day (and night), I was struggling to stay awake even though my brain was desperately trying to keep up. The good news is that Julian has very well-illustrated slides, so I can always get back to them.

That evening, we had the OakTable dinner. It was, of course, at an Indian restaurant. I admit I abused that place: I was barely able to walk after dinner and struggled to consume anything more that evening. Still, nothing stopped us from hanging out in Tap & Spile until almost 2am and even catching the last order of scotch at Jury’s. That was another abuse of the night, especially since I had a presentation to deliver the next day. Fortunately, the next-day consequences were very mild, and that evening brought me the idea for a demo for my presentation (thanks Christo); I spent most of Tuesday getting this demo ready. It worked very well, but I will need to improve a few items to run it faster.

It’s already Wednesday as I’m finishing this post. The night was lots of fun, and it was long and… very late. I recall that the most bizarre idea of the evening was robbing a bank (don’t ask how we got there… it was not my fault). I didn’t really pay attention to when I got back to my room and crashed, but I see my last email from the phone was sent at 4:25am (Hi Doug!).

I missed the presentation on marrying Grid Control and Nagios — a very interesting topic for me, as some of our customers happen to use both. I struggle to understand why one would want to integrate these tools, but that’s exactly why I really wanted to see it. Oh well, I’ll have to review the slides offline, and I met the author the evening before, so I can always contact him directly (thanks Eter!).

Half a day more to go. I’m still struggling to decide whether I should go to Julian Dyke‘s presentation on memory (I know Christo did his this morning, but it was way too early for me) or to the session on RAC Server Pools by Bob Mycroft (somehow his name makes me think of Windows – is it just me?).

Oh… I completely forgot to mention that the highlight of Monday night was Doug Burns’ shaving ceremony, finalized at Tap & Spile. I captured some videos, but they need some post-processing before I can publish them. Another highlight was watching photos of previous UKOUG conferences I attended, and I specifically liked one photo that was not supposed to be there! It won’t make sense to you, my dear reader, but please forgive me and ignore it — it’s meant for only one of you. ;-)

Pythian at UKOUG Technology and E-Business Suite Conference 2010

Hello Birmingham!

It’s past Sunday midnight, and I’ve spent the last couple of hours stuck in my room finishing the slides for my masterclass tomorrow. It turns out I’m presenting the very first session of the conference at 9am. I wish there were a keynote instead, so that I could grab one more hour of sleep (it’s going to be deep into the night back home in Canada). Strange that the keynote was moved to Wednesday — I hope UKOUG has a really good reason for that!

My two-hour masterclass will start at the same time as Tom Kyte’s a-la-keynote session — what competition. On the other hand, there are no other sessions in server technology, so I expect that folks with no interest in database development will automatically end up in my session. I’m in Hall 5 – quite a large room. Is it the second-biggest room after Hall 1?

I will need to work hard to keep the audience… maybe I shouldn’t plan any breaks, to make sure I don’t let folks slip out to the next sessions, like James Morle’s Sane SAN 2010 or Jeremy Schneider’s Large Scale ASM.

My masterclass is based on the slides that I presented at Oracle OpenWorld a few months ago, which, in turn, is a reworked session on Oracle Clusterware internals that I’ve done a number of times as a long session with demos. I thought updating this material to 11gR2 would be easy… Boy, was I wrong!

11gR2 Grid Infrastructure has changed so much that it took me much, much longer to get something sensible ready. I also had to limit the scope a bit, as Grid Infrastructure has become so much more complex than the older, pre-11gR2 Clusterware. (Stop complaining, Alex!)

Anyway, everything is ready now and the demos look reasonable. It will be a bit rough doing it for the first time – I’m sure I’ll stumble a few times, but fingers crossed we get to the end on time. I actually hope to finish early and allocate a bit more time for Q&A and potential ad-hoc demos at the end. But enough about me…

Who from Pythian is at the UKOUG conference this year? In addition to myself, it’s Christo Kutrovsky, Daniel Fink, Paul Vallee and Andrew Poodle. Christo, Dan and I are presenting, Andrew is helping organize the MySQL track as a MySQL SIG Chair, and Paul… well, I’d say Paul is a slacker, so he is covering the beer tap to pay up! :)

It’s close to 2am – gotta get some sleep before tomorrow. A few words against the Jurys Inn Hotel this year. It’s the first year I’m having so much trouble here, including no early check-in, phones that don’t work, no internet in two rooms (I had to switch twice!), and somewhat unfriendly staff. Has the hotel management changed since last year, or what? I think I’ll consider another hotel next time.

Oh… and it is indeed bloody cold here! So cold that it seems to impact the number of girls-who-forgot-their-skirts-at-home on Broad Street. This unusually cold weather is also impacting the travel plans of other conference speakers and attendees. Doug Burns seems to have been delayed for about a day and barely made it in time to have a pint at Tap & Spile – I wish I could have accompanied the crowd there until late, but thanks to the awesome schedule (and the unfinished state of my presentation, to be fair) I had to miss some of the fun.

PS: I have another session on Tuesday — Analysis of Oracle ASM Failability (it should be “Fallibility”, I guess, but I’ll keep it misspelled simply because I can!). If anybody wants to catch up for any reason (like buying me a beer), text me at +1 613 219 7031. My iPhone doesn’t work with data plans here for some unknown reason, so no twitter/email on the go.

Oracle Exadata Database Machine v2 vs x2-2 vs x2-8 Deathmatch

This post was updated live from Oracle OpenWorld as I learned what’s new. Last update done on 28-Sep-2010.

Oracle Exadata v2 has been transformed into the x2-2 and x2-8. The x2-2 is just slightly updated, while the x2-8 is a much more high-end platform. Please note that the Exadata x2-2 is not just the old Exadata v2 — it’s a fully refreshed model. There is huge confusion about this here at OOW, and even on the Oracle web site.

The new Exadata price list has been released, and the Exadata x2-2 costs exactly the same as the old Exadata v2. The Exadata x2-8 Full Rack (the only x2-8 configuration — see below why) is priced 50% higher than the Full Rack x2-2. To clarify the confusion: this is the hardware price only (updated 18-Oct-2010).

Exadata Storage Server Software pricing is the same, and licensing costs per storage server and per full rack are the same as for Exadata v2, because the number of disks didn’t change. Note that the storage cells were upgraded but are priced the same when it comes to the Exadata Storage Server software and hardware. Nice touch, but see the implications for database licensing below.

This comparison covers the Full-Rack models of Exadata x2-2 and x2-8 and the existing v2 model.

Finally, data-sheets are available for both x2-2 (Thx Dan Norris for the pointers):

http://www.oracle.com/technetwork/database/exadata/dbmachine-x2-2-datash...

and x2-8:

http://www.oracle.com/technetwork/database/exadata/dbmachine-x2-8-datash...

It means that live update of this post is probably over (27-Sep-2010).

| | v2 Full Rack | x2-2 Full Rack | x2-8 Full Rack |
|---|---|---|---|
| Database servers | 8 x Sun Fire X4170 1U | 8 x Sun Fire X4170 M2 1U | 2 x Sun Fire X4800 5U |
| Database CPUs | Xeon E5540 quad-core 2.53GHz | Xeon X5670 six-core 2.93GHz | Xeon X7560 eight-core 2.26GHz |
| Database cores | 64 | 96 | 128 |
| Database RAM | 576GB | 768GB | 2TB |
| Storage cells | 14 x Sun Fire X4275 | 14 x Sun Fire X4270 M2 | 14 x Sun Fire X4270 M2 |
| Storage cell CPUs | Xeon E5540 quad-core 2.53GHz | Xeon L5640 six-core 2.26GHz | Xeon L5640 six-core 2.26GHz |
| Storage cell CPU cores | 112 | 168 | 168 |
| IO performance & capacity | 600GB 15K RPM SAS or 2TB 7.2K RPM SATA disks | 600GB 15K RPM SAS (HP model – high performance) or 2TB 7.2K RPM SAS disks (HC model – high capacity)* | 600GB 15K RPM SAS (HP model – high performance) or 2TB 7.2K RPM SAS disks (HC model – high capacity)* |
| Flash Cache | 5.3TB | 5.3TB | 5.3TB |
| Database server networking | 4 x 1GbE x 8 servers = 32 x 1GbE | 4 x 1GbE + 2 x 10GbE per server x 8 servers = 32 x 1GbE + 16 x 10GbE | 8 x 1GbE + 8 x 10GbE per server x 2 servers = 16 x 1GbE + 16 x 10GbE |
| InfiniBand switches | QDR 40Gbit/s wire | QDR 40Gbit/s wire | QDR 40Gbit/s wire |
| InfiniBand ports on database servers (total) | 2 ports x 8 servers = 16 | 2 ports x 8 servers = 16 | 8 ports x 2 servers = 16 |
| Database server OS | Oracle Linux only | Oracle Linux (possibly Solaris later, still unclear) | Oracle Linux or Solaris x86 |

* Note that the 2TB SAS drives are the same old 2TB drives with new SAS electronics. (Thanks Kevin Closson for the reference.)
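The core counts in the table follow directly from server count × sockets × cores per socket. A quick sanity check (the socket counts per server model are my assumption from the published configurations, not from the table itself):

```python
def total_cores(servers, sockets, cores_per_socket):
    """Total CPU cores across a tier of identical servers."""
    return servers * sockets * cores_per_socket

# Database tier
assert total_cores(8, 2, 4) == 64    # v2: 8 x X4170, 2 x quad-core E5540 each
assert total_cores(8, 2, 6) == 96    # x2-2: 8 x X4170 M2, 2 x six-core X5670 each
assert total_cores(2, 8, 8) == 128   # x2-8: 2 x X4800, 8 x eight-core X7560 each

# Storage tier
assert total_cores(14, 2, 4) == 112  # v2: 14 x X4275, 2 x quad-core E5540 each
assert total_cores(14, 2, 6) == 168  # x2-2/x2-8: 14 x X4270 M2, 2 x six-core L5640 each

print("all core counts consistent with the table")
```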


The x2-8 has fewer but much bigger database servers, which means the x2-8 will scale better, with less RAC overhead for the databases. The bad news is that if one database server fails or is down for maintenance, 50% of the capacity is gone. What does that mean? It means that Exadata x2-8 is designed more for multi-rack deployments, so that you can go beyond a “simple” 2-node RAC. Some folks argue that two-node RAC is less reliable because of evictions and the like, but what you may not know is that Exadata has a special IO fencing mechanism that makes it much more reliable.

Because there is 4 times more RAM in the Exadata x2-8, more and more operations can be done fully in memory without even going to the storage cells. This is why the boost in core count and CPU performance is important — since InfiniBand bandwidth stays the same, you need some other way to feed more data, so keeping more data in the buffer cache will keep more CPU cores busy.

With the Exadata x2-2, processing capacity on the database servers increased while the RAM increase is insignificant. So how does this affect the “well-balanced” Exadata v2 design? Well, if more and more operations are offloaded to the storage cells, then the database servers get more “useful” data pumped in over InfiniBand and actually spend CPU cycles processing the data rather than filtering it. With Exadata v2, depending on the compression level, CPU was often a bottleneck on data loads, so some extra CPU capacity on the database tier won’t hurt.

The old v2 configuration will no longer be available, so be ready to spend more on Oracle database licenses unless you are licensed under a ULA or something similar.

Both Exadata x2-8 and x2-2 will run the updated Oracle Linux 5.5 with the Oracle Enterprise Kernel. The x2-8 can also run Solaris x86 on the database servers, as expected. This confirms my assumption that adding Solaris x86 to Exadata would prove Oracle is fully committed to the Solaris operating system. Rather pleasant news for me! However, Solaris 11 Express is not available right now and will probably ship towards the end of this calendar year.

If you look at the x2-2 and x2-8 side by side physically, you will see that four 1U database servers of the x2-2 are essentially replaced by one 5U database server in the x2-8 in terms of rack space. There are also more internal disks and more power supplies in those bigger servers, so they are more redundant.

More processing power on the storage servers in the x2-8 and x2-2 (not dramatically more, but definitely noticeable) will speed up smart scans on data compressed at high levels. As more and more operations can be offloaded to the storage cells, the boost in CPU capacity there is quite handy. Note that this doesn’t affect licensing in any way — Exadata Storage Server Software uses the number of physical disk spindles as its licensing metric.

Regarding the claims of full database encryption — I need to understand how it works and what the improvements are. Oracle Transparent Data Encryption was available on Exadata v2 but had many limitations when used with other Exadata features. I assume that the Exadata x2-x addresses those, but I need to follow up on the details, so stay tuned. I believe that customers of Exadata v2 will be able to take advantage of all the new Exadata software features – the platform architecture hasn’t changed.

Liveblogging: Oracle OpenWorld 2010 Sunday Keynote (Exalogic)

Liveblogging announcements from Sunday’s Oracle OpenWorld Keynote.

It’s 5:36 PM now – stay tuned…

@fuadar: Exadata smoothie and java juice in moscone south #oow10

5:44pm: Larry couldn’t get his boat under the Golden Gate Bridge — next year he needs a smaller boat, or to rebuild the bridge? :)

5:50pm: Oracle Partners Specialization awards… oh well, why is Pythian not on stage with our 4 Specializations? :(

5:51pm: Wow… Ann Livermore, EVP of HP, is on stage… about the HP-Oracle partnership… I don’t suppose she will talk about Mark Hurd. :)

@gvwoods 40% of Oracle on HP

5:58pm: I was all pumped for Larry and getting bored now… come on already!

6:02pm: Hm… while HP is focused on services, I think Oracle’s strategy is to leverage partners for that. HP is pitching a completely different approach than Oracle… and HP is talking about the software they have… HP (a h/w company) talks about their software at Oracle’s event (HP’s s/w partner)? Weird… Completely misaligned messaging!

06:07pm: @alexgorbachev: NOT INTERESTED in HP cloud solutions… audience is not even applauding – I hear snoring around… Give us Exalogic already!

06:07pm: Very interesting slides about HP storage – X9000 IBRIX (iBrick?) Indeed, NAS rocks for manageability

06:22pm: OK… pumping up again… I won’t be able to do it more than three times in a day! (my first pumped up state was at my presentation)

@paulvallee: KIIIIIIIILLLLLLLLLL MEEEEEEEEEEEE #oow10

06:24pm: Damn… they did it again :( I was just getting excited… I wonder if there is any time left to announce anything. Is Larry sleeping or late by any chance?

06:28pm: @paulvallee: RT @DarylOrts: #oow10. 41,000 attendees: 36,236 are currently asleep. Thanks #hp.

06:36pm: Don’t know if I can get excited again… Trying really hard now… I think I managed – pumped up!

06:41pm: @oracleopenworld: OK, sorry for the false start, but here we go now #oow10 – Larry intro video and keynote NOW

@alexgorbachev: @oracleopenworld false starts like that can cause losing a race! #oow10

06:45pm: Larry is out…

Larry clarifies what cloud computing is according to Oracle. He calls SalesForce.com “old SaaS technology” and Amazon EC2 — “innovative”.

06:51pm: @RoelH: @paulvallee Exalogic is on the machine in Larry’s back. #oow10

So Oracle’s definition of cloud computing is pretty much what Amazon does.
Heh… I think Larry just stole the slides from my presentation on Thursday!

06:54pm: Finally, Exalogic Elastic Compute cloud:
* Virtualization
* InfiniBand 40Gbit – so as expected no InfiniBand upgrade
* High performance storage
* 30 servers in “the box” (he calls it a box!)
* 360 cores (12 cores per server – I’m sure that’s 2 x 6 cores CPUs – expect Exadata v3 database server to use the same)
* Super simple patching – yes we like it!
* Guest OS’s – Linux and Solaris x86 (yay – I knew that)
* Apps hosted – WebLogic, Coherence, JRockit
* Virtualization is Oracle VM

Exalogic – Speed, Utility, Availability, Scalable, Manageable, Secure

Exalogic delivers 1 million HTTP requests per second.

* 2.8TB DRAM
* 960GB solid state disks
* 1.2 microsecond latency
* 10GbE connection to the data center
* 40TB SAS disk storage
* 4TB read cache
* 72GB write cache

Tech geekery: “Looks like WebLogic has now node affinity working via UCP (instead of JDBC drivers) connecting to Oracle RAC – it can keep same web connection on the same RAC node.”

Exalogic will consolidate all the apps that Oracle delivers (I guess as long as they run on Linux or Solaris x86).

According to Larry, 1 Exadata rack and 1 Exalogic rack can run the whole of Facebook. I have trouble believing this, but it’s a nicely bold comparison.

You know what… that’s enough – off to the ACED dinner – need to be at Pier 40 by 8pm.