Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Sqlplus is my second home, part 3: Colored selections in Windows XP command prompt

Whenever delivering some Oracle training or running a demo at a conference, I’ve always liked to use the Windows command prompt version of sqlplus.

One reason of course is its easy command line history navigation capability ( press F7 in cmd.exe after entering few commands to see why ).

Another reason is that whenever I want to highlight some part of sqlplus output, I can just drag a selection rectangle around that text. In other words I can “mark” the text – drawing the attention there. Of course as the selection rectangle is really meant for copy & paste operations only, it has several limitations. It’s not persistent, whenever I continue typing, the text “marking” will disappear.

Windows XP has introduced a really cool feature to cmd.exe, which anyone doing presentations involving some command line tool output will appreciate!

Basically XP allows you to persistently select and color command prompt output!

An example of what I’m talking about is here:

Sqlplus is my second home, part 2: Running SQL scripts from remote locations using HTTP

As you probably already know, the Session Snapper has been designed to be a very easy-to-use performance tool. It is especially useful in database environments where there are no decent performance tools pre-installed and available.

Snapper doesn’t require any setup, all you need is to log on to the database using sqlplus and download snapper.sql script to your computer.

Well, actually the second part is not required, as Oracle sqlplus allows you to run scripts from http and ftp locations!

C:>sqlplus "sys/oracle@ora92 as sysdba"

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Aug 30 23:00:10 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

Connected to:
Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> @https://github.com/tanelpoder/tpt-oracle/blob/master/i.sql

Tanel's sqlplus http test...
https://github.com/tanelpoder/tpt-oracle/blob/master/i.sql

USER                           SYSDATE
------------------------------ ---------
SYS                            30-AUG-07

NAME
---------
ORA92

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
ora92            WINDOWS01

You should be *very* sure that noone can change the scripts on the server without your knowing!!!

SQL>

Cool stuff or what? :)

Let’s see how this relates to everyday DBA life…

Operating systems are lazy allocating memory

There was a discussion about whether Oracle really allocates all memory for SGA immediately on instance startup or not. And further, whether Oracle allocates memory beyond the SGA_TARET if SGA_MAX_SIZE is larger than it.

It’s worth reading this thread first: http://forums.oracle.com/forums/thread.jspa?threadID=535400&tstart=0

I will paste an edited version of my reply to here as well:

Advanced Oracle Troubleshooting Guide, Part 2: No magic is needed, systematic approach will do

There are two ways for diagnosing problems:

  1. Checking for usual suspects and hoping to find a matching one
  2. Following a systematic approach

Checking for usual suspects and hoping to find a matching one

The first approach relies on previous experience (both in particular subject area/technology and about the context/environment the problem occurs). For example if a patient comes to doctor complaining about pain in chest, then for doctor (and also for the patient) it would definitely be beneficial to know more relevant info about the patient – the context. If the patient had just fell off a 10-foot ladder, then it’d be more suitable to look for broken ribs. On the other hand, if the patient has been a long-time smoker and was watching TV on a couch when the pain started, then perhaps it’d be more suitable to start with an EKG (note that I’m not an expert on how human body works so should anybody complain about any pain in their chest to you, send them to real doctor immediately!)

Anyway, if you’ve been administering a database full time for last 5 years, you will probably know where to look immediately when a specific problem occurs. Note that I wrote “where to look immediately” here, not “what to change immediately”. Using previous experience to identify root causes of problems is obviously a perfectly valid approach which may get you to the solution very fast (again and again) – but it can work well only if you do have lots of previous experience in solving problems for that technology and that particular environment (in other words, you know the context). The big risk here is that if a “new” problem expresses itself in similar symptoms like the “old-and-well-known-problem”, we could easily end up looking for and fixing the wrong issue. And if that doesn’t work, continuing to try out a solution which helped with another problem last year. And if that doesn’t help then the cycle continues, we dig up even more unlikely fixes which have been useful once in past and we apply them. And then we resort to googling and trying out whatever solutions anyone has suggested for problems others have experienced.

We have ended up in Desperate Switch Flipping state. Usually this leads to flipping even larger switches, starting from Oracle session/instance parameters and adding random SQL hints up to restarting servers, upgrading databases, operating systems, hardware – usually without any luck. This wastes time, doesn’t solve our problem and may cause even more trouble. This is not good. Checking out the usual suspects may help solving common recurring problems (hey, shouldn’t good specialists avoid recurring problems rather than fixing them again and again?) . However it is very important to draw the line between checking for usual suspects and falling into DSF state, as looks like happened to someone at Oracle-L.

Quoting:

“I ran Statspack reports at the highest level of detail until I was blue in the face. I ran traces. I set events. But I also am by nature intuitive and tend often to use intuition to solve a problem with facts to back up my intuitive conclusion. So after providing all of this stuff to Oracle Support, they were at a loss, well, they were very eager to look at corruption as a cause, because they didn’t have another solution.”

(Btw, I think the DSF state would be a suitable addition to pathological DBA problems list which Gaja once started with CTD)

The other way to diagnose problems is to follow a systematic approach, an appropriate methodology based on knowledge how computers work.

Oracle Session Snapper, part 2: Getting most out of Snapper

The main design goal of Session Snapper was that it should not require any changes to be made into database.

And to achieve this goal, I was even willing to sacrifice some functionality.

So, for example there is no sorting capability in Snapper output. It would have been easy to create an SQL Type to database, use that as session statistics storage and query results out using an order by on statistics delta column – giving you (probably) most significant resource consumers first. But I didn’t do this as it would have violated the no-database-change-whatsoever design goal. (This problem could however be solved using manual sorting in PL/SQL code as done in Adrian Billington’s variation of runstats utility: http://www.oracle-developer.net/utilities.php )

The second design goal was that snapper should work with as little privileges as possible. So far it requires execute access only on DBMS_LOCK, DBMS_OUTPUT and read access on few V$ views.

However if you run the snapper using “out” option then it will use DBMS_OUTPUT.PUT_LINE for sending data back to the client. Unfortunately the client (SQLPlus in this case) calls DBMS_OUTPUT.GET_LINES to retrieve the output only when the previous database call has finished. Of course this makes sense, because otherwise some asynchronous call capability to Oracle server would be needed, which either has to interrupt the server process somehow for processing the GET_LINES, or another server process should be started for output feed.

Anyway, we don’t have such functionality in Oracle, so this means that if you run snapper in a loop, you will get the output only when the loop finishes and returns control to sqlplus. This is not good enough for having continuous real-time view of your session(s) performance.

Those who have looked into the header or sourcecode of my Session Snapper script have seen that if you use “trace” option and tail -f on that tracefile, you can get continuous real-time Snapper output to your screen (in addition to having that info saved in logfile). This functionality uses DBMS_SYSTEM.KSDWRT() procedure though. This means again that in order to get real-time output, you need to either grant execute priv on DBMS_SYSTEM to your monitoring user or run Snapper as SYS. Both those options violate the design goals of Snapper and may be unacceptable in real life production systems.

So, to combat that, I wrote a script which still gives us real-time Snapper output without need to access DBMS_SYSTEM.

Oracle 11g internals part 1: Automatic Memory Management

This is my attempt for getting cheap popularity out of recent Oracle 11g release. This is not going to be another Oracle 11g new features list, I’ll be just posting any of my research findings here, in a semi-organized way.

The first post is is about Automatic Memory Management. AMM manages all SGA + PGA memory together, allowing it to shift memory from SGA to PGAs and vice versa. You only need to set a MEMORY_TARGET (and if you like, MEMORY_MAX_TARGET parameter).

You can read rest of the general details from documentation, I will talk about how this feature has been implemented on OSD / OS level (or at least how it looks to be implemented).

When I heard about MEMORY_TARGET , then the first question that came into my mind was that how can Oracle shift shared SGA memory to private PGA memory on Unix? This would mean somehow deallocating space from existing SGA shared memory segment and releasing it for PGA use. To my knowledge the traditional SysV SHM interface is not that flexible that it could downsize and release memory from a single shared memory segment. So I started checking out how Oracle had implemented this.

Oracle Session Snapper – real-time session-level performance stats for DBAs

A post by Jonathan Lewis inspired me to finally complete my version of the Oracle session performance snapper script, which main characteristics are

  1. it reports Oracle session level performance counter and wait information in real time
  2. it does NOT require any database objects to be created

If you are a DBA or consultant working on ad-hoc performance issues, you will like it!

Are you familiar with following situation?

(Monday morning)

Customer: Hey, we need your help! We have serious performance issues in our production environment. It’s a database with x000 online users, several parallel data feeds, continuous batch jobs and reporting going on.

Users have started experiencing occasional bad performance and some batch jobs as well.

You: Ok, lets see it. Do you have any performance monitoring tools installed?

Customer: Yes, we have Xxxxx Xxxxxxxxx installed which shows us a nice green or red light on big screen depending whether there are any problems or not.

You: What color is it showing now?

Customer: Green

You: But you still have the performance problem you described earlier?

Customer: Yes

You: Do you have ANY performance monitoring tools installed?

Customer: Well, we also have a statspack snapshot taken every morning and evening to capture the business workload.

You: Um… Ok… you know forget about it. Can I create a small package to capture some useful performance info on the problematic sessions?

Customer: Yes, but we need to put the DDL scripts through the change review board which gathers every Thursday… but we can’t wait that long!

You: Can we enable tracing?

Customer: Enabling tracing is a change and all changes must go through review board. Also, we don’t really know which exact sessions are affected, it just happens for seemingly random ones… and we can’t just trace every session, can we?

You: Ok, give me a sqlplus window and Excel, we’ll figure something out.

…And now follows a tedious manual SQL execution and copy & paste exercise from various V$ views for getting some meaningful performance information out of Oracle.

Well, not anymore, because The Oracle Session Snapper is in town!

If you know vmstat for Unix, you know it reports you various system level statistic counter deltas over a period you choose.

Well, the Oracle Session Snapper output looks somewhat similar, but it reports you session level deltas of Oracle v$sesstat counters, wait events and starting from 10g also session time model statistics.

All info can be reported in real time, without a need for running and timing multiple SQL scripts and manual calculation of deltas.

And the key unique point of the Session Snapper is – it does not require creation of any database objects, thus no changes in the database at all! Everything is done from within a sqlplus script or anonymous PL/SQL block.

This means that you will be able to get quick session-level performance snapshots even in heavily change-controlled environments, where no object creation whatsoever is allowed without going through a long process.

You can read all the usage details from the script header, but here’s one example of its output:

Tanel@Sol01> @snapper out 1 3 475

-- Session Snapper v1.03 by Tanel Poder (  )

--------------------------------------------------------------------------------------------------------------------------------------------
--        SID, SNAPSHOT START   , SECONDS  , TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,   HD/SEC
--------------------------------------------------------------------------------------------------------------------------------------------
DATA,     475, 20070820 01:17:47,         1, STAT, session logical reads                   ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets                         ,         88233,      88233,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets from cache              ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, calls to get snapshot scn: kcmgss       ,           556,        556,        556       556
DATA,     475, 20070820 01:17:47,         1, STAT, no work - consistent read gets          ,         87677,      87677,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, table scans (short tables)              ,           139,        139,        139       139
DATA,     475, 20070820 01:17:47,         1, STAT, table scan rows gotten                  ,       7429598,    7429598,      7.43M     7.43M
DATA,     475, 20070820 01:17:47,         1, STAT, table scan blocks gotten                ,         87676,      87676,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, buffer is pinned count                  ,           138,        138,        138       138
--  End of snap 1
DATA,     475, 20070820 01:17:48,         1, STAT, session logical reads                   ,         87779,      87779,     87.78k    87.78k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets                         ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets from cache              ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:48,         1, STAT, no work - consistent read gets          ,         87210,      87210,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:48,         1, STAT, table scan rows gotten                  ,       7389897,    7389897,      7.39M     7.39M
DATA,     475, 20070820 01:17:48,         1, STAT, table scan blocks gotten                ,         87211,      87211,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, buffer is pinned count                  ,           136,        136,        136       136
--  End of snap 2
DATA,     475, 20070820 01:17:49,         1, STAT, session logical reads                   ,         87580,      87580,     87.58k    87.58k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets                         ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets from cache              ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:49,         1, STAT, no work - consistent read gets          ,         87046,      87046,     87.05k    87.05k
DATA,     475, 20070820 01:17:49,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:49,         1, STAT, table scan rows gotten                  ,       7375781,    7375781,      7.38M     7.38M
DATA,     475, 20070820 01:17:49,         1, STAT, table scan blocks gotten                ,         87041,      87041,     87.04k    87.04k
DATA,     475, 20070820 01:17:49,         1, STAT, buffer is pinned count                  ,           137,        137,        137       137
--  End of snap 3

PL/SQL procedure successfully completed.

The output contains 3 x 1 second snapshot of session 475 doing heavy nested looping. Note that even though the CPU time used was not updated, the logical IO counts for that session had still increased.

So this tool can be very valuable diagnosing what’s going on when the session seems to be 100% on CPU doing something.

Sqlplus is my second home, Part 1: HTMLizing your sqlplus output

I have not managed to post anything for a while, but I intend to make it up by starting this series of posts made specially for Oracle enthusiasts, geeks and maniacs among us. Here I plan to post the coolest Oracle stuff I’ve just found out and some of it may actually be useful to you!

Lets start. This post is about removing the last major problem with sqlplus in everyday database and application administration work.

I would say the commnd line sqlplus, combined with its script execution and Windows cmd.exe’s command history navigation capabilities, is a very powerful and fast tool for database administration and troubleshooting. This is of course if you use a set of database administration scripts, either downloaded from some reliable source or accumulated over the years of working with Oracle ( you do have such scripts, right? ;-)

Now to the main weakness of sqlplus: I may have scripts carefully formatted for my screen size, however when adding more columns, I run out of screen width. When working on an application data quality troubleshooting task, I need to run some quick ad-hoc queries. Or run a query which just returns lots of data.

What usually happens in such cases is illustrated very well with the output of following query:

Session-level statspack

Statspack is a useful tool for easily gathering and reporting some Oracle’s historical workload statistics. However it has its limitations and problems:

One of them that in past it used to record only Oracle statistics, measured from inside Oracle. This made Oracle and people using Oracle ignore other crucial statistics like OS workload below Oracle. For example statspack may show you heavy latch contention as the performance problem, while looking at vmstat output one would see that the server just has been heavily overloaded with numerous other jobs (like multiple backup, export and compress jobs overrunning their run-window) and the latch contention is just a symptom of CPU starvation.

This has been somewhat addressed by adding V$OSSTAT to Oracle 10g and statspack now gathers this info as well. Now we need people to start looking into those stats.

Another problem with statspack is which is described in Dan Fink’s blog, about cursor-level CPU accounting limitations before 10gR2.

However neither of those problems are the main limitations of statspack. The main problem is that statspack samples it’s statistic snapshots from system level views like V$SYSSTAT and V$SYSTEM_EVENT. If every single session in your database is performing exactly the same (kind) of operation, then system level aggregates might be quite OK to get an overview what’s going on in the database. But from the moment you start having different kind of sessions (e.g. OLTP vs batch vs data feed vs report) in your database, then system level aggregates used by statspack do heavily distort what the reality looks like for specific sessions.

There’s another gotcha – by default statspack excludes the “SQL*Net message from client” wait event from main report section, stating that it is not important to look into idle events. However by that we will throw away an important piece of information when diagnosing end-to-end performance, the end user experience. The bad thing is that even if we did include the SQL*Net wait event in our reports (by deleting the corresponding row from STATS$IDLE_EVENT), then we would still have no idea how many of those gazillions of system-wide SQL*Net client wait seconds do belong to our specific session or user.

Statspack doesn’t measure session-experience, it just gives you a system-wide aggregate, which cannot be translated back to individual session statistics (just as you can not convert a hash value back to original value – most of the information is just lost!).

How to get session-level detailed overview of database performance, with historical reporting capability then?

Oracle 10g addresses this partially with ASH and AWR and DBMS_MONITOR’s selective statistic sampling (Search for V$SERV_MOD_ACT_STATS).

While ASH and AWR are both awesome products, they are not usable in Standard Edition nor before 10g ( you may want to check out Kyle Hailey’s ASH simulator instead ) and for using them you need to pay a separate license fee.

So let me introduce my attempt to solve those problems ( NB! Free stuff!!! ;):

  • The poor-man’s version of Automatic Workload Repository: It’s called… um… Semi-Automatic Workload Repository :)
  • The clever-man’s addition to statspack, which takes session-level snapshots: In other words, Sesspack.

The idea is very simple, just take snapshots of V$SESSION_EVENT and some V$SESSTAT (and V$SESS_TIME_MODEL in 10g) statistics and store those in a repository just like statspack does. Snapshots are taken using sesspack.snap_xyz procedures. As a parameter to these procedures I can pass a specific SID, a group of SIDs, a Oracle db username a OS-username or whatever filtering fields can be found from V$SESSION.

Once you have taken your snapshots when running the workload, you can run a report which calculates statistic and wait event deltas between snapshots and that it.

A demo?

A gotcha with parallel index builds, parallel degree and query plans

Reading the following article about PARALLEL hint by Jonathan Lewis made me remember a somewhat related gotcha with parallelism.
Often when creating (or rebuilding) an index on a large table, doing it with PARALLEL x option makes it go faster – usually in case when your IO subsystem is not the bottleneck and you have enough spare CPU capacity to throw in.
A small example below:
Tanel@Sol01> create table t1 as select * from all_objects; Table created.