Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Oracle 11g internals part 1: Automatic Memory Management

This is my attempt for getting cheap popularity out of recent Oracle 11g release. This is not going to be another Oracle 11g new features list, I’ll be just posting any of my research findings here, in a semi-organized way.

The first post is is about Automatic Memory Management. AMM manages all SGA + PGA memory together, allowing it to shift memory from SGA to PGAs and vice versa. You only need to set a MEMORY_TARGET (and if you like, MEMORY_MAX_TARGET parameter).

You can read rest of the general details from documentation, I will talk about how this feature has been implemented on OSD / OS level (or at least how it looks to be implemented).

When I heard about MEMORY_TARGET , then the first question that came into my mind was that how can Oracle shift shared SGA memory to private PGA memory on Unix? This would mean somehow deallocating space from existing SGA shared memory segment and releasing it for PGA use. To my knowledge the traditional SysV SHM interface is not that flexible that it could downsize and release memory from a single shared memory segment. So I started checking out how Oracle had implemented this.

Oracle 11g internals part 1: Automatic Memory Management

This is my attempt for getting cheap popularity out of recent Oracle 11g release. This is not going to be another Oracle 11g new features list, I’ll be just posting any of my research findings here, in a semi-organized way.

The first post is is about Automatic Memory Management. AMM manages all SGA + PGA memory together, allowing it to shift memory from SGA to PGAs and vice versa. You only need to set a MEMORY_TARGET (and if you like, MEMORY_MAX_TARGET parameter).

You can read rest of the general details from documentation, I will talk about how this feature has been implemented on OSD / OS level (or at least how it looks to be implemented).

When I heard about MEMORY_TARGET , then the first question that came into my mind was that how can Oracle shift shared SGA memory to private PGA memory on Unix? This would mean somehow deallocating space from existing SGA shared memory segment and releasing it for PGA use. To my knowledge the traditional SysV SHM interface is not that flexible that it could downsize and release memory from a single shared memory segment. So I started checking out how Oracle had implemented this.

Oracle Session Snapper – real-time session-level performance stats for DBAs

A post by Jonathan Lewis inspired me to finally complete my version of the Oracle session performance snapper script, which main characteristics are

  1. it reports Oracle session level performance counter and wait information in real time
  2. it does NOT require any database objects to be created

If you are a DBA or consultant working on ad-hoc performance issues, you will like it!

Are you familiar with following situation?

(Monday morning)

Customer: Hey, we need your help! We have serious performance issues in our production environment. It’s a database with x000 online users, several parallel data feeds, continuous batch jobs and reporting going on.

Users have started experiencing occasional bad performance and some batch jobs as well.

You: Ok, lets see it. Do you have any performance monitoring tools installed?

Customer: Yes, we have Xxxxx Xxxxxxxxx installed which shows us a nice green or red light on big screen depending whether there are any problems or not.

You: What color is it showing now?

Customer: Green

You: But you still have the performance problem you described earlier?

Customer: Yes

You: Do you have ANY performance monitoring tools installed?

Customer: Well, we also have a statspack snapshot taken every morning and evening to capture the business workload.

You: Um… Ok… you know forget about it. Can I create a small package to capture some useful performance info on the problematic sessions?

Customer: Yes, but we need to put the DDL scripts through the change review board which gathers every Thursday… but we can’t wait that long!

You: Can we enable tracing?

Customer: Enabling tracing is a change and all changes must go through review board. Also, we don’t really know which exact sessions are affected, it just happens for seemingly random ones… and we can’t just trace every session, can we?

You: Ok, give me a sqlplus window and Excel, we’ll figure something out.

…And now follows a tedious manual SQL execution and copy & paste exercise from various V$ views for getting some meaningful performance information out of Oracle.

Well, not anymore, because The Oracle Session Snapper is in town!

If you know vmstat for Unix, you know it reports you various system level statistic counter deltas over a period you choose.

Well, the Oracle Session Snapper output looks somewhat similar, but it reports you session level deltas of Oracle v$sesstat counters, wait events and starting from 10g also session time model statistics.

All info can be reported in real time, without a need for running and timing multiple SQL scripts and manual calculation of deltas.

And the key unique point of the Session Snapper is – it does not require creation of any database objects, thus no changes in the database at all! Everything is done from within a sqlplus script or anonymous PL/SQL block.

This means that you will be able to get quick session-level performance snapshots even in heavily change-controlled environments, where no object creation whatsoever is allowed without going through a long process.

You can read all the usage details from the script header, but here’s one example of its output:

Tanel@Sol01> @snapper out 1 3 475

-- Session Snapper v1.03 by Tanel Poder (  )

--------------------------------------------------------------------------------------------------------------------------------------------
--        SID, SNAPSHOT START   , SECONDS  , TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,   HD/SEC
--------------------------------------------------------------------------------------------------------------------------------------------
DATA,     475, 20070820 01:17:47,         1, STAT, session logical reads                   ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets                         ,         88233,      88233,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets from cache              ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, calls to get snapshot scn: kcmgss       ,           556,        556,        556       556
DATA,     475, 20070820 01:17:47,         1, STAT, no work - consistent read gets          ,         87677,      87677,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, table scans (short tables)              ,           139,        139,        139       139
DATA,     475, 20070820 01:17:47,         1, STAT, table scan rows gotten                  ,       7429598,    7429598,      7.43M     7.43M
DATA,     475, 20070820 01:17:47,         1, STAT, table scan blocks gotten                ,         87676,      87676,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, buffer is pinned count                  ,           138,        138,        138       138
--  End of snap 1
DATA,     475, 20070820 01:17:48,         1, STAT, session logical reads                   ,         87779,      87779,     87.78k    87.78k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets                         ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets from cache              ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:48,         1, STAT, no work - consistent read gets          ,         87210,      87210,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:48,         1, STAT, table scan rows gotten                  ,       7389897,    7389897,      7.39M     7.39M
DATA,     475, 20070820 01:17:48,         1, STAT, table scan blocks gotten                ,         87211,      87211,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, buffer is pinned count                  ,           136,        136,        136       136
--  End of snap 2
DATA,     475, 20070820 01:17:49,         1, STAT, session logical reads                   ,         87580,      87580,     87.58k    87.58k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets                         ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets from cache              ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:49,         1, STAT, no work - consistent read gets          ,         87046,      87046,     87.05k    87.05k
DATA,     475, 20070820 01:17:49,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:49,         1, STAT, table scan rows gotten                  ,       7375781,    7375781,      7.38M     7.38M
DATA,     475, 20070820 01:17:49,         1, STAT, table scan blocks gotten                ,         87041,      87041,     87.04k    87.04k
DATA,     475, 20070820 01:17:49,         1, STAT, buffer is pinned count                  ,           137,        137,        137       137
--  End of snap 3

PL/SQL procedure successfully completed.

The output contains 3 x 1 second snapshot of session 475 doing heavy nested looping. Note that even though the CPU time used was not updated, the logical IO counts for that session had still increased.

So this tool can be very valuable diagnosing what’s going on when the session seems to be 100% on CPU doing something.

Oracle Session Snapper – real-time session-level performance stats for DBAs

A post by Jonathan Lewis inspired me to finally complete my version of the Oracle session performance snapper script, which main characteristics are

  1. it reports Oracle session level performance counter and wait information in real time
  2. it does NOT require any database objects to be created

If you are a DBA or consultant working on ad-hoc performance issues, you will like it!

Are you familiar with following situation?

(Monday morning)

Customer: Hey, we need your help! We have serious performance issues in our production environment. It’s a database with x000 online users, several parallel data feeds, continuous batch jobs and reporting going on.

Users have started experiencing occasional bad performance and some batch jobs as well.

You: Ok, lets see it. Do you have any performance monitoring tools installed?

Customer: Yes, we have Xxxxx Xxxxxxxxx installed which shows us a nice green or red light on big screen depending whether there are any problems or not.

You: What color is it showing now?

Customer: Green

You: But you still have the performance problem you described earlier?

Customer: Yes

You: Do you have ANY performance monitoring tools installed?

Customer: Well, we also have a statspack snapshot taken every morning and evening to capture the business workload.

You: Um… Ok… you know forget about it. Can I create a small package to capture some useful performance info on the problematic sessions?

Customer: Yes, but we need to put the DDL scripts through the change review board which gathers every Thursday… but we can’t wait that long!

You: Can we enable tracing?

Customer: Enabling tracing is a change and all changes must go through review board. Also, we don’t really know which exact sessions are affected, it just happens for seemingly random ones… and we can’t just trace every session, can we?

You: Ok, give me a sqlplus window and Excel, we’ll figure something out.

…And now follows a tedious manual SQL execution and copy & paste exercise from various V$ views for getting some meaningful performance information out of Oracle.

Well, not anymore, because The Oracle Session Snapper is in town!

If you know vmstat for Unix, you know it reports you various system level statistic counter deltas over a period you choose.

Well, the Oracle Session Snapper output looks somewhat similar, but it reports you session level deltas of Oracle v$sesstat counters, wait events and starting from 10g also session time model statistics.

All info can be reported in real time, without a need for running and timing multiple SQL scripts and manual calculation of deltas.

And the key unique point of the Session Snapper is – it does not require creation of any database objects, thus no changes in the database at all! Everything is done from within a sqlplus script or anonymous PL/SQL block.

This means that you will be able to get quick session-level performance snapshots even in heavily change-controlled environments, where no object creation whatsoever is allowed without going through a long process.

You can read all the usage details from the script header, but here’s one example of its output:

Tanel@Sol01> @snapper out 1 3 475

-- Session Snapper v1.03 by Tanel Poder (  )

--------------------------------------------------------------------------------------------------------------------------------------------
--        SID, SNAPSHOT START   , SECONDS  , TYPE, STATISTIC                               ,         DELTA,      D/SEC,     HDELTA,   HD/SEC
--------------------------------------------------------------------------------------------------------------------------------------------
DATA,     475, 20070820 01:17:47,         1, STAT, session logical reads                   ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets                         ,         88233,      88233,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, consistent gets from cache              ,         88232,      88232,     88.23k    88.23k
DATA,     475, 20070820 01:17:47,         1, STAT, calls to get snapshot scn: kcmgss       ,           556,        556,        556       556
DATA,     475, 20070820 01:17:47,         1, STAT, no work - consistent read gets          ,         87677,      87677,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, table scans (short tables)              ,           139,        139,        139       139
DATA,     475, 20070820 01:17:47,         1, STAT, table scan rows gotten                  ,       7429598,    7429598,      7.43M     7.43M
DATA,     475, 20070820 01:17:47,         1, STAT, table scan blocks gotten                ,         87676,      87676,     87.68k    87.68k
DATA,     475, 20070820 01:17:47,         1, STAT, buffer is pinned count                  ,           138,        138,        138       138
--  End of snap 1
DATA,     475, 20070820 01:17:48,         1, STAT, session logical reads                   ,         87779,      87779,     87.78k    87.78k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets                         ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, consistent gets from cache              ,         87772,      87772,     87.77k    87.77k
DATA,     475, 20070820 01:17:48,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:48,         1, STAT, no work - consistent read gets          ,         87210,      87210,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:48,         1, STAT, table scan rows gotten                  ,       7389897,    7389897,      7.39M     7.39M
DATA,     475, 20070820 01:17:48,         1, STAT, table scan blocks gotten                ,         87211,      87211,     87.21k    87.21k
DATA,     475, 20070820 01:17:48,         1, STAT, buffer is pinned count                  ,           136,        136,        136       136
--  End of snap 2
DATA,     475, 20070820 01:17:49,         1, STAT, session logical reads                   ,         87580,      87580,     87.58k    87.58k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets                         ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, consistent gets from cache              ,         87587,      87587,     87.59k    87.59k
DATA,     475, 20070820 01:17:49,         1, STAT, calls to get snapshot scn: kcmgss       ,           552,        552,        552       552
DATA,     475, 20070820 01:17:49,         1, STAT, no work - consistent read gets          ,         87046,      87046,     87.05k    87.05k
DATA,     475, 20070820 01:17:49,         1, STAT, table scans (short tables)              ,           138,        138,        138       138
DATA,     475, 20070820 01:17:49,         1, STAT, table scan rows gotten                  ,       7375781,    7375781,      7.38M     7.38M
DATA,     475, 20070820 01:17:49,         1, STAT, table scan blocks gotten                ,         87041,      87041,     87.04k    87.04k
DATA,     475, 20070820 01:17:49,         1, STAT, buffer is pinned count                  ,           137,        137,        137       137
--  End of snap 3

PL/SQL procedure successfully completed.

The output contains 3 x 1 second snapshot of session 475 doing heavy nested looping. Note that even though the CPU time used was not updated, the logical IO counts for that session had still increased.

So this tool can be very valuable diagnosing what’s going on when the session seems to be 100% on CPU doing something.

Sqlplus is my second home, Part 1: HTMLizing your sqlplus output

I have not managed to post anything for a while, but I intend to make it up by starting this series of posts made specially for Oracle enthusiasts, geeks and maniacs among us. Here I plan to post the coolest Oracle stuff I’ve just found out and some of it may actually be useful to you!

Lets start. This post is about removing the last major problem with sqlplus in everyday database and application administration work.

I would say the commnd line sqlplus, combined with its script execution and Windows cmd.exe’s command history navigation capabilities, is a very powerful and fast tool for database administration and troubleshooting. This is of course if you use a set of database administration scripts, either downloaded from some reliable source or accumulated over the years of working with Oracle ( you do have such scripts, right? ;-)

Now to the main weakness of sqlplus: I may have scripts carefully formatted for my screen size, however when adding more columns, I run out of screen width. When working on an application data quality troubleshooting task, I need to run some quick ad-hoc queries. Or run a query which just returns lots of data.

What usually happens in such cases is illustrated very well with the output of following query:

Sqlplus is my second home, Part 1: HTMLizing your sqlplus output

I have not managed to post anything for a while, but I intend to make it up by starting this series of posts made specially for Oracle enthusiasts, geeks and maniacs among us. Here I plan to post the coolest Oracle stuff I’ve just found out and some of it may actually be useful to you!

Lets start. This post is about removing the last major problem with sqlplus in everyday database and application administration work.

I would say the commnd line sqlplus, combined with its script execution and Windows cmd.exe’s command history navigation capabilities, is a very powerful and fast tool for database administration and troubleshooting. This is of course if you use a set of database administration scripts, either downloaded from some reliable source or accumulated over the years of working with Oracle ( you do have such scripts, right? ;-)

Now to the main weakness of sqlplus: I may have scripts carefully formatted for my screen size, however when adding more columns, I run out of screen width. When working on an application data quality troubleshooting task, I need to run some quick ad-hoc queries. Or run a query which just returns lots of data.

What usually happens in such cases is illustrated very well with the output of following query:

Session-level statspack

Statspack is a useful tool for easily gathering and reporting some Oracle’s historical workload statistics. However it has its limitations and problems:

One of them that in past it used to record only Oracle statistics, measured from inside Oracle. This made Oracle and people using Oracle ignore other crucial statistics like OS workload below Oracle. For example statspack may show you heavy latch contention as the performance problem, while looking at vmstat output one would see that the server just has been heavily overloaded with numerous other jobs (like multiple backup, export and compress jobs overrunning their run-window) and the latch contention is just a symptom of CPU starvation.

This has been somewhat addressed by adding V$OSSTAT to Oracle 10g and statspack now gathers this info as well. Now we need people to start looking into those stats.

Another problem with statspack is which is described in Dan Fink’s blog, about cursor-level CPU accounting limitations before 10gR2.

However neither of those problems are the main limitations of statspack. The main problem is that statspack samples it’s statistic snapshots from system level views like V$SYSSTAT and V$SYSTEM_EVENT. If every single session in your database is performing exactly the same (kind) of operation, then system level aggregates might be quite OK to get an overview what’s going on in the database. But from the moment you start having different kind of sessions (e.g. OLTP vs batch vs data feed vs report) in your database, then system level aggregates used by statspack do heavily distort what the reality looks like for specific sessions.

There’s another gotcha – by default statspack excludes the “SQL*Net message from client” wait event from main report section, stating that it is not important to look into idle events. However by that we will throw away an important piece of information when diagnosing end-to-end performance, the end user experience. The bad thing is that even if we did include the SQL*Net wait event in our reports (by deleting the corresponding row from STATS$IDLE_EVENT), then we would still have no idea how many of those gazillions of system-wide SQL*Net client wait seconds do belong to our specific session or user.

Statspack doesn’t measure session-experience, it just gives you a system-wide aggregate, which cannot be translated back to individual session statistics (just as you can not convert a hash value back to original value – most of the information is just lost!).

How to get session-level detailed overview of database performance, with historical reporting capability then?

Oracle 10g addresses this partially with ASH and AWR and DBMS_MONITOR’s selective statistic sampling (Search for V$SERV_MOD_ACT_STATS).

While ASH and AWR are both awesome products, they are not usable in Standard Edition nor before 10g ( you may want to check out Kyle Hailey’s ASH simulator instead ) and for using them you need to pay a separate license fee.

So let me introduce my attempt to solve those problems ( NB! Free stuff!!! ;):

  • The poor-man’s version of Automatic Workload Repository: It’s called… um… Semi-Automatic Workload Repository :)
  • The clever-man’s addition to statspack, which takes session-level snapshots: In other words, Sesspack.

The idea is very simple, just take snapshots of V$SESSION_EVENT and some V$SESSTAT (and V$SESS_TIME_MODEL in 10g) statistics and store those in a repository just like statspack does. Snapshots are taken using sesspack.snap_xyz procedures. As a parameter to these procedures I can pass a specific SID, a group of SIDs, a Oracle db username a OS-username or whatever filtering fields can be found from V$SESSION.

Once you have taken your snapshots when running the workload, you can run a report which calculates statistic and wait event deltas between snapshots and that it.

A demo?

Session-level statspack

Statspack is a useful tool for easily gathering and reporting some Oracle’s historical workload statistics. However it has its limitations and problems:

One of them that in past it used to record only Oracle statistics, measured from inside Oracle. This made Oracle and people using Oracle ignore other crucial statistics like OS workload below Oracle. For example statspack may show you heavy latch contention as the performance problem, while looking at vmstat output one would see that the server just has been heavily overloaded with numerous other jobs (like multiple backup, export and compress jobs overrunning their run-window) and the latch contention is just a symptom of CPU starvation.

This has been somewhat addressed by adding V$OSSTAT to Oracle 10g and statspack now gathers this info as well. Now we need people to start looking into those stats.

Another problem with statspack is which is described in Dan Fink’s blog, about cursor-level CPU accounting limitations before 10gR2.

However neither of those problems are the main limitations of statspack. The main problem is that statspack samples it’s statistic snapshots from system level views like V$SYSSTAT and V$SYSTEM_EVENT. If every single session in your database is performing exactly the same (kind) of operation, then system level aggregates might be quite OK to get an overview what’s going on in the database. But from the moment you start having different kind of sessions (e.g. OLTP vs batch vs data feed vs report) in your database, then system level aggregates used by statspack do heavily distort what the reality looks like for specific sessions.

There’s another gotcha – by default statspack excludes the “SQL*Net message from client” wait event from main report section, stating that it is not important to look into idle events. However by that we will throw away an important piece of information when diagnosing end-to-end performance, the end user experience. The bad thing is that even if we did include the SQL*Net wait event in our reports (by deleting the corresponding row from STATS$IDLE_EVENT), then we would still have no idea how many of those gazillions of system-wide SQL*Net client wait seconds do belong to our specific session or user.

Statspack doesn’t measure session-experience, it just gives you a system-wide aggregate, which cannot be translated back to individual session statistics (just as you can not convert a hash value back to original value – most of the information is just lost!).

How to get session-level detailed overview of database performance, with historical reporting capability then?

Oracle 10g addresses this partially with ASH and AWR and DBMS_MONITOR’s selective statistic sampling (Search for V$SERV_MOD_ACT_STATS).

While ASH and AWR are both awesome products, they are not usable in Standard Edition nor before 10g ( you may want to check out Kyle Hailey’s ASH simulator instead ) and for using them you need to pay a separate license fee.

So let me introduce my attempt to solve those problems ( NB! Free stuff!!! ;):

  • The poor-man’s version of Automatic Workload Repository: It’s called… um… Semi-Automatic Workload Repository :)
  • The clever-man’s addition to statspack, which takes session-level snapshots: In other words, Sesspack.

The idea is very simple, just take snapshots of V$SESSION_EVENT and some V$SESSTAT (and V$SESS_TIME_MODEL in 10g) statistics and store those in a repository just like statspack does. Snapshots are taken using sesspack.snap_xyz procedures. As a parameter to these procedures I can pass a specific SID, a group of SIDs, a Oracle db username a OS-username or whatever filtering fields can be found from V$SESSION.

Once you have taken your snapshots when running the workload, you can run a report which calculates statistic and wait event deltas between snapshots and that it.

A demo?

A gotcha with parallel index builds, parallel degree and query plans

Reading the following article about PARALLEL hint by Jonathan Lewis made me remember a somewhat related gotcha with parallelism.
Often when creating (or rebuilding) an index on a large table, doing it with PARALLEL x option makes it go faster – usually in case when your IO subsystem is not the bottleneck and you have enough spare CPU capacity to throw in.
A small example below:
Tanel@Sol01> create table t1 as select * from all_objects; Table created.

A gotcha with parallel index builds, parallel degree and query plans

Reading the following article about PARALLEL hint by Jonathan Lewis made me remember a somewhat related gotcha with parallelism.
Often when creating (or rebuilding) an index on a large table, doing it with PARALLEL x option makes it go faster – usually in case when your IO subsystem is not the bottleneck and you have enough spare CPU capacity to throw in.
A small example below:
Tanel@Sol01> create table t1 as select * from all_objects; Table created.