Oakies Blog Aggregator

Oracle Wait Event reference

Kyle Hailey has started putting together a much needed Oracle wait event reference.

You can access it here.

By the way, the Oracle documentation also has a wait event reference section; it has more events, but it’s less detailed…

I have plans to go deep into some wait events and cover some less common ones in tech.E2SN too… in the future ;-)

Scalar Subqueries

Better formatted at http://tinyurl.com/yfrjbwx
Query 2
The VST diagram looks like
There are two interesting things about this diagram.
Everything is OUTER JOINED to F_OUTER
There are correlated subqueries in the select

Let's take these in turn, starting with the effect of the OUTER JOINS. The OUTER JOINS can easily mean that all of the joins into F_OUTER don't change the result set size. Let's confirm by looking at the TTJ sizes:

The only thing that bounds the number of rows returned by F_OUTER is its self-join (the line going in and out of F_OUTER at the bottom left of F_OUTER).
What this means is that it doesn't matter what order we join tables into F_OUTER.
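To make the point concrete, here is a minimal sketch (the dimension table D and column d_id are made-up names, not from the query): an outer join preserves every row of the driving table, so as long as the join column is unique in the joined table, the count is unchanged.

-- Hypothetical illustration: D and d_id are invented names.
-- The outer join keeps every F_OUTER row; if d_id is unique in D,
-- the join cannot change the result set size.
SELECT COUNT(*) FROM f_outer;

SELECT COUNT(*)
FROM   f_outer f, d
WHERE  d.d_id (+) = f.d_id;
-- the two counts match when d.d_id is unique in D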
Now we can turn to the other interesting thing about the query: the 4 correlated subselects in the select clause. Queries in the select clause are called "scalar subqueries," and they can be a good idea or a bad idea depending mainly on how many distinct values are used as input into them. AFAIK, Oracle doesn't merge subqueries in the select, and certainly not the ones in this query because they are embedded in CASE statements - (looks like I might have spoken too soon! More research to do, but it looks like Oracle can merge these scalar subqueries even with the CASE statement. I will try to run some more tests. To be continued.)
In the worst case scenario the scalar subqueries are going to be run 845,012 times - that is one heck of a lot of work which would be a BAD IDEA.

Looking at the above diagram and the 4 scalar subqueries in the select clause, the top red number is how many times each scalar subquery will be run (based on the CASE statement in the select clause) and the orange highlight is how many distinct values will be used in the scalar subquery's where clause. P3 will benefit from scalar subquery caching, but F won't because there are too many distinct values. P1 and P2 could benefit too, if there are no collisions (see CBOF p216-217) and if scalar subquery caching actually supports 1024 values. (See http://www.oratechinfo.co.uk/scalar_subqueries.html#scalar3 for a great write-up on the analysis of scalar subquery caching - the analysis on that page seems to show that caching maxes out well before 1024, but that might be because of collisions.)
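One way to see scalar subquery caching in action (a minimal sketch; the package cnt_pkg and test table T with column x are made-up names) is to count how many times the subquery actually executes by calling a counting function inside it:

-- Hypothetical demo: count scalar subquery executions.
CREATE OR REPLACE PACKAGE cnt_pkg AS
  calls NUMBER := 0;
  FUNCTION f (p NUMBER) RETURN NUMBER;
END cnt_pkg;
/
CREATE OR REPLACE PACKAGE BODY cnt_pkg AS
  FUNCTION f (p NUMBER) RETURN NUMBER IS
  BEGIN
    calls := calls + 1;   -- increments once per actual execution
    RETURN p;
  END f;
END cnt_pkg;
/
-- T has many rows but few distinct values of x; if caching works,
-- cnt_pkg.f runs roughly once per distinct x (modulo hash collisions).
SELECT (SELECT cnt_pkg.f(t.x) FROM dual) AS fx FROM t;
EXEC DBMS_OUTPUT.PUT_LINE('executions: ' || cnt_pkg.calls)

If the execution count tracks the number of distinct values rather than the number of rows, caching is doing its job.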

The subqueries in the select clause look like

select CASE WHEN F.f1 IS NULL
            THEN NULL
            ELSE (SELECT X.f2
                  FROM   X
                  WHERE  code_vl = F.f1)
       END AS f0
from   F;

and could be merged into the query like:


select CASE WHEN F.f1 IS NULL
            THEN NULL
            ELSE X.f2
       END AS f0
from   F, X
where  code_vl(+) = F.f1;


(NOTE: the first query will throw an error if the correlated subquery returns more than one row, whereas the second query will simply return the multiple rows.)

The VST diagram can't tell you that this is the solution, but it can point out that the place to spend your time is the subqueries in the select.

Index Block Dumps and Index Tree Dumps Part I: (Knock On Wood)

I thought before I jump into a topic that requires looking at a number of index block dumps, it might be worth briefly recapping how one goes about dumping index blocks in Oracle.   A block dump is simply a formatted representation of the contents of a particular Oracle database block.  Although I’ll be focusing [...]

Oracle: SQL*Net Waits

article better formatted at


Introduction

Unfortunately, what Oracle calls "Network Waits" have little to do with the network and almost everything to do with the time it takes to pack messages for the network before they are sent.
In what follows, the client is you: the tool, sqlplus, the application.
The shadow process is what communicates with the client.

Of the three waits, only "more data from client" is possibly related to network issues, and even that is not clear-cut; the other two simply measure the time it takes to pack a message before sending it.

SQL*Net message to client - time to pack a message (no network time included); possibly tune SDU
SQL*Net more data from client - possible network issues; possibly tune SDU
SQL*Net more data to client - time to pack a message (no network time included); possibly tune SDU

The same events exist for database links, where the local shadow process acts as the client and the remote database plays the role of the shadow process:

SQL*Net message to dblink
SQL*Net more data from dblink - possible network issues; possibly tune SDU
SQL*Net more data to dblink
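To get a first feel for how much time an instance has spent on these events, v$system_event is a reasonable place to start (a minimal sketch):

-- System-wide totals for the SQL*Net wait events
-- (time_waited is in centiseconds)
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event LIKE 'SQL*Net%'
ORDER  BY time_waited DESC;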

SQL*Net Wait Events

SQL*Net message from client

Idle Event
Waiting for work from Client
Includes network transmission times for messages coming from shadow
Typically indicative of Client “think time” or “processing time”

SQL*Net message to client

Time it takes to pack a message to be sent to the client

Doesn’t include network timing
see Tanel Poder's analysis of SQL*Net message to client

SQL*Net more data to client

Same as SQL*Net message to client except this is for data that spans SDU packets.

Wait represents the time it takes to pack data.
Doesn’t include network timing

SQL*Net more data from client

The only SQL*Net wait that can indicate a possible NETWORK problem
Client is sending data to shadow that spans packets (think large data inserts, possibly large code blocks, large SQL statements)
Shadow waits for next packet.
Can indicate network latency.
Can indicate a problem with the client tool
Here is an example in ASHMON where the application server died mid-stream on inserts. The shadow processes were left waiting for the completion of the message. You can see the regular load on the database on the left; then, just past the middle, the load crashes and all that's left is waits on "SQL*Net more data from client".
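To find sessions stuck like this without a graphical tool, a query along these lines works (a minimal sketch):

-- Sessions accumulating time on the one SQL*Net wait that can
-- indicate a real network (or client) problem
SELECT sid, event, total_waits, time_waited
FROM   v$session_event
WHERE  event = 'SQL*Net more data from client'
ORDER  BY time_waited DESC;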

Possibly set SDU=32768 as well as setting RECV_BUF_SIZE and SEND_BUF_SIZE to 65536.

SQL*Net break/reset to client

Error in SQL statement
Control-C
Usually highlights an error in the application
Example:
CREATE TABLE T1 (C1 NUMBER);
ALTER TABLE T1 ADD
(CONSTRAINT T1_CHECK1 CHECK (C1 IN ('J','N')));
ALTER SESSION SET EVENTS
'10046 TRACE NAME CONTEXT FOREVER, LEVEL 12';
-- fails with ORA-01722 (invalid number): C1 is a NUMBER, but the
-- check constraint compares it to the strings 'J' and 'N'
INSERT INTO T1 VALUES (1);
Trace File
PARSING IN CURSOR #2 len=25 dep=0 uid=0 oct=2 lid=0 tim=5009300581224 hv=981683409 ad='8e6a7c10'
INSERT INTO T1 VALUES (1)
END OF STMT
PARSE #2:c=0,e=2770,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,tim=5009300581220
BINDS #2:
EXEC #2:c=0,e=128,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=5009300581418
ERROR #2:err=1722 tim=512952379
WAIT #2: nam='SQL*Net break/reset to client' ela= 31 driver id=1650815232 break?=1 p3=0 obj#=-1 tim=5009300581549
WAIT #2: nam='SQL*Net break/reset to client' ela= 92 driver id=1650815232 break?=0 p3=0 obj#=-1 tim=5009300581662

DBLINK SQL*Net Waits

These waits are the dblink counterparts of the client waits above:
SQL*Net message to dblink
SQL*Net more data from dblink
SQL*Net more data to dblink
SQL*Net break/reset to dblink

Analysis and Tuning

There isn't much to do on the Oracle side for tuning. You can try optimizing the SDU and the SEND_BUF_SIZE and RECV_BUF_SIZE buffers.
For actual information on network speeds you will have to use something like
  • ping
  • tnsping
  • a network sniffer

SDU

The default SDU can be set in sqlnet.ora.
If it's not set, the default is 2048.
The max is 32768.
The default, or the value in sqlnet.ora, can be overridden in tnsnames.ora and listener.ora. The client and server negotiate the size, agreeing on the smaller of the two settings.
(TDU – Transmission Data Unit – see note 44694.1. The TDU parameter has been deprecated in Oracle Net v8.0 and beyond and is ignored. It is only mentioned here for backward compatibility.)
tnsnames.ora
V10G = (DESCRIPTION =
(SDU=32768)
(ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
(CONNECT_DATA =
(SERVER = DEDICATED) (SERVICE_NAME = v10g)
) )
listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SDU=32768)
(SID_NAME = v10g)
(ORACLE_HOME = /export/home/oracle10)
))

Tracing

sqlnet.ora

trace_level_client=16
trace_directory_client=/tmp
trace_file_client=client.trc
trace_unique_client=true
trace_level_server=16
trace_directory_server=/tmp
trace_file_server=server.trc

client.trc

client_3582.trc:[12-JAN-2008 11:37:39:237] nsconneg: vsn=313, gbl=0xa01, sdu=32768, tdu=32767

RECV_BUF_SIZE and SEND_BUF_SIZE

The recommended size for these buffers (from Oracle's docs) is at least:

network bandwidth * round-trip time = minimum buffer size

For example, if the network bandwidth is 100 Mb/s and the round-trip time (from ping) is 5 ms, then:

100,000,000 bits   1 byte   5 seconds
---------------- x ------ x --------- = 62,500 bytes
   1 second        8 bits     1000
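The same arithmetic as a quick sanity check (assuming the 100 Mb/s bandwidth and 5 ms round trip from the example):

-- bandwidth-delay product: (bits/sec / 8) * round-trip seconds
SELECT (100000000 / 8) * (5 / 1000) AS min_buffer_bytes FROM dual;
-- => 62500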
tnsnames.ora
V10G = (DESCRIPTION =
(SEND_BUF_SIZE=65536)
(RECV_BUF_SIZE=65536)
(ADDRESS = (PROTOCOL = TCP)(HOST = fuji)(PORT = 1522))
(CONNECT_DATA =
(SERVER = DEDICATED) (SERVICE_NAME = v10g)
) )

listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SEND_BUF_SIZE=65536)
(RECV_BUF_SIZE=65536)
(SID_NAME = v10g)
(ORACLE_HOME = /export/home/oracle10)
))

sqlnet.ora
RECV_BUF_SIZE=65536
SEND_BUF_SIZE=65536

Automated Root Cause Analysis

I’ve run into multiple products that claim to offer automated root cause analysis, so don’t think that I’m ranting against a specific product or vendor. I have a problem with the concept.

The problem these products are trying to solve: IT staff spend much of their time troubleshooting issues – essentially finding the causes of effects they don’t like. What is causing high response times on this report? What is causing the lower disk throughput?

If we can somehow automate the task of finding a cause for a problem, we’ll have a much more efficient IT department.

The idea that troubleshooting can be automated is rather seductive. I’d love to have a “What is causing this issue” button. My problem is with the way those vendors go about solving this issue.

Most of them use variations of a very similar technique:
All these vendors already have monitoring software, so they usually know when there is a problem. They also know of many other things that happen at the same time. So if their software detects that response time goes up, it can look at disk throughput, DB CPU, swap, load average, number of connections, etc.
When they see that CPU goes up together with response times – Tada! Root cause found!

First problem with this approach: You can’t look at correlation and declare that you found a cause. Enough said.

Second problem: if you collect that much data (and often these systems have millions of measurements) you will find many correlations by pure chance, in addition to the correlations that do indicate a common issue.
What these vendors do is ignore all the false findings and present the real problems found at a conference as proof that their method works. But you can’t reduce the rate of false findings without reducing the rate of finding real issues as well.

Note that I’m not talking about tools like Tanel Poder’s visualization tool – tools which make it easier for the DBA to look at large amounts of data, using our brain’s built-in pattern matcher to find correlations. I support any tool that assists me in applying my knowledge to large sets of data at once.

I have a problem with tools that use statistical correlation as a replacement for applying knowledge. It can’t be done.

Here’s the kind of tool I’d like to see:
Suppose your monitoring tool gives you the ability to visually browse, filter and explore all the data you collect in ways that help you troubleshoot. The tool remembers the things you looked at and the steps you took. After you solve the problem, you can upload the problem description and your debug process to a website. You can even mark the dead ends of the investigation.

Now you can go to that website and see that for problem X, 90% of the DBAs started by looking at v$sesstat and 10% ran trace. Maybe you can even have a friend network, so you can see that in this case Fuad looked at disk utilization first while Iggy checked how much redo is written each hour.

If you are not into sharing, you can still browse your own past problems and solutions for ideas that might have slipped your mind.

I think that a troubleshooting tool combined with a “collective wisdom” site can assist experienced DBAs and improve the learning curve for junior DBAs, without pretending to automate away knowledge and experience.

Data Access APIs–Part 1: Fun with UPI

First, I’d like to apologize to our good friend SQLLIB.  Those of you who have been working with the Oracle Database for some time will notice that, while it too is a common data access library, I’ve omitted it from this series of posts. No, it’s not because of some personal vendetta against SQLLIB.  In […]

a formula for failure (or an expensive redesign)

If 'Premature optimization is the root of all evil.'
then 'Premature automation is the propagator of many evils.'
else 'Failure to optimize is the abyss.'
end;

Excited about NoCOUG Winter Conference

NoCOUG is hosting its winter conference next week – on February 11th.
As usual, we’ll have the best speakers and presentations ever. This time I’m extra happy because two of the speakers, Dr. Neil Gunther and Robyn Sands, are there because I was wowed by them at a previous conference and asked our Director of Conference Programming to invite them. And they agreed! I believe it is the first time either of them has presented at NoCOUG, and I’m very excited about it.

I’m sure I don’t need to introduce Robyn Sands to any Oracle professional – she’s an OakTable member who talks a lot about the right ways to manage performance. She is scientific and precise, yet her advice is practical and immediately applicable.

Dr. Neil Gunther is a well-known performance expert – so well known that he has his own Wikipedia article. I first ran into his work when I did performance testing, something like 6 years ago. From his articles I learned the importance of having performance models, without which you cannot interpret your results or know when your tests were faulty. I ran into him again when Tanel Poder mentioned that Dr. Gunther is now doing work that will be relevant to Oracle professionals. He appeared at HotSos a few years back, and now we get to see him at NoCOUG – with both a keynote session and a technical session. He has invited the crowds to ask questions at his blog, so you can participate.

In addition to these two prestigious names, we have a few local celebrities giving presentations: Ahbaid Gaffoor, lead DBA at Amazon, will show his make-based deployment methodology. If you don’t have a deployment methodology, this presentation is a must-see. Maria Colgan will give a presentation about data loading for data warehouses. Although she’s an Oracle presenter, which sometimes means “marketing”, Maria is smart and knowledgeable, and if you are doing data warehouse work she is worth listening to.

I’ll be presenting “What Every DBA Should Know About TCP/IP Networks”. The presentation is about network problems I’ve had to solve in the last year and how I solved them with some basic knowledge of networks, a packet sniffer, and an envelope. If you ever wondered how to make your network admin take you seriously, how to get more bang from your bandwidth, and whether or not you should care about your SDU, you should definitely show up.

I’m looking forward to meeting old and new friends at the conference. It’s going to be a blast.

DEVCON Luzon 2010

Just recently I became a member of the PSIA Tech Council… The company I’m working for is a member of PSIA, which makes up 90% of the country’s software sector, promotes the growth and global competitiveness of the Philippine software industry, and is an active partner of the government and academe in implementing programs that benefit the industry.

The PSIA and the PSIA Tech Council, together with the Awesome and Cool sponsors, will be holding the Luzon leg of DEVCON here in Manila!

Below are the details of this awesome event:

09 February 2010, 4-9pm, SMX Convention Center Function Room 1

Sync. Support. Succeed.

Get together to be connected, enhance skills and support each other to achieve success!

Designed to be a premier gathering of all Filipino software engineers, DEVCON facilitates collaboration, interaction and mentoring among leading practitioners of the Philippine software industry. DEVCON adapts global best practices for skills improvement and professional advancement among Filipino software engineers. It features three main elements, with formats used successfully in international technology gatherings:

> Lightning Talks – a fast-paced presentation on any topic of interest
> Birds of a Feather – a dynamic discussion of opposing perspectives on mutual topics
> Hackathon – providing rapid learning of a new technology through hands-on demonstration or joint coding onsite

Register online for your FREE seat.

Oracle Performance Visualization…

Coskan Gundogar and Karl Arao have written two interesting articles about Oracle performance analysis and visualization – check these out!

Coskan’s article:

Karl’s article:

Note that in March I will be releasing PerfSheet v3.0, which will have lots of improvements! ;-)
