RAC performance tuning: Understanding Global cache performance

Global cache performance metrics are often measured incorrectly, and they are not always understood clearly either. There are even a few blogs and web pages disseminating incorrect information. This blog entry is an attempt to offer a few methods and scripts to understand global cache performance.

Always review all instances

It is very important to review the performance metrics from all instances in the RAC cluster, not just the one instance you are connected to. If you have access to AWR reports, then it is critical to generate AWR reports (or statspack reports) from all instances. The problem is that DBAs tend to generate these reports by logging in to each instance iteratively, entering a couple of parameters, and then generating the report. Not exactly a convenient practice.

  REM connect to each instance separately, type in the beginning snap_id and ending snap_id for each node etc..
   sqlplus mydba@proddb1
   @$ORACLE_HOME/rdbms/admin/awrrpt.sql
   exit;
   sqlplus mydba@proddb2
   @$ORACLE_HOME/rdbms/admin/awrrpt.sql
   exit;
   sqlplus mydba@proddb3
   @$ORACLE_HOME/rdbms/admin/awrrpt.sql
   exit;

There are a few issues with this approach. It is a cumbersome practice if the instance count is high. In addition, all of these AWR reports access the same underlying AWR tables. Physically, rows from all instances are stored together in the same blocks, so executing the reports while connected to different instances increases Global cache traffic. If the database is already suffering from Global cache (GC) performance issues, then generating reports by connecting to each instance in turn is probably not a grand idea.

I have created a few basic scripts to generate AWR reports from all instances in a single step. The script awrrpt_all_gen.sql queries the AWR tables and gv$instance to find the last snap_id and the various instance_ids. It then generates AWR reports for all instances in the cluster, for the last AWR snap, in the current directory. This script is quite handy while debugging RAC performance issues. With AWR reports from all instances in one directory, we can easily grep the files to compare metrics across instances.

@awrrpt_all_gen.sql

awrrpt_all_gen.sql v1.00 by Riyaj Shamsudeen @orainternals.com

To generate AWR Report from all RAC instances concurrently.
Creates reports using last two snap_ids.

...Generating awrrpt_all.sql script.... Please wait....
...Completed script generation.

Executing awrrpt_all.sql to generate AWR reports.
...Generates AWR reports with file name format awrrpt___.txt for each instance.
...Please wait for few minutes...

...AWR report created for instance 1. Please wait..
...AWR report created for instance 2. Please wait..
...AWR report created for instance 3. Please wait..
...AWR report created for instance 4. Please wait..

AWR reports created.

If you would like the ability to choose the snap_ids, then use the script awrrpt_all_range_gen.sql.
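If you don't have these scripts handy, the same idea can be sketched directly against the DBMS_WORKLOAD_REPOSITORY package. The PL/SQL block below is only a minimal illustration (my sketch, not one of the scripts above): it walks gv$instance and prints the text report for the last two snapshots of each instance; in practice you would spool each report to its own file rather than rely on dbms_output.

 REM Minimal sketch: text AWR report for the last two snaps, for every instance.
 set serveroutput on size unlimited
 declare
   l_dbid   number;
   l_beg_id number;
   l_end_id number;
 begin
   select dbid into l_dbid from v$database;

   -- assumes at least two snapshots exist for every instance
   select max(snap_id) - 1, max(snap_id)
     into l_beg_id, l_end_id
     from dba_hist_snapshot;

   for i in (select inst_id from gv$instance order by inst_id) loop
     dbms_output.put_line('### AWR report for instance '||i.inst_id||' ###');
     for r in (select output
                 from table(dbms_workload_repository.awr_report_text
                              (l_dbid, i.inst_id, l_beg_id, l_end_id))) loop
       dbms_output.put_line(r.output);
     end loop;
   end loop;
 end;
 /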

Use a granular approach to GC measurements

It is important to measure Global cache performance metrics correctly, especially those damned averages. Incorrect measurement of averages can lead to faulty analysis. For example, the following query retrieves the average global cache current block receive time from the gv$sysstat view. The problem with this SQL statement is that it averages over the entire period since instance restart. Global cache performance is quite sensitive to workload, and querying over such a wide time range, as this query does, can lead to faulty analysis.


REM This is really a bad idea!!

select b1.inst_id, b2.value "RECEIVED",
       b1.value "RECEIVE TIME",
       ((b1.value / b2.value) * 10) "AVG RECEIVE TIME (ms)"
from   gv$sysstat b1, gv$sysstat b2
where  b1.name = 'gc current block receive time'
and    b2.name = 'gc current blocks received'
and    b1.inst_id = b2.inst_id;

As an analogy, the above query is equivalent to this patient-doctor conversation:

Patient: Doctor, I don't feel good. I think I have a fever.
Doctor: Nope. Your average body temperature since the beginning of this year is around 98. So, you are doing excellent.

It is essential to review global cache performance metrics at a granular level. Use AWR or statspack reports covering a short period of time. Better yet, if you want to review GC performance metrics quickly, you can use my script gc_traffic_print.sql. This script captures various statistics relevant to Global cache performance into a PL/SQL table, sleeps for N seconds (default N=60), and re-captures the statistics after the sleep. It then prints all the relevant details in an easily readable tabular format, which makes it very useful for a quick review of Global cache performance metrics.

SQL> @gc_traffic_print.sql

gc_traffic_print.sql v1.20 by Riyaj Shamsudeen @orainternals.com

...Calculating GC Rx and Tx timing and blocks..
...Default collection period is 60 seconds.... Please wait for at least 60 seconds...
Enter value for sleep: 30
---------|--------------|---------|----------------|----------|---------------|---------------|-------------|
Inst     | CR blocks Rx | CR time |  CUR blocks Rx | CUR time |  CR blocks Tx | CUR blocks Tx |Tot blocks   |
---------|--------------|---------|----------------|----------|---------------|---------------|-------------|
1        |         29288|    12.17|          14914|      4.58|          27059|           17439|        88700|
2        |         21201|     5.75|          12835|      5.19|          26831|           16741|        77608|
3        |         22942|     4.64|          11751|      3.82|          23918|            9205|        67816|
4        |         32075|     4.78|          19260|      4.55|          31046|           19331|       101712|
---------|--------------|---------|----------------|----------|---------------|---------------|-------------|

PL/SQL procedure successfully completed.

Column key for the above script output:
CR blocks Rx : GC CR blocks received
CR time      : average GC CR receive time (ms)
CUR blocks Rx: GC current blocks received
CUR time     : average GC current receive time (ms)
CR blocks Tx : GC CR blocks transmitted
CUR blocks Tx: GC current blocks transmitted
Tot blocks   : total blocks transmitted + received, for both CR and current traffic
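The two-sample delta idea behind gc_traffic_print.sql can also be sketched in a few lines of PL/SQL. The block below is a simplified, hypothetical approximation (not the actual script): it samples two GC CR statistics from gv$sysstat, sleeps, samples them again, and prints the per-instance deltas. It assumes instance ids are numbered 1..n and that you have execute privilege on dbms_lock.

 REM Minimal sketch of the two-sample delta approach (not gc_traffic_print.sql itself).
 set serveroutput on
 declare
   type t_num is table of number index by pls_integer;
   l_blk_1 t_num;  l_blk_2 t_num;   -- 'gc cr blocks received'
   l_tim_1 t_num;  l_tim_2 t_num;   -- 'gc cr block receive time' (centiseconds)
 begin
   for r in (select inst_id, name, value from gv$sysstat
             where  name in ('gc cr blocks received', 'gc cr block receive time')) loop
     if r.name = 'gc cr blocks received' then l_blk_1(r.inst_id) := r.value;
     else                                     l_tim_1(r.inst_id) := r.value;
     end if;
   end loop;

   dbms_lock.sleep(60);    -- sample interval in seconds

   for r in (select inst_id, name, value from gv$sysstat
             where  name in ('gc cr blocks received', 'gc cr block receive time')) loop
     if r.name = 'gc cr blocks received' then l_blk_2(r.inst_id) := r.value;
     else                                     l_tim_2(r.inst_id) := r.value;
     end if;
   end loop;

   for i in 1 .. l_blk_2.count loop   -- assumes contiguous instance ids 1..n
     dbms_output.put_line('Inst '||i||
       ': CR blocks Rx = '|| (l_blk_2(i) - l_blk_1(i)) ||
       ', avg CR receive time (ms) = '||
       round(10 * (l_tim_2(i) - l_tim_1(i)) /
             greatest(l_blk_2(i) - l_blk_1(i), 1), 2));
   end loop;
 end;
 /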

Breakup of global cache performance

Global cache processing performance can be broken down into its sub-components (excluding wire latency, although that can be calculated too). This breakdown must also be done at a granular level. We can use yet another script, gc_traffic_processing.sql, to review these performance metrics for the past N seconds. For example, in the output printed below, Global cache CR processing time is broken into the various sub-components of global cache processing. Remember that these metrics are on the SEND side of Global cache processing: inefficient global cache processing in one instance inflates the global cache receive time of the other instances, not its own. For example, in the table above, GC CR receive time was 12.17 ms for instance 1. That is due to another instance (probably node 4 in this case) suffering from high GC processing time (more specifically, a high 'CR block flush time' on node 4). It is far more accurate to review global cache performance issues on this granular basis. One caution though: specify a sleep of at least a minute so that these numbers are consistent.

@gc_traffic_processing.sql
gc_traffic_processing.sql v1.00 by Riyaj Shamsudeen @orainternals.com

...Prints various timing related information for the past N seconds
...Default collection period is 60 seconds.... Please wait for at least 60 seconds...
Enter value for sleep:

---------|-----------|---------|-----------|----------|------------|------------|------------|----------|
Inst     | CR blk Tx | CR bld  | CR fls tm | CR snd tm| CUR blk TX | CUR pin tm | CUR fls tm |CUR snd tm|
---------|-----------|---------|-----------|----------|------------|------------|------------|----------|
1        |      62732|       .2|       1.25|       .21|       48920|        .96|         .22|        .22|
2        |      46037|      .13|        .76|        .2|       42019|        .68|         .17|         .2|
3        |      72237|      .63|        .79|       .19|       22697|        .44|         .29|        .23|
4        |     113816|      .13|       2.04|       .15|       51020|        .87|         .12|        .13|
--------------------------------------------------------------------------------------------------------

PL/SQL procedure successfully completed.

Here are the column definitions for the script output:
CR blk Tx  : GC CR blocks served
CR bld     : average GC CR build time per CR block served
CR fls tm  : average GC CR flush time per CR block served
CR snd tm  : average GC CR send time per CR block served
CUR blk Tx : GC current blocks served
CUR pin tm : average GC current pin time per CUR block served
CUR fls tm : average GC current flush time per CUR block served
CUR snd tm : average GC current send time per CUR block served
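All of these send-side components come straight from gv$sysstat statistics. For a quick look at the CR breakdown without the script (cumulative since instance startup, so far less precise than a delta over a short interval), something like the following query can be used; treat it as my sketch rather than the script's logic.

 REM Cumulative send-side GC CR breakdown per instance (averages in ms per block served).
 select inst_id,
        max(decode(name, 'gc cr blocks served', value)) cr_blocks_served,
        round(10 * max(decode(name, 'gc cr block build time', value)) /
              nullif(max(decode(name, 'gc cr blocks served', value)), 0), 2) avg_cr_build_ms,
        round(10 * max(decode(name, 'gc cr block flush time', value)) /
              nullif(max(decode(name, 'gc cr blocks served', value)), 0), 2) avg_cr_flush_ms,
        round(10 * max(decode(name, 'gc cr block send time', value)) /
              nullif(max(decode(name, 'gc cr blocks served', value)), 0), 2) avg_cr_send_ms
 from   gv$sysstat
 where  name in ('gc cr blocks served', 'gc cr block build time',
                 'gc cr block flush time', 'gc cr block send time')
 group  by inst_id
 order  by inst_id;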

Review histograms for GC events

Averages can be misleading; uncovering the performance issues hiding underneath an average is an important task. For example, suppose we see that the 'gc cr block receive time' average is high. Let's review the histograms for the wait events related to Global cache receive time. My script gc_event_histogram.sql is quite handy for this. It takes the difference between two samples of the event histogram statistics and prints it in a readable format. In this case, node 1 had 8,578 waits under 2 ms for the event 'gc cr block 2-way', and the sessions in instance 1 had 9,851 waits between 2 and 4 ms. Only a few waits exceeded 16 ms.

@gc_event_histogram.sql

gc_event_histogram.sql v1.0 by Riyaj Shamsudeen @orainternals.com

Default collection period is sleep seconds. Please wait..
Enter value for event: gc cr block 2-way
Enter value for sleep: 60
---------|-----------------------|----------------|----------|
Inst id  | Event                 |wait time milli |wait cnt  |
---------|-----------------------|----------------|----------|
1        |gc cr block 2-way      |               1|        98|
1        |gc cr block 2-way      |               2|      8578|
1        |gc cr block 2-way      |               4|      9851|
1        |gc cr block 2-way      |               8|      2291|
1        |gc cr block 2-way      |              16|       117|
1        |gc cr block 2-way      |              32|         6|
1        |gc cr block 2-way      |              64|         1|
1        |gc cr block 2-way      |             128|         2|
1        |gc cr block 2-way      |             256|         0|
1        |gc cr block 2-way      |             512|         0|
1        |gc cr block 2-way      |            1024|         0|
1        |gc cr block 2-way      |            2048|         0|
1        |gc cr block 2-way      |            4096|         0|
1        |gc cr block 2-way      |            8192|         0|
1        |gc cr block 2-way      |           16384|         0|
2        |gc cr block 2-way      |               1|       155|
2        |gc cr block 2-way      |               2|     10792|
2        |gc cr block 2-way      |               4|     14201|
2        |gc cr block 2-way      |               8|      3887|
2        |gc cr block 2-way      |              16|       204|
...

But we also need to look at the 3-way waits. Let's review them with the same script, simply supplying 'gc cr block 3-way' as the wait event. In this case, we see 12,062 waits between 4 and 8 ms. Of course, this must be improved.


@gc_event_histogram.sql

gc_event_histogram.sql v1.0 by Riyaj Shamsudeen @orainternals.com

Default collection period is sleep seconds. Please wait..
Enter value for event: gc cr block 3-way
Enter value for sleep: 60
---------|-----------------------|----------------|----------|
Inst id  | Event                 |wait time milli |wait cnt  |
---------|-----------------------|----------------|----------|
1        |gc cr block 3-way      |               1|         3|
1        |gc cr block 3-way      |               2|       320|
1        |gc cr block 3-way      |               4|     17017|
1        |gc cr block 3-way      |               8|     12062|
1        |gc cr block 3-way      |              16|      1725|
1        |gc cr block 3-way      |              32|       113|
1        |gc cr block 3-way      |              64|         0|
1        |gc cr block 3-way      |             128|         0|
1        |gc cr block 3-way      |             256|         0|
1        |gc cr block 3-way      |             512|         0|
1        |gc cr block 3-way      |            1024|         0|
1        |gc cr block 3-way      |            2048|         0|
1        |gc cr block 3-way      |            4096|         0|
1        |gc cr block 3-way      |            8192|         0|
...
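The data behind these reports comes from gv$event_histogram. If you only need the raw, cumulative-since-startup buckets (without the delta between two samples that the script computes), a simple query is enough; this is my sketch, not the script itself.

 REM Cumulative wait-time histogram for a given GC event, per instance.
 select inst_id, event, wait_time_milli, wait_count
 from   gv$event_histogram
 where  event = 'gc cr block 2-way'
 order  by inst_id, wait_time_milli;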

In essence, these four scripts will be useful in debugging Global cache performance issues. Please let me know if you encounter errors in these scripts. We will use them in future discussions to show how we resolved a few RAC performance issues.

Merry Christmas (Peace On Earth)

Wow, yet another year almost behind us and what looks like being yet another busy year ahead. Can you believe that The Noughties are nearly all gone? I remember being all excited at the dawn of The Eighties!! I must say I’m really looking forward to finally saying “Twenty Ten” rather than the more long-winded “Two [...]

Measuring what matters

Cary Millsap’s recent post prompted me to write down some of the related thoughts in my head.
Here are a few of my mantras for systematic troubleshooting and performance tuning, which have materialized in my head over the years of work:
Picking the right starting point for troubleshooting and performance tuning is the most important decision in the process. Pick the wrong starting point and you end up going in circles.

The Core Performance Fundamentals Of Oracle Data Warehousing – Balanced Hardware Configuration

[back to Introduction] If you want to build a house that will stand the test of time, you need to build on a solid foundation. The same goes for architecting computer systems that run databases. If the underlying hardware is not sized appropriately it will likely lead to people blaming software. All too often I [...]

Oracle Database 11g Release 2 for HP-UX Itanium and AIX (PPC64) Now Available

The HP-UX Itanium and AIX (PPC64) ports of Oracle Database 11g Release 2 can now be downloaded from OTN. Happy Holidays!!! Tweet This Post

My Whole System Is Slow. Now What?

At CMG'09 a couple of weeks ago, I presented "Measuring Response Times of Code on Oracle Systems." The paper for this presentation was a subset of "For Developers: Making Friends with the Oracle Database." In the presentation, I spent a few minutes talking about why to measure response times in Oracle, and then I spent a lot of minutes talking about how. As usual, I focused heavily on the importance of measuring response times of individual business tasks executed by individual end users.

Compression to the rescue

We've had issues getting enough storage space allocated on the development server our project is intended to use. For this project, our team doesn't have full DBA rights on the box and has very limited privileges. Even before I joined the team, a request had been made for a little over 3.5 TB of space to be used for the development of a proof-of-concept datamart. The first project deliverable is due this week and we've only been allocated 200 GB so far. As you can imagine, it's difficult to get the needed data loaded when less than 10% of the space we need is available.

The CPU Costing Model: A Few Thoughts Part III (Bang Bang)

One of the advantages of system statistics and the CPU costing model is in how the CBO deals with multiblock reads and the impact of the db_file_multiblock_read_count parameter. When performing a full table (or fast full index) scan, the CBO needs to determine just how many multiblock read operations are likely to be performed so the associated operation can [...]

TEMP Table Transformation and Remote Procedure Calls

I've recently come across an interesting side-effect regarding temp table transformations at one of my clients.

There was a PL/SQL package procedure that worked fine when called locally but somehow "hung" when called from a remote database. All it did was call exactly the same package procedure with the same parameters as the local call, but one of the SQL statements executed as part of the procedure generated a suboptimal execution plan that never completed.

Further investigation revealed that the significant difference between the execution plans of the local and the remote execution of the procedure was the different treatment of a contained "WITH" clause.

The interesting point is that the called procedure itself didn't perform any "distributed" queries or DML - the only difference was that one time the procedure was called locally and the other time remotely via a database link. All processing within the procedure was local - no activities using database links were involved.

There are (at least) two known areas where Oracle can optionally use a so-called TEMP TABLE TRANSFORMATION as part of the execution plan:

1. Materialization of a Subquery Factoring, also known as "Common Table Expression" or simply "WITH clause"

Oracle uses this when the subquery is referenced more than once in the execution plan, or when it is forced with the undocumented MATERIALIZE hint as part of the SELECT in the WITH clause. There are a few (not really documented) limitations of this materialization; in particular, if LOBs or LONGs are part of the projection, then the TEMP TABLE transformation can't be used.

2. Star transformation with TEMP TABLE transformation

Star transformations can also make use of the TEMP TABLE transformation. This is enabled by default when STAR_TRANSFORMATION_ENABLED is set to TRUE, but can be disabled by setting STAR_TRANSFORMATION_ENABLED to TEMP_DISABLE.
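As a quick illustration of these two triggers (a minimal sketch against the standard SH demo schema; remember that the MATERIALIZE hint is undocumented):

-- 1. Materialized WITH clause: reference the subquery more than once,
--    or force materialization with the (undocumented) MATERIALIZE hint.
with a as (
  select /*+ materialize */ * from dual
)
select * from a;

-- 2. Star transformation: TEMP TABLE usage is controlled per session.
alter session set star_transformation_enabled = true;         -- may use TEMP tables
alter session set star_transformation_enabled = temp_disable; -- star transformation without TEMP tables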

The following testcase shall demonstrate the subtle side effect of the Remote Procedure Call via Database Link.

Consider the following simple package:

create or replace package pkg_test_materialize_remote authid current_user as
  procedure test_star_transform_statement;
  procedure test_simple_statement;
end pkg_test_materialize_remote;
/

create or replace package body pkg_test_materialize_remote as
  procedure test_star_transform_statement as
    c  sys_refcursor;
    n1 number;
    c1 varchar2(255);
    c2 varchar2(255);
  begin
    execute immediate 'alter session set star_transformation_enabled = true';

    open c for '
      select
        sum(quantity_sold),
        p.prod_subcategory_desc,
        c.cust_gender
      from
        sh.sales s
        join sh.products p  ON (s.prod_id = p.prod_id)
        join sh.customers c ON (s.cust_id = c.cust_id)
        join sh.countries d ON (c.country_id = d.country_id)
      where
        p.prod_subcategory_desc = ''Memory'' and
        c.cust_city = ''Oxford'' and
        c.cust_gender = ''F''
      group by
        p.prod_subcategory_desc, c.cust_gender';

    fetch c into n1, c1, c2;

    close c;
  end test_star_transform_statement;

  procedure test_simple_statement as
  begin
    for rec in (
      with a as (
        select /*+ materialize */
          *
        from
          dual
      )
      select
        *
      from
        a
      where
        1 = 2
    ) loop
      null;
    end loop;
  end test_simple_statement;
end pkg_test_materialize_remote;
/

The simple statement is not really useful in itself, but it is probably the simplest way of requesting a TEMP TABLE transformation.

The statement using the star transformation generates a TEMP TABLE transformation when using the default SH demo setup. I've only wrapped it in dynamic SQL to take advantage of any enabled roles, due to the AUTHID CURRENT_USER clause.

And indeed when calling the two procedures locally like the following:

exec pkg_test_materialize_remote.test_simple_statement

exec pkg_test_materialize_remote.test_star_transform_statement
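(The plans shown below look like DBMS_XPLAN.DISPLAY_CURSOR output; the original post doesn't show the call, so this is an assumption on my part. Pulling a plan from the cursor cache would look roughly like this, with the SQL_ID taken from the plan header below.)

-- Hypothetical: display the cached plan for the simple WITH-clause statement.
select * from table(dbms_xplan.display_cursor('cdudvb233tdau', null));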

You get these execution plans, both using TEMP TABLE transformations:

SQL_ID cdudvb233tdau, child number 0
-------------------------------------
WITH A AS ( SELECT /*+ materialize */ * FROM DUAL ) SELECT * FROM A
WHERE 1 = 2

Plan hash value: 1137659336

-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | | | | | |
| 3 | TABLE ACCESS FULL | DUAL | 1 | 2 | 2 (0)| 00:00:01 |
|* 4 | FILTER | | | | | |
| 5 | VIEW | | 1 | 2 | 2 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6617_E0AEB | 1 | 2 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(NULL IS NOT NULL)

SQL_ID 5bp3rpfs2f4bw, child number 0
-------------------------------------
select sum(quantity_sold),
p.prod_subcategory_desc, c.cust_gender from
sh.sales s join sh.products p ON (s.prod_id = p.prod_id)
join sh.customers c ON (s.cust_id = c.cust_id) join
sh.countries d ON (c.country_id = d.country_id) where
p.prod_subcategory_desc = 'Memory' and c.cust_city = 'Oxford'
and c.cust_gender = 'F' group by
p.prod_subcategory_desc, c.cust_gender

Plan hash value: 3883759770

------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 435 (100)| | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | | |
| 2 | LOAD AS SELECT | | | | | | | |
|* 3 | TABLE ACCESS FULL | CUSTOMERS | 45 | 990 | 405 (1)| 00:00:05 | | |
| 4 | HASH GROUP BY | | 2 | 94 | 30 (10)| 00:00:01 | | |
|* 5 | HASH JOIN | | 27 | 1269 | 24 (9)| 00:00:01 | | |
|* 6 | HASH JOIN | | 27 | 1134 | 22 (5)| 00:00:01 | | |
|* 7 | HASH JOIN | | 27 | 810 | 20 (5)| 00:00:01 | | |
|* 8 | TABLE ACCESS FULL | PRODUCTS | 3 | 54 | 3 (0)| 00:00:01 | | |
| 9 | PARTITION RANGE ALL | | 31 | 372 | 16 (0)| 00:00:01 | 1 | 28 |
| 10 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 31 | 372 | 16 (0)| 00:00:01 | 1 | 28 |
| 11 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
| 12 | BITMAP AND | | | | | | | |
| 13 | BITMAP MERGE | | | | | | | |
| 14 | BITMAP KEY ITERATION | | | | | | | |
| 15 | BUFFER SORT | | | | | | | |
|* 16 | TABLE ACCESS FULL | PRODUCTS | 3 | 54 | 3 (0)| 00:00:01 | | |
|* 17 | BITMAP INDEX RANGE SCAN | SALES_PROD_BIX | | | | | 1 | 28 |
| 18 | BITMAP MERGE | | | | | | | |
| 19 | BITMAP KEY ITERATION | | | | | | | |
| 20 | BUFFER SORT | | | | | | | |
| 21 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6616_E0AEB | 45 | 225 | 2 (0)| 00:00:01 | | |
|* 22 | BITMAP INDEX RANGE SCAN | SALES_CUST_BIX | | | | | 1 | 28 |
| 23 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6616_E0AEB | 45 | 540 | 2 (0)| 00:00:01 | | |
| 24 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 | | |
------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter(("C"."CUST_CITY"='Oxford' AND "C"."CUST_GENDER"='F'))
5 - access("C1"="D"."COUNTRY_ID")
6 - access("S"."CUST_ID"="C0")
7 - access("S"."PROD_ID"="P"."PROD_ID")
8 - filter("P"."PROD_SUBCATEGORY_DESC"='Memory')
16 - filter("P"."PROD_SUBCATEGORY_DESC"='Memory')
17 - access("S"."PROD_ID"="P"."PROD_ID")
22 - access("S"."CUST_ID"="C0")

Note
-----
- star transformation used for this statement

Let's simulate a Remote Procedure Call using a loopback database link:

create database link loopback@local connect to user identified by pwd using 'orcl';

Now get rid of the two execution plans in the Shared Pool (e.g. by flushing it, if this is a test system) and then run the two procedures again via the database link, as shown after the next snippet.
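A minimal way of clearing the cached plans on a test system is to flush the whole shared pool (a blunt instrument; on a shared instance, purging just the two cursors with DBMS_SHARED_POOL.PURGE would be gentler - that suggestion is mine, not a step from the original post):

REM Test system only: discards all cached cursors, including the two plans above.
alter system flush shared_pool;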

exec pkg_test_materialize_remote.test_simple_statement@loopback@local

exec pkg_test_materialize_remote.test_star_transform_statement@loopback@local

The execution plans look now like this:

SQL_ID cdudvb233tdau, child number 0
-------------------------------------
WITH A AS ( SELECT /*+ materialize */ * FROM DUAL ) SELECT * FROM A
WHERE 1 = 2

Plan hash value: 3752461848

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(NULL IS NOT NULL)

SQL_ID 5bp3rpfs2f4bw, child number 0
-------------------------------------
select sum(quantity_sold),
p.prod_subcategory_desc, c.cust_gender from
sh.sales s join sh.products p ON (s.prod_id = p.prod_id)
join sh.customers c ON (s.cust_id = c.cust_id) join
sh.countries d ON (c.country_id = d.country_id) where
p.prod_subcategory_desc = 'Memory' and c.cust_city = 'Oxford'
and c.cust_gender = 'F' group by
p.prod_subcategory_desc, c.cust_gender

Plan hash value: 3619028137

-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1238 (100)| | | |
| 1 | SORT GROUP BY NOSORT | | 1 | 57 | 1238 (1)| 00:00:15 | | |
|* 2 | HASH JOIN | | 26 | 1482 | 830 (1)| 00:00:10 | | |
|* 3 | HASH JOIN | | 26 | 1352 | 828 (1)| 00:00:10 | | |
|* 4 | HASH JOIN | | 27 | 810 | 423 (1)| 00:00:06 | | |
|* 5 | TABLE ACCESS FULL | PRODUCTS | 3 | 54 | 3 (0)| 00:00:01 | | |
| 6 | PARTITION RANGE ALL | | 31 | 372 | 419 (1)| 00:00:06 | 1 | 28 |
| 7 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 31 | 372 | 419 (1)| 00:00:06 | 1 | 28 |
| 8 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
| 9 | BITMAP AND | | | | | | | |
| 10 | BITMAP MERGE | | | | | | | |
| 11 | BITMAP KEY ITERATION | | | | | | | |
| 12 | BUFFER SORT | | | | | | | |
|* 13 | TABLE ACCESS FULL | PRODUCTS | 3 | 54 | 3 (0)| 00:00:01 | | |
|* 14 | BITMAP INDEX RANGE SCAN | SALES_PROD_BIX | | | | | 1 | 28 |
| 15 | BITMAP MERGE | | | | | | | |
| 16 | BITMAP KEY ITERATION | | | | | | | |
| 17 | BUFFER SORT | | | | | | | |
|* 18 | TABLE ACCESS FULL | CUSTOMERS | 45 | 765 | 405 (1)| 00:00:05 | | |
|* 19 | BITMAP INDEX RANGE SCAN | SALES_CUST_BIX | | | | | 1 | 28 |
|* 20 | TABLE ACCESS FULL | CUSTOMERS | 45 | 990 | 405 (1)| 00:00:05 | | |
| 21 | INDEX FULL SCAN | COUNTRIES_PK | 23 | 115 | 1 (0)| 00:00:01 | | |
-------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C"."COUNTRY_ID"="D"."COUNTRY_ID")
3 - access("S"."CUST_ID"="C"."CUST_ID")
4 - access("S"."PROD_ID"="P"."PROD_ID")
5 - filter("P"."PROD_SUBCATEGORY_DESC"='Memory')
13 - filter("P"."PROD_SUBCATEGORY_DESC"='Memory')
14 - access("S"."PROD_ID"="P"."PROD_ID")
18 - filter(("C"."CUST_CITY"='Oxford' AND "C"."CUST_GENDER"='F'))
19 - access("S"."CUST_ID"="C"."CUST_ID")
20 - filter(("C"."CUST_CITY"='Oxford' AND "C"."CUST_GENDER"='F'))

Note
-----
- star transformation used for this statement

Notice how the TEMP TABLE TRANSFORMATION is gone from both plans?

The odd thing is that so far I haven't been able to pinpoint any optimizer environment settings related to this - in fact, the optimizer happily shares an existing plan between the remote and the local execution, so there doesn't seem to be any difference in the optimizer environment (which makes me think that this is more of a side-effect than an intended feature).

Therefore, to reproduce the issue above, it is necessary to get rid of the plans generated by the local execution.
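One quick way to confirm that the cursor is being shared (my own check, not something from the original post) is to look at V$SQL after mixing local and remote executions: a single child cursor with a growing execution count indicates the existing plan is simply being reused.

-- Hypothetical check; the SQL_ID is the one from the simple WITH-clause test above.
select sql_id, child_number, plan_hash_value, executions, parse_calls
from   v$sql
where  sql_id = 'cdudvb233tdau';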

The behaviour seems to be consistent across 10.2.0.4, 11.1.0.7 and 11.2.0.1.

I could imagine it is related to the fact that the RPC call starts a distributed transaction, but since the plans are re-used when they are already in the Shared Pool, this doesn't seem to be a complete explanation either.

I couldn't yet find any MetaLink (sorry, MOS) documents that describe this particular issue. I might raise an SR if I find the time; for now the issue has been worked around at the client side by ensuring particular execution paths for the critical statement.

Update Jan 2011: Thanks to the anonymous poster below - the corresponding official bug is Bug 9399589: "WITH" subqueries cannot be materialized inside a global transaction