Search

Top 60 Oracle Blogs

Recent comments

Global Temporary Tables Share Statistics Across Sessions

In another blog posting, I asserted that statistics collected by one session on a Global Temporary table (GTT) would be used by other sessions that reference that table, even though each session has their own physical instance of the table. I thought I should demonstrate that behaviour, so here is a simple test.

We will need two database sessions. I will create a test Global Temporary table with a unique index.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
DROP TABLE t PURGE;
TRUNCATE TABLE t;

CREATE GLOBAL TEMPORARY TABLE t
(a NUMBER,b VARCHAR2(1000))
ON COMMIT PRESERVE ROWS;

CREATE UNIQUE INDEX t ON t(a);

In my first session, I’ll populate the table with 100 rows. The values in column A have the range 1 to 100.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
INSERT INTO t SELECT rownum, RPAD(TO_CHAR(TO_DATE(rownum,'J'),'Jsp') ,500,'.')
FROM dba_objects
WHERE rownum <= 100;

COMMIT;

And I’ll collect statistics on it.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
begin sys.dbms_stats.gather_table_stats(ownname=>user,tabname=>'T'); end;
/

The following settings are just to make the queries easy to read, and so I can use dbms_xplan to generate a plan.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
alter session set nls_date_format = 'hh24:mi:ss dd.mm.yyyy';
alter session set statistics_level = ALL;
set autotrace off pages 40 lines 100
column table_name format a1
column column_name format a1
column low_value format a32
column high_value format a32
column a format 999
column b format a30

So now let’s check the contents of the table. There are 100 rows in the range 1 to 100.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select count(*), min(a), max(a) from t;

COUNT(*) MIN(A) MAX(A)
---------- ---------- ----------
100 1 100

And the statistics agree with this.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select table_name, num_rows, last_analyzed
from user_tables
where table_name = 'T';

T NUM_ROWS LAST_ANALYZED
- ---------- -------------------
T 100 18:13:17 13.10.2009

select table_name, column_name, low_value, high_value
from user_tab_columns
where table_name = 'T';

T C LOW_VALUE HIGH_VALUE
- - -------------------------------- --------------------------------
T A C102 C202
T B 45696768742E2E2E2E2E2E2E2E2E2E2E 54776F2E2E2E2E2E2E2E2E2E2E2E2E2E
2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E 2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E

So now let’s try a test query. The database returns 42 rows using a full scan. The statistics in the execution plan1 also predict that there will be 42 rows. Perfectly reasonable.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select * from t where a<=42;

A B
---- ------------------------------
1 One

42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------
SQL_ID 6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 1601196873
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
|* 1 | TABLE ACCESS FULL| T | 1 | 42 | 4 (0)| 42 |00:00:00.01 | 15 |
-------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"<=42)

Now, let’s start a second session, and insert some slightly different data into the same GTT. There are still 100 rows, but this time column A is in the range 43 to 142.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
INSERT INTO t SELECT rownum+42
, RPAD(TO_CHAR(TO_DATE(rownum+42,'J'),'Jsp') ,500,'.')
FROM dba_objects
WHERE rownum <= 100;

COMMIT;

I’ll collect statistics in the same way. First we will check that the data in the table is correct.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select count(*), min(a), max(a) from t;

COUNT(*) MIN(A) MAX(A)
---------- ---------- ----------
100 43 142

And I can also see that the statistics have changed.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select table_name, num_rows, last_analyzed
from user_tables
where table_name = 'T';

T NUM_ROWS LAST_ANALYZED
- ---------- -------------------
T 100 18:18:22 13.10.2009

select table_name, column_name, low_value, high_value
from user_tab_columns
where table_name = 'T';

T C LOW_VALUE HIGH_VALUE
- - -------------------------------- --------------------------------
T A C12C C2022B
T B 4569676874792D45696768742E2E2E2E 53697874792E2E2E2E2E2E2E2E2E2E2E
2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E 2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E

If I run the same query, it correctly returns no rows, and uses an index scan to so. The statistics predict one row, but Oracle actually doesn’t find any. Again perfectly reasonable.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
select * from t where a<=42;

no rows selected

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID 6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 2795797496

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 2 (0)| 0 |00:00:00.01 | 1 |
|* 2 | INDEX RANGE SCAN | T | 1 | 1 | 1 (0)| 0 |00:00:00.01 | 1 |
-----------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"<=42)

But now let’s go back to the first session and run the query again (having flushed the shared pool). Last time we ran it we got 42 rows with a full scan. We still get 42 rows, but now it is using the index range scan, the same execution plan as the other session. In fact, the costs in the execution plan are the same as in the other session. Oracle expected 1 row from the index, but this time it actually got 42. So changing the statistics in the other session has changed the plan in this session, and possibly not for the better. The two sessions are using the same execution plan for different sets of data.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
A B
---- ------------------------------

42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID 6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a<=42

Plan hash value: 2795797496
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 2 (0)| 42 |00:00:00.01 | 10 |
|* 2 | INDEX RANGE SCAN | T | 1 | 1 | 1 (0)| 42 |00:00:00.01 | 4 |
-----------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"<=42)

If I now delete stats, I will use optimizer dynamic sampling.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
begin sys.dbms_stats.delete_table_stats(ownname=>user,tabname=>'T'); end;
/

And the plan changes back to a full scan.

(correct version reposted 1.11.2009)

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
A B
---- ------------------------------

42 Forty-Two

42 rows selected.

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------
SQL_ID 6g743z6c2m2ap, child number 0
-------------------------------------
select * from t where a>=42

Plan hash value: 1601196873

-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
|* 1 | TABLE ACCESS FULL| T | 1 | 42 | 4 (0)| 42 |00:00:00.01 | 15 |
-------------------------------------------------------------------------------------------------
 
 Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A">=42)

Note
-----
- dynamic sampling used for this statement

Conclusion 

Different instances of the same GTT do share statistics because there is only one location in the data dictionary to store statistics for each table. Therefore, collecting statistics on a GTT in one session will affect other sessions using the same table name, possibly adversely!

(Added 1.11.2009) Optimizer Dynamic Sampling may be a better option for GTTs, and is enabled by default from Oracle 9i, but the behaviour changes slightly in 10g.  However, as Cokan points out in his comments below, if a query on a GTT from one session is still in the shared pool when it is used in a different session, then Oracle will not re-parse the statement, and will not choose a different execution plan.

Footnote 1: The execution plans in this posting have been obtained using dbms_xplan thus:

#eeeeee; border: 0px solid rgb(0, 0, 0); font-family: courier new; overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">select * from table(dbms_xplan.display_cursor(null,null,'IOSTATS +COST'));

I have also flushed the shared pool between each statement; otherwise the numbers are aggregated across multiple executions of the same statement.