Search

Top 60 Oracle Blogs

Recent comments

Index out of range

I’ve waxed lyrical in the past about creating suitable column group statistics whenever you drop an index because even when the optimizer doesn’t use an index in its execution path it might have used the number of distinct keys of the index (user_indexes.distinct_keys) in its estimates of cardinality.

I’ve also highlighted various warnings (here (with several follow-on URLs) and here) about when the optimizer declines to use column group statistics. One of those cases is when a predicate on one of the columns goes “out of  range” – i.e. is below the column low_value or above the column high_value. Last night it suddenly crossed my mind that if we drop an index and replace it with a column group we might see an example of inconsistent behaviour: what happens when the index exists but the predicate is out of range – would you find that dropping the index and replacing it with a column group would give you different cardinality estimates for out of range predicates ?

Here’s the demonstration of what happened when I created a simple test on 12.1.0.2:


rem
rem     Script:         index_v_colgrp.sql
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             12.1.0.2
rem

create table t1
nologging
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                          id,
        mod(rownum-1,100)               n1,
        mod(rownum-1,100)               n2,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format problem
;

begin dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

create index t1_i1 on t1(n1, n2);

set autotrace traceonly explain

I’ve created a table with 1M rows, where n1 and n2 are perfectly correlated – there are 100 distinct pairs of values (ranging from (0,0) to (99,99)). Now with autotrace enabled I’ll execute three queries – two with an index on the table of which one will be the baseline plan for predicates that are “in-range” and the other will take the predicates out of range, and the third after I’ve dropped the index and substituted a matching column group to see what I get for the “out of range” plan. The plans may produce different paths as the index disappears, of course, but what we’re only interested in the cardinality estimates in this experiment.

Here’s the code to run the three queries:


select  padding
from    t1
where
        n1 = 50
and     n2 = 50
;

select  padding
from    t1
where
        n1 = 110
and     n2 = 110
;

drop index t1_i1;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for columns (n1, n2) size 1'
        );
end;
/

select  padding
from    t1
where
        n1 = 110
and     n2 = 110
;

And the three execution plans:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |  1044K|  2142   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   | 10000 |  1044K|  2142   (4)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=50 AND "N2"=50)


---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |    79 |  8453 |    83   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |    79 |  8453 |    83   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |    79 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("N1"=110 AND "N2"=110)


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    79 |  8453 |  2142   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    79 |  8453 |  2142   (4)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("N1"=110 AND "N2"=110)

In summary:

  • With the index in place and the predicates in range the optimizer used user_indexes.distinct_keys to calculate cardinality.
  • With the index in place and the predicates (or just one of them, in fact) out of range the optimizer used the individual column selectivities with linear decay.
  • With a column group instead of an index the optimizer behaved exactly as it used to with the index in place.

So my concern that substituting column groups for indexes was unfounded – the optimizer was being silly (legal disclaimer: that’s just my opinion) with indexes, and the silly (ditto) behaviour with column groups hasn’t changed anything.

I’ll have to go back a couple of versions of Oracle to repeat these tests – maybe this behaviour with user_indexes.distinct_keys in place is relatively recent, but it’s another reason why execution plans may change suddenly and badly as time passes when “nothing changed”.