Oakies Blog Aggregator

ANSI Plans

Here’s a thought that falls somewhere between philosophical and pragmatic. It came up while I was playing around with a problem from the Oracle database forum that was asking about options for rewriting a query with a certain type of predicate. This note isn’t really about that question, but the OP supplied a convenient script to demonstrate their requirement and I’ve hijacked most of the code for my own purposes so that I can ask the question:

Should the presence of an intermediate view name generated by the optimizer in the course of cost-based query transformation cause two plans, which are otherwise identical and do exactly the same thing, to have different plan hash values?

To demonstrate the issue let’s start with a simple script to create some data and generate an execution plan.


rem
rem     Script:         or_expand_plans.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem
rem     Notes:
rem     Data creation copied from ODC
rem     https://community.oracle.com/thread/4297365
rem

create table t as   
select 1 as id, to_date('2019-10-11', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 2 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 3 as id, to_date('2019-10-15', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
;  

create table t_others as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Blue'  as color, 'Zone 7' as zoneid from dual  
union all  
select 2 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Red'   as color, 'Zone 7' as zoneid from dual  
union all  
select 3 as id, to_date('2019-10-16', 'YYYY-MM-DD') as lastupdated, 'White' as color, 'Zone 7' as zoneid from dual  
union all  
select 4 as id, to_date('2019-10-17', 'YYYY-MM-DD') as lastupdated, 'Green' as color, 'Zone 7' as zoneid from dual  
;  

create table t_further_info as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'A' as typeinfo from dual  
union all   
select 2 as id, to_date('2019-10-14', 'YYYY-MM-DD') as lastupdated, 'C' as typeinfo from dual  
union all  
select 3 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'D' as typeinfo from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'E' as typeinfo from dual  
;  

prompt  ====================
prompt  "Traditional" syntax
prompt  ====================

explain plan for
select 
        /*+ or_expand */ 
        * 
from
        t,
        t_others        pt,
        t_further_info  fi  
/*
where   (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )  
*/
where   to_date('2019-10-21', 'YYYY-MM-DD') <= any(t.lastupdated, pt.lastupdated, fi.lastupdated)   
and     pt.id = t.id  
and     fi.id = t.id  
;

select * from table(dbms_xplan.display(null,null,'outline'));


You’ll see that I have a simple three-table join with the nasty little detail that I have a “non-join” predicate that may require Oracle to check across all three tables before it can decide whether or not a row should be discarded. I’ve shown two variants on a theme – they both have exactly the same effect but the ANY() presentation is just a little bit neater and more compact.

Essentially Oracle can use one of two strategies for this type of query/predicate: the first is to join all three tables, wait until the final join rowsource appears, and then apply the check; the second is to split the query into a union all of three separate queries where each query drives off a different table, selecting only the rows from that table that match “its” part of the predicate.

In the latter case the second and third branches of the union all have to be modified to ensure that they discard any rows already returned by preceding parts of the union all; this can mean lots of new predicates appearing that use the lnnvl() function. (Of course there are variants between these two extremes, but the general principle is unchanged.)
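As a concrete illustration, here’s a hand-written sketch of the three-branch rewrite for this query – not the optimizer’s exact internal text, just the shape of the transformation:

select  *
from    t, t_others pt, t_further_info fi
where   pt.id = t.id and fi.id = t.id
and     t.lastupdated  >= to_date('2019-10-21','YYYY-MM-DD')
union all
select  *
from    t, t_others pt, t_further_info fi
where   pt.id = t.id and fi.id = t.id
and     pt.lastupdated >= to_date('2019-10-21','YYYY-MM-DD')
and     lnnvl(t.lastupdated  >= to_date('2019-10-21','YYYY-MM-DD'))
union all
select  *
from    t, t_others pt, t_further_info fi
where   pt.id = t.id and fi.id = t.id
and     fi.lastupdated >= to_date('2019-10-21','YYYY-MM-DD')
and     lnnvl(t.lastupdated  >= to_date('2019-10-21','YYYY-MM-DD'))
and     lnnvl(pt.lastupdated >= to_date('2019-10-21','YYYY-MM-DD'))
;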

The presence of the (incomplete) /*+ or_expand */ hint in my query is there to tell the optimizer that it should attempt to transform the query into the three-part union all. This, by the way, is a 12c feature, though older versions of Oracle could get similar effects in some cases from the /*+ use_concat */ hint. Here’s the plan, with outline, that I got from 12.2.0.1:


Plan hash value: 3181357500

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$FB5125FC |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$7C4216F7" "PT"@"SEL$1")
      LEADING(@"SEL$7C4216F7" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "T"@"SEL$1")
      USE_HASH(@"SEL$A4A33BE0" "T"@"SEL$1")
      LEADING(@"SEL$A4A33BE0" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "PT"@"SEL$1")
      USE_HASH(@"SET$49E1C21B_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$49E1C21B_3" "PT"@"SEL$1")
      LEADING(@"SET$49E1C21B_3" "FI"@"SEL$1" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "PT"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "FI"@"SEL$1")
      USE_HASH(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      LEADING(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931" "FI"@"SEL$1")
      FULL(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      NO_ACCESS(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931")
      NO_ACCESS(@"SEL$67EF6547" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SET$49E1C21B_2")
      OUTLINE(@"SET$49E1C21B_1")
      OUTLINE(@"SEL$1")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE(@"SET$0E101D56")
      OUTLINE(@"SEL$81DF0931")
      OUTLINE(@"SEL$5AB42CD1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE(@"SET$49E1C21B")
      OUTLINE_LEAF(@"SEL$67EF6547")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE_LEAF(@"SET$0E101D56")
      OUTLINE_LEAF(@"SEL$5FCD2D3C")
      OUTLINE_LEAF(@"SET$FB5125FC")
      OUTLINE_LEAF(@"SEL$A4A33BE0")
      OUTLINE_LEAF(@"SEL$7C4216F7")
      OUTLINE_LEAF(@"SET$49E1C21B_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


This is a wonderful demonstration of how brilliant the optimizer can be. The query has gone through several transformations and two of them have very high visibility. First, you can see the name VW_ORE_67EF6547 at operation 1. This is a view name that Oracle generates to express (cost-based) OR-expansion, so the optimizer has clearly obeyed my hint. As a consequence of OR-expansion we can also see several examples of the lnnvl() function appearing in the Predicate Information section of the output; we can also see the hint re-appearing in the completed form OR_EXPAND(@"SEL$1" (1) (2) (3)) in the Outline Data.

However, we don’t have the union all of three pieces that we might have expected; we have a union all of two pieces and the second piece is a hash join between the table t_further_info and a view called VW_JF_SET$FB5125FC. This view is the result of “join factorization”. The optimizer has taken the 2nd and 3rd sections of our union all view and decided that it would be cost-effective to “factor out” a common table, so this:

select from t_others, t, t_further_info ... where t_others.date_predicate ...
union all
select from t, t_others, t_further_info ... where t.date_predicate ...

changes to this:

select from 
        (
        select from t_others, t   ... where t_others.date_predicate ...
        union all
        select from t, t_others   ... where t.date_predicate ...
        ),
        t_further_info 
where   ...

Having said all that, I now have to change the code because the original query was written using “ANSI” style joins – like this:


explain plan for
select
        /*+ or_expand */
        * 
from
        t   
inner join 
        t_others       pt  
on      pt.id = t.id  
inner join 
        t_further_info fi  
on      fi.id = t.id  
where
        (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )
;

select  * from table(dbms_xplan.display(null,null,'outline'));


In the ANSI example I happen to have used the explicit “OR” list for the date predicates but that’s not really significant. Here’s the plan produced by this query – and the first thing I’d like you to note is the Plan hash value:

Plan hash value: 3309788271

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_31069B60    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$997549B1 |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$D12FC97A" "PT"@"SEL$1")
      LEADING(@"SEL$D12FC97A" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "T"@"SEL$1")
      USE_HASH(@"SEL$09C9729D" "T"@"SEL$1")
      LEADING(@"SEL$09C9729D" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "PT"@"SEL$1")
      USE_HASH(@"SET$E8D85587_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$E8D85587_3" "PT"@"SEL$1")
      LEADING(@"SET$E8D85587_3" "FI"@"SEL$2" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "PT"@"SEL$1")
      FULL(@"SET$E8D85587_3" "FI"@"SEL$2")
      USE_HASH(@"SEL$95B99BAF" "FI"@"SEL$2")
      LEADING(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF" "FI"@"SEL$2")
      FULL(@"SEL$95B99BAF" "FI"@"SEL$2")
      NO_ACCESS(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF")
      NO_ACCESS(@"SEL$31069B60" "VW_ORE_31069B60"@"SEL$31069B60")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      MERGE(@"SEL$1" >"SEL$2")
      OUTLINE(@"SEL$58A6D7F6")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SET$E8D85587_2")
      OUTLINE(@"SET$E8D85587_1")
      MERGE(@"SEL$58A6D7F6" >"SEL$3")
      OUTLINE(@"SEL$9E43CB6E")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE(@"SET$6117B24C")
      OUTLINE(@"SEL$BB7F1ECF")
      OUTLINE(@"SEL$344003E3")
      OR_EXPAND(@"SEL$9E43CB6E" (1) (2) (3))
      OUTLINE(@"SET$E8D85587")
      OUTLINE_LEAF(@"SEL$31069B60")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE_LEAF(@"SET$6117B24C")
      OUTLINE_LEAF(@"SEL$95B99BAF")
      OUTLINE_LEAF(@"SET$997549B1")
      OUTLINE_LEAF(@"SEL$09C9729D")
      OUTLINE_LEAF(@"SEL$D12FC97A")
      OUTLINE_LEAF(@"SET$E8D85587_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FI"."ID"="T"."ID" AND "PT"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Is the plan for the “ANSI” version of the query the same as the plan for the “traditional” version? How carefully have you checked – after simply noting that the two Plan hash values were different?

The plans are the same – in that the mechanics are exactly the same and that ought to be the criterion on which we should judge them. But the hash values are different because of the change from traditional to ANSI syntax. The traditional form of the query starts life with a single query block while the ANSI form is considered to be two query blocks, so the initial ANSI query goes through three stages:


1) select from t1 join t2 join t3

2) select /*+ qb_name(sel$2) */ from (select /*+ qb_name(sel$1) */ from t1, t2), t3

3) select /*+ qb_name(sel$9E43CB6E) */ from t1, t2, t3

So the query is rapidly transformed to the equivalent traditional syntax, but we now have a query block name of SEL$9E43CB6E instead of the SEL$1 that the traditional query (in the absence of a /*+ qb_name() */ hint) would have had. This is why you see the difference in the two or_expand() hints in the Outline Data section. One reads OR_EXPAND(@"SEL$1" (1) (2) (3)), the other reads OR_EXPAND(@"SEL$9E43CB6E" (1) (2) (3)), and all the subsequent query block name differences follow on from this initial transformation. (Note: the value sel$9e43cb6e is generated from the names of the two input query blocks, sel$1 and sel$2.)
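You can see the effect for yourself by running both statements and pulling the hash values back from v$sql – a sketch only, and the like-filter is just one convenient way of finding the two cursors:

select  sql_id, plan_hash_value, substr(sql_text,1,60) txt
from    v$sql
where   sql_text like '%or_expand%t_further_info%'
and     sql_text not like '%v$sql%'
;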

You may decide that this mismatch isn’t really terribly important. If you’ve modified the code to switch to ANSI style joins then you may be prepared to put in a little extra effort to check the plan in detail to see that it hasn’t changed; but it would be a lot nicer if the hash value wasn’t dependent on generated view names. You may recall that at one time Oracle had problems with plans that used materialized CTEs (“with” subqueries) because the plan hash value was affected by object names like sys_temp_0fd9d6791_dfc12da. The same principle ought, I think, to apply here.

If you don’t mind the ANSI/traditional switch, though, you might find that you’re less happy when you upgrade to 19c, because the same effect appears there too, only it’s worse. Not only do “identical” traditional and ANSI plans have different hash values, they don’t match the values from 12c because the generated names for the join factorization views (VW_JF) change in the upgrade. So if you’re depending on SQL Plan Baselines to reproduce 12c plans on 19c when you upgrade you may find cases where you know the stored baseline is giving you the same plan but Oracle thinks it isn’t and refuses to use it.
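If you suspect that has happened, the baseline views will tell you whether the optimizer managed to reproduce a stored plan – again just a sketch, filtering on whatever identifies your statement:

select  sql_handle, plan_name, enabled, accepted, reproduced
from    dba_sql_plan_baselines
where   sql_text like '%t_further_info%'
;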

tl;dr

Plans which are functionally identical can have different plan hash values because the plans were reached through a different series of transformations. In particular, if you rewrite queries from “traditional” Oracle syntax to “ANSI” syntax you will find cases where the plan doesn’t change but the plan hash value does, thanks to a change in the names of views generated by some transformations.

More significantly, if you upgrade from 12c to 19c there are cases where the names of views generated by transformations may change, which could cause the optimizer to discard some of your carefully constructed SQL Plan Baselines as “not reproducible”.

Footnote

For reference, if I add the hint /*+ opt_param('_optimizer_join_factorization' 'false') */ to the “traditional” query then I get the following plan, which shows more clearly the three branches that the original query has been split into – each section starting with a different choice for the driving table:


-------------------------------------------------------------------------------------------
| Id  | Operation               | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                 |     3 |   270 |    19   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547 |     3 |   270 |    19   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                 |       |       |            |          |
|*  3 |    HASH JOIN            |                 |     1 |    55 |     7  (15)| 00:00:01 |
|*  4 |     HASH JOIN           |                 |     1 |    32 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T               |     1 |    19 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL   | T_OTHERS        |     4 |    92 |     2   (0)| 00:00:01 |
|*  8 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|   9 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 10 |      TABLE ACCESS FULL  | T_OTHERS        |     1 |    23 |     2   (0)| 00:00:01 |
|  11 |      BUFFER SORT        |                 |     4 |    52 |     2   (0)| 00:00:01 |
|  12 |       TABLE ACCESS FULL | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|* 13 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
|* 14 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|  15 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 16 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     1 |    13 |     2   (0)| 00:00:01 |
|  17 |      BUFFER SORT        |                 |     3 |    69 |     2   (0)| 00:00:01 |
|* 18 |       TABLE ACCESS FULL | T_OTHERS        |     3 |    69 |     2   (0)| 00:00:01 |
|* 19 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      LEADING(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3"
              "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3")
      USE_HASH(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      LEADING(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2"
              "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2")
      USE_HASH(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      USE_HASH(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      LEADING(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1"
              "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1")
      NO_ACCESS(@"SEL$49E1C21B" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SEL$1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE_LEAF(@"SEL$49E1C21B")
      OUTLINE_LEAF(@"SET$BB614FD2")
      OUTLINE_LEAF(@"SET$BB614FD2_1")
      OUTLINE_LEAF(@"SET$BB614FD2_2")
      OUTLINE_LEAF(@"SET$BB614FD2_3")
      ALL_ROWS
      OPT_PARAM('_optimizer_join_factorization' 'false')
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID")
   4 - access("FI"."ID"="T"."ID")
   5 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   8 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  10 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  13 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  14 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  16 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  18 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  19 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))

88 rows selected.

Although the “traditional” and “ANSI” plans still show a difference in their Plan hash values when join factorization is blocked, the absence of the join factorization view means that the plan hash values are now consistent between 12c and 19c (the output above came from 19.3.0.0, as you can see in the Outline information).

Video : Oracle REST Data Services (ORDS) : OAuth Client Credentials Authorization

Today’s video is a zip through the OAuth Client Credentials Authorization flow in Oracle REST Data Services (ORDS).

For those of you that are afraid of videos, this is one of the authentication and authorization methods discussed in this article.

You can get more information about ORDS here.

The star of today’s video is Øyvind Isene, who is trying to act all cool about being in one of my videos, when in fact I’ve made all his dreams come true.

PostgreSQL subtransactions, savepoints, and exception blocks

TL;DR: similar syntax, but very different transaction semantics, between Oracle and PostgreSQL procedural blocks

I posted a tricky quiz on Twitter (unfortunately I forgot to mention explicitly that I have a unique constraint on DEMO1.N):

The trick is that I didn’t say which database I was running it on. And I deliberately used a syntax that is valid both for Oracle (with the anonymous block in PL/SQL) and PostgreSQL (with the anonymous block in PL/pgSQL).

A compatible syntax does not mean that the semantics are the same. That’s the common issue with people who think that it is easy to port a database application or build a database-agnostic application. You can speak the same language without understanding the same meaning. The specification of each implementation goes beyond the apparent standard syntax.

Exception block with Oracle

db<>fiddle — Oracle: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=f0f4bf1c0e2e91c210e815d2ac67a688

The PL/SQL block runs within an existing transaction and the exception block has nothing to do with the transaction control. This is only about branching to another code path when an exception occurs.

Then, what was previously inserted is still visible in the transaction, and can be committed or rolled back.
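For anyone who doesn’t want to follow the fiddle link, the demo looks roughly like this – a reconstruction, assuming DEMO1(N) has a unique constraint and already contains the value 42, and DEMO2 has the same single column:

begin
  insert into DEMO1 (n) values (1);
  insert into DEMO1 (n) values (42);  -- raises ORA-00001 (duplicate key)
exception when others then
  insert into DEMO2 select * from DEMO1;
end;
/
select * from DEMO2;  -- shows 1 and 42: the first insert survived the exception

Only the failing statement was rolled back; the row inserted just before it is still there when the exception block runs.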

Exception block in Postgres

db<>fiddle — PostgreSQL 12: https://dbfiddle.uk/?rdbms=postgres_12&fiddle=110d82eff25dde2823ff17b4fe9157d9

Here, the PL/pgSQL block runs as an atomic subtransaction. And when an exception is trapped, the whole block is rolled back before executing the exception block. Actually, a block that has an exception handler is run in a “subtransaction”, which is nothing more than setting a savepoint at the start of the block and rolling back to that savepoint when entering the exception block.
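In other words, the handler-equipped block behaves roughly like this hand-written sequence (a sketch, not what the engine literally executes):

begin;
savepoint block_entry;                 -- implicit, on entering the block
insert into DEMO1 (n) values (1);     -- block body
-- ... an error is raised here ...
rollback to savepoint block_entry;    -- implicit, before running the handler
-- the EXCEPTION section now runs, still inside the same transaction
commit;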

This, of course, is documented:

When an error is caught by an EXCEPTION clause, the local variables of the PL/pgSQL function remain as they were when the error occurred, but all changes to persistent database state within the block are rolled back.
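The same block in PL/pgSQL (again assuming DEMO1 already holds 42) shows the difference:

do $$
begin
  insert into DEMO1 (n) values (1);
  insert into DEMO1 (n) values (42);      -- violates the unique constraint
exception when others then
  insert into DEMO2 select * from DEMO1;  -- sees only 42: both inserts were rolled back
end;
$$ language plpgsql;

select * from DEMO2;  -- one row: 42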

In those examples, the exception handler did not raise any error. If I re-raise the error in the exception block, the behavior is the same between Oracle and PostgreSQL: all changes done by the block (including the exception block) are rolled back.

Re-raise in PostgreSQL:

do $$
begin
insert into DEMO1 (n) values (1);
insert into DEMO1 (n) values (42);
exception when others then
insert into DEMO2 select * from DEMO1;
raise;
end;
$$ language plpgsql;

ERROR: duplicate key value violates unique constraint "demo1_n_key"
DETAIL: Key (n)=(42) already exists.
CONTEXT: SQL statement "insert into DEMO1 (n) values (42)"
PL/pgSQL function inline_code_block line 4 at SQL statement
select * from DEMO2;
n
---
(0 rows)

Re-raise in Oracle:

begin
insert into DEMO1 (n) values (1);
insert into DEMO1 (n) values (42);
exception when others then
insert into DEMO2 select * from DEMO1;
raise;
end;
/
ORA-00001: unique constraint (DEMO.SYS_C0093607) violated
ORA-06512: at line 6
ORA-06512: at line 3
select * from DEMO2;
no rows selected.

More info about this behavior in Oracle from Stew Ashton:

Statement-Level Atomicity

Basically, in Oracle, the call to the stored procedure follows statement-level atomicity: an internal savepoint is set before any statement (SQL or PL/SQL) and an unhandled exception (including a re-raised exception) rolls back to it. That’s different in PostgreSQL, where no savepoint is set implicitly and the session has to roll back the whole transaction when an error occurs. The savepoint set before the PL/pgSQL block is only there to roll back changes before executing the exception block.

Postgres transaction control and exception blocks

But then what happens if we commit within the code block? It is then impossible to ensure that “all changes to persistent database state within the block are rolled back”, because what is committed (made visible to others) cannot be rolled back. And that’s the main goal of intermediate commits.

This impossibility is implemented with “ERROR: cannot commit while a subtransaction is active” in spi.c:

https://github.com/postgres/postgres/blob/master/src/backend/executor/spi.c#L220

and all this is, of course, documented with a small statement in https://www.postgresql.org/docs/current/plpgsql-transactions.html:

A transaction cannot be ended inside a block with exception handlers.

The specification for this is also mentioned in the “Transaction control in procedures” hackers thread started by Peter Eisentraut when he proposed the feature:

Re: [HACKERS] Transaction control in procedures

Limitations

As I understand it, this restriction is there to keep the semantics of the subtransaction when an exception block is present: a savepoint at BEGIN and a rollback to that savepoint at EXCEPTION. This semantic specification predates the introduction of transaction control in procedures. However, new requirements to take full advantage of transaction control in procedures have been raised by Bryn Llewellyn (currently YugaByteDB developer advocate, former Oracle PL/SQL product manager): https://github.com/yugabyte/yugabyte-db/issues/2464

These use-cases are about encapsulating the database calls in stored procedures that, then, expose only the microservice API. For security, performance, and portability this API must be database-agnostic, and then:

  • all RDBMS-specific error messages must be trapped and translated to business messages and/or system logs. This must be done in an exception block that also covers the commit, as the commit can fail.
  • serialization errors at commit must be retried on the server, and that must also be done with an exception block that covers the commit.

Another reason to commit in a procedure is during a large bulk operation where we want intermediate commits. We may want to trap exceptions in this case as well, and retry some operations in case of errors.
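Without an exception block, intermediate commits work fine in a procedure – a sketch, with a made-up procedure and table name:

create or replace procedure bulk_load() as $$
begin
  for i in 1 .. 100000 loop
    insert into BIG_TABLE(n) values (i);
    if i % 10000 = 0 then
      commit;  -- legal here: this block has no exception handler
    end if;
  end loop;
end;
$$ language plpgsql;

Called with a simple “call bulk_load();” in autocommit mode, each commit ends one transaction and starts the next.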

If I try to code the commit and the exception, the “cannot commit while a subtransaction is active” error is raised as soon as the “commit” statement is encountered, before even trying to execute it:

create table DEMO(
n integer primary key deferrable initially deferred
);
create or replace procedure my_test(n int) as $$
begin
insert into DEMO(n) values(n);
commit;
exception when others then
raise notice '%',sqlerrm;
end;
$$ language plpgsql;
CREATE PROCEDURE
call my_test(1);
NOTICE: cannot commit while a subtransaction is active
CALL

If I remove the commit, I can catch the exceptions, but then I must handle the commit error in the client:

create or replace procedure my_test(n int) as $$
begin
insert into DEMO(n) values(n);
--commit;
exception when others then
raise notice '%',sqlerrm;
end;
$$ language plpgsql;
CREATE PROCEDURE
call my_test(1);
CALL
call my_test(1);
ERROR: duplicate key value violates unique constraint "demo_pkey"
DETAIL: Key (n)=(1) already exists.

Here the error message does not come from the exception block, but from the end of the command, because I am in autocommit mode. This is more visible from an explicit transaction:

begin transaction;
BEGIN
call my_test(1);
CALL
commit;
ERROR: duplicate key value violates unique constraint "demo_pkey"
DETAIL: Key (n)=(1) already exists.

Evolution

I think that the “A transaction cannot be ended inside a block with exception handlers” specification should be adapted to procedures. In my opinion, a commit should be allowed, ending the subtransaction and starting a new one. What was committed will never be rolled back. When an exception is raised, only the changes since the last commit should be rolled back.

Discussion about this should probably go in this hackers thread:

Re: PL/pgSQL - "commit" illegal in the executable section of a block statement that has an exception section

CBO Oddities – 1

I’ve decided to do a little rewriting and collating so that I can catalogue related ideas in an order that makes for a better narrative. So this is the first in a series of notes designed to help you understand why the optimizer has made a particular choice and why that choice is (from your perspective) a bad one, and what you can do either to help the optimizer find a better plan, or subvert the optimizer and force a better plan.

If you’re wondering why I choose to differentiate between “help the optimizer” and “subvert the optimizer” consider the following examples.

  • A query is joining two tables in the wrong order with a hash join when you know that a nested loop join in the opposite order would be far better, because you know that the data you want is very nicely clustered and there’s a really good index that would make access to that data very efficient. You check the table preferences and discover that the table_cached_blocks preference (see end notes) is at its default value of 1, so you set it to 16 and gather fresh stats on the indexes on the table. Oracle now recognises the effectiveness of this index and changes plan accordingly.
  • The optimizer has done a surprising transformation of a query, aggregating a table before joining to a couple of other tables when you were expecting it to use the joins to eliminate a huge fraction of the data before aggregating it. After a little investigation you find that setting the hidden parameter _optimizer_distinct_placement to false stops this happening. (Both mechanisms are sketched in the code just after this list.)
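For reference, the two mechanisms look something like this – sketches only, with made-up table and index names:

rem  "helping": tell Oracle about clustering it can't see by default
begin
        dbms_stats.set_table_prefs(
                ownname => user, tabname => 'ORDERS',
                pname   => 'TABLE_CACHED_BLOCKS', pvalue => '16'
        );
        dbms_stats.gather_index_stats(user, 'ORDERS_IX1');
end;
/

rem  "subverting": switch a feature off for just this one query
select  /*+ opt_param('_optimizer_distinct_placement' 'false') */
        count(*)
from    orders
;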

You may find the distinction unnecessarily fussy, but I’d call the first example “helping the optimizer” – it gives the optimizer some truthful information about your data that is potentially going to result in better decisions in many different statements – and the second example “subverting the optimizer” – you’ve brute-forced it into not taking a path you didn’t like, but at the same time you may have stopped that feature from appearing in other ways or in other queries. Of course, you might have minimised the impact of setting the parameter by using the opt_param() hint to apply the restriction to just this one query; nevertheless it’s possible that there is a better plan for the query that would have used the feature at some other point if you’d managed to do something to help the optimizer rather than constraining it.

What’s up with the Optimizer

It’s likely that most of the articles will be based around interpreting execution plans since those are the things that tell us what the optimizer thinks will happen when it executes a statement, and within execution plans there are three critical aspects to consider –

  1. the numbers (most particularly Cost and Rows),
  2. the shape of the plan,
  3. the Predicate Information.

I want to use this note to make a couple of points about just the first of the three.

  • First – the estimates on any one line of an execution plan are “per start” of the line; some lines of an execution plan will be called many times in the course of a statement. In many cases the Rows estimate from one line of a plan will dictate the number of times that some other line of the plan will be executed – so a bad estimate of “how much data” can double up as a bad estimate of “how many times”, leading to a plan that looks efficient on paper but does far too much work at run-time. A line in a plan that looks a little inefficient may be fine if it executes only once; a line that looks very efficient may be a disaster if it executes a million times. Being able to read a plan and spot the places where the optimizer has produced a poor estimate of Rows is a critical skill – and there are many reasons why the optimizer produces poor estimates. Being able to spot poor estimates depends fairly heavily on knowing the data, but if you know the generic reasons for the optimizer producing poor estimates you’ve got a head start for recognising and addressing the errors when they appear.
  • Second – Cost is synonymous with Time. For a given instance at a given moment there is a simple, linear relationship between the figure that the optimizer reports for the Cost of a statement (or subsection of a statement) and the Time that the optimizer reports. For many systems (those that have not run the calibrate_io procedure) the Time is simply the Cost multiplied by the time the optimizer thinks it will take to satisfy a single block read request, and the Cost is the optimizer’s estimate of the I/O requirement to satisfy the statement – with a fudge factor introduced to recognise the fact that a “single block” read request ought to complete in less time than a “multiblock” read request. Generally speaking the optimizer will consider many possible plans for a statement and pick the plan with the lowest estimated cost – but there is at least one exception to this rule, and it is an unfortunate weakness in the optimizer that there are many valid reasons why its estimates of Cost/Time are poor. Of course, you will note that the values that Oracle reports for the Time column are only accurate to the second – which isn’t particularly helpful when a single block read typically operates in the range of a few milliseconds. (There’s a small worked example just after this list.)
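As a back-of-envelope example – assuming, hypothetically, that the optimizer is costing a single block read at 12 milliseconds – a plan line with Cost = 17 translates to roughly 0.2 seconds, which is rounded up to a reported Time of 00:00:01:

select  ceil(17 * 12 / 1000) as time_seconds
from    dual
;

TIME_SECONDS
------------
           1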

To a large degree the optimizer’s task boils down to:

  • What’s the volume and scatter of the data I need?
  • What access paths, with what wastage, are available to get to that data?
  • How much time will I spend on I/O reading (and possibly discarding) data to extract the bit I want?

Of course there are other considerations, like the amount of CPU needed for a sort, the potential for I/O as sorts or hash joins spill to disk, the time to handle a round-trip to a remote system, and RAC variations on the basic theme. But for many statements the driving issue is that any bad estimates of “how much data” and “how much (real) I/O” will lead to bad, potentially catastrophic, choices of execution plan. In the next article I’ll list all the different reasons (that I can think of at the time) why the optimizer can produce bad estimates of volume and time.

References for Cost vs. Time

References for table_cached_blocks:

 

Friday Philosophy – Top Ten Influencers in my Technical Career

Earlier this week I was sipping wine late at night and cogitating on what led me to where I am now. Part of that was the more specific topic of what, or rather who, influenced my technical development and career, especially early on. As a little game for myself, I decided to jot down the first 10 names I came up with and did not discard within 5 seconds. And then I excluded those whose influence had been negative!

It’s OK, don’t get your hopes up, you are not on the list.

That list was:

  • Cary Millsap
  • Craig Shallahamer
  • Mike Cox
  • Jonathan Lewis
  • Barry
  • Maria Colgan
  • Steven Feuerstein
  • Rachael Carmichael
  • Tim (OracleBase) Hall
  • Dominic Giles
  • Richard Foote

I really hope you have heard of most of them. I’d be amazed if you know all of them. And yes, there are 11. I was, if you remember, sipping wine late at night. In the morning I looked at the list and thought about refining it or expanding it (and boy, I could expand it to 50 people plus in 10 minutes) but that was not my game. First 10, with very little analysis.

You know what is coming, I’m going to step through the list. I’m so obvious. But the reasons are not always so obvious (though some are, sorry). Remember, I was slightly drunk.

Cary Millsap. I detest Cary Millsap.

I’m joking of course! But a tiny little bit of me always goes “Grrrr” when I think of this man who is fundamentally a really nice person, very bright, and a wonderful presenter. Why? Well, he came up with OFA, the Optimal Flexible Architecture. This specified a logical, sensible way to lay out an Oracle database, its files and the directories they went in, file naming conventions etc. such that the database was performant, easy to navigate, and you could have multiple databases on a server without confusion. And that could have been me! I came up with almost the exact same thing and I was damn proud of it. And 6 months after I came up with it and thought “I need to make a big thing of this and get some credit for my genius” – I came across OFA. I was gutted.

The thing is, Cary was one of the first people I came across who was putting out stuff to help all us DBA types back in the 1990’s.  I am sure I must have seen stuff he did that became the OFA and it influenced me. His OFA was first published a couple of years before I came up with my design, but I had not seen it. We did not really have the internet back then!

Cary did not influence me simply by producing great stuff, he made me realise that several people can come up with similar ideas and, actually, being “first” is nice – but really the key thing is to spread the knowledge. Making the job easier for everyone around you is really doing something for the community. Cary also came up with Method R for performance tuning, which is great, but time to move on.

I sometimes mention I have a decent dose of dyslexia. In my mind Craig is Craig “Shalamar”. His last name is too long for me and I “spin” in the middle of his surname “Shallahamer”. Too many repeated letters (in my mind there are 2 m’s too). Thus when I only knew him from printed materials my brain would stick with the wrong name. Few people were putting out stuff in the early 90’s and because his stuff was so good he was a key, early source of received wisdom for me. Then in the late 90’s he disappeared, or at least from my view he did. But now he’s back and I’ve met him. He is about the only person (him and Kerry Osborne, sorry Kerry) with whom I have been a little hem-touchy (go right to the end of that post), i.e. went “Oh wow! You are blah blah!” when meeting them (follow the link if you want to know what I mean). It’s OK, Craig let me off. I got him a beer. It was a free beer, it was at DOAG! One day I’ll actually buy him a beer to say thank you for all the help he gave me early on. I might even buy him two, but let’s not get too giddy.

Mike Cox is fundamentally a brilliant developer, incredibly smart, and he will never, ever present. It’s not for him. He represents the huge number of very talented IT people you never hear about as they just get on with the job. I worked with Mike when I was at Oracle in the early 90’s and again at the end of the 90’s, when he {grudgingly} told his boss I was not totally useless. His boss knew that was high praise. I remember someone telling Mike his code did not work. Mike replied “Yes it does! I’ve checked it. Twice!”. His code worked. He is one of the few people I know who can write a page of PL/SQL and execute it and it does what he wants, first execution. But that is not what he taught me. He taught me that what we do is create solutions, and the absolute one thing you have to do as a developer is create something the user wants to use, i.e. it makes their working life easier. Everything else is secondary. Thanks Mike.

If you are in the technical core Oracle RDBMS sphere and you do not know who Jonathan Lewis is, I’m stunned. His approach to methodically understanding problems and how Oracle works is second to none. I think there are one or two people as good as Jonathan is but personally I know of no one better. So that is why he influenced me? Well, yes and no. Oracle Names, those top people (and this is true in all disciplines) are people, just like all of us. Very talented but, fundamentally, normal people. Jonathan is a friend, I like chatting to him in the pub and we will discuss bread and chainsaws on twitter. And he has given me advice and help over the years, as a friend, and I very much appreciate that. And if it is not Oracle, sometimes I’m teaching him. If you meet those presenters and writers of good stuff then yes, of course, respect their skill. But don’t hero worship them. Most of them don’t actually like it. Treat them like regular people (because they ARE regular people) and you might make a friend.

I’ve written about Barry before (and no, I can’t for the life of me remember his last name). Barry taught me that you don’t need to be technically the best to be great at what you do. You need to care and you need to be willing to try and you need to be willing to learn. It’s all about attitude. In the little team we were in we had a guy who was technically superb. And no one liked him, as he was an arrogant and unhelpful bugger. Everyone liked Barry and asked him to help. Be like Barry. I try to be like Barry.

SQL Maria (she’ll probably never lose that nickname in the Oracle sphere) used to be the product manager for the optimizer, and I was a performance nerd, so of course I knew of Maria Colgan. The number of times she said to the audience “I’m not technical, I don’t understand this stuff…” and then gave a really good description of that stuff. She was a little liar! She knew her stuff (and still does), you can’t present like that and not know your topic. She was also one of the first product managers in Oracle I started chatting to, both about technical topics and as a friendly face. Oracle Names are just normal people, and Oracle Names From Oracle are just normal people too. Who knew? Maria now looks after In Memory and stuff like that, but if you google her, the top hit is still “Maria Colgan Oracle Optimizer”. I wonder if Nigel Bayliss, who has been the PM for the optimizer for a few years now (and very good he is at it too), has a doll in a drawer with pins in it…

Steven Feurestein. I can’t spell his last name best out of three due to the aforementioned dyslexia. Anyone, and I mean ANYone, who was coding in PL/SQL in the mid 90’s onward probably had/has the Ant Book on their desk, Oracle PL/SQL Programming by Steven. I consumed the first edition of that book, pretty much working it to ruin as I referred to it over the years. I still have it and boy it is tatty. Thanks for that book Steven, and the ones that came after it. However, Steven has influenced me twice. He now works for Oracle, leading the Oracle Developer Advocates team which includes the Ask Tom team. And that’s sort of what I do now, advocate Oracle and the community. Only I don’t really get paid for it. Can I have a job Steven?

{Why did I not pick Tom Kyte? Looking back now he was a massive influence on me, as he was on many others; he should be in the list. But he isn’t. Neither are a lot of excellent people like Arup Nanda, Chris Antognini, Kevin Closson, Uwe Hesse…}

I thought I had written a blog about Rachael Carmichael but it seems I have not. Rachael was really active on the Oracle presenting circuit back in the 90’s and early 2000’s and wrote/contributed to several books. I met her at one of my first UKOUG conferences when I was a presenting newbie. Rachael sort of took me under her wing and not only gave me good advice but also introduced me to several of the really well-known presenters, a lot of whom were in the Oak Table. Both of those things had a big influence on my career.

Rachael then decided she’d had enough of technology, followed a different path, and swapped to working with animals. Because she wanted to. You can change career totally – if the current one has lost its sparkle, go find something else to do. I did not leave the Oracle sphere (I thought about it) but I decided to stop being mostly a technician and become more of an enabler, encouraged by Rachael’s example.

 

ORACLE-BASE must be one of the most visited and highest quality sources of Oracle technical information on the web. If you did not know, Tim Hall writes it all (I think he writes it all. Maybe he has a team held captive in his basement. I’ll ask him). If I need to check syntax or how a feature works, I google it and if an ORACLE-BASE page comes up I go there. Tim’s a great guy and a very good presenter – but don’t let him near an Oracle panel session. And oh boy, don’t let him sit on one! Like me he is a biologist really, so an absolute top, fantastic bloke :-). Tim also has a very philosophical outlook on this Oracle technology bollocks, which I am sure encouraged me to do my Friday Philosophies.

Dominic Giles is a Master Product Manager for the Oracle Database here in the UK. I don’t know what you do to become a Master product manager, maybe just get old? For years Dom has been a real friend to the UKOUG and the conference circuit in general, doing great talks about the core RDBMS, what is new, what can and cannot be done. But the reason he really influenced me is he came to help us when I was working on the human genome project. Most consultants going on-site for a company would never tell the client to “just grow a pair and do it Martin”. Dom did. Bloody brilliant. We did know each other quite well at this point and it was a kick up the arse I needed. Be real with people, it’s so much more effective (if perhaps a little more risky?)



Unfortunately, Richard does not look like this anymore

Finally, and well done for reading this far, is Richard Foote. Actually, I reckon almost no one will have got through this whole list, my wife keeps telling me to split this post into 2 or 3 parts. But Richard will get this far, he wants to know what I say about him and if it includes anything about David Bowie. Richard is a bit of a Bowie fan, as am I. Bowie’s “Black Tie, White Noise” is playing as I type this. What Richard does not know about indexing you don’t need to know. I learnt a lot from him. But then I learnt a lot from many people, so why Richard?
This blog. I stole his layout for mine. In fact, before I changed the graphics for the banner and stretched the format it looked EXACTLY like Richard’s blog. Also, I liked Richard’s presenting style – relaxed, jokey, but with good technical content. I sort of nicked that too. Part of me just wants to be Richard, except for the being Australian bit.

In Defence of Best Practices


The subject of “Best Practices” came up again yesterday in a thread on Twitter. This is a subject that rears its head every so often.

I understand all the arguments against the term “Best Practices”. There isn’t one correct way to do things. If there were it would be the only way, or automatic etc. It’s all situational etc. I really do understand all that. I’ve been in this conversation so many times over the years you wouldn’t believe it. I’ve heard all the various sentences and terms people would prefer to use rather than “Best Practice”, but here’s my answer to all that.

“Best practices are fine. Get over yourself and shut up!”

Tim Hall : 18th October 2019

I’ve said this more politely in many other conversations, including endless email chains etc.

When it comes down to it, people need guidance. A good best practice will give some context to suggest it is a starting point, and will give people directions for further information/investigation, but it’s targeted at people who don’t know enough about what they are doing and need help. Without a best practice they will do something really bad, and when shit happens they will blame the product. A good best practice can be the start of a journey for people.

I agree that the “Always do this because ‘just bloody do it!’” style of best practice is bad, but we all know that…

I just find the whole conversation so elitist. I spend half of my life Googling solutions (mostly non-Oracle stuff) and reading best practices and some of them are really good. Some of them have definitely improved my understanding, and left me in a position where I have a working production system that would otherwise not be working.

I’m sure this post will get a lot of reactions where people try and “explain to me” why I am wrong, and what I’m not understanding about the problems with best practices. As mentioned before, I really do know all that and I think you are wrong, and so do the vast majority of people outside your elitist echo chamber. Want to test that? Try these…

  • Write a post called “Best Practices for {insert subject of your choice}”. It will get more hits than anything else you’ve ever written.
  • Submit a conference session called “Best Practices for {insert subject of your choice}”. Assuming it gets through the paper selection, you will have more bums on seats than you’ve ever had before for that same subject.

Rather than wasting your life arguing about how flawed the term “Best Practices” is, why don’t you just write some good best practices? Show the world how they should be done, and start people on a positive journey. It’s just a term. Seriously. Get over yourself!

Cheers

Tim…

PS. I hope people from yesterday’s tweets don’t think this is directed at them. It’s really not. It’s the subject matter! This really is a subject I’ve revisited so many times over the years…

Updates

Due to repeatedly having to explain myself, here come some points people have raised and my reactions. I’m sure this list will grow as people insist on “educating me” about why I’m wrong.

I prefer “standard” or “normal” to “best”. As I said at the start of the post, I’ve heard just about every potential variation of this, and I just don’t care. They are all the same thing. They are all best practices. It’s just words. Yes, I know what “best” means, but that’s irrelevant. This is a commonly used term in tech and you aren’t getting rid of it, so own it!

I’ve seen people weaponize best practices. OK. So are you saying they won’t weaponize “standard practices” or “normal practices”? They won’t ever say, “So are you telling me you went against normal practices?”. Of course they will. Stupid people/companies will do stupid things regardless of the name.

But it’s not the “best”! Did you even read my post? I’m so tired of this. It’s a best practice to never use hints in SQL. I think that’s pretty solid advice. I do use hints in some SQL, but I always include a comment to explain why. I have deviated from best practice, but documented the reason why. If a person/company wants no deviation from best practice, they can remove it and have shit performance. That’s their choice. I’ve been transparent and explained my deviation. If this is not the way you work, you are wrong, not the best practice.

Most vendor best practice documents are crap. I have some sympathy for this, but I raise tickets against bad documentation, including best practices, and generally the reception to these has been good. The last one was a couple of weeks ago and the company (not Oracle) changed the docs the same day. I always recommend raising an SR/ticket/bug against bad documentation. It doesn’t take much time and you are improving things for yourself and everyone else. I feel like you can’t complain about the quality of the docs if you never point out the faults.



Virtualbox 6.0.14

Virtualbox 6.0.14 was released recently.

The downloads and changelog are in the usual places.

I’ve done the install on Windows 10, macOS Catalina and Oracle Linux 7 hosts with no drama.

If I’m being super picky, the scaling on Windows 10 is kind-of wacky.

[Screenshot: VirtualBox display scaling on Windows 10]

Having said that, it seems loads of apps have weird issues when dealing with a laptop with a 4K panel, and a secondary monitor that’s not 4k (it’s 3440×1440).

I rarely open the GUI, as I do most things with Vagrant (version 2.2.6 is out now), so after this post I will probably forget this was even an issue…

Clustering_Factor

Originally drafted July 2018

“How do you find out what the clustering_factor of an index would be without first creating the index ?”

I’m not sure this is really a question worth asking or answering[1], but since someone asked it (and given the draft date I have no idea who, where, when or why), here’s an answer for simple heap tables in the good old days before Oracle made public the table_cached_blocks preference. It works by sorting the columns you want in the index together with the table rowid, and then comparing the file/block component of the rowid (cutting the relevant characters from the string representation of the rowid) with the previous one to see if the current row is in the same block as the previous row.  If the row is in a different block we count one, otherwise zero.  Finally we sum the ones.

In the demo I’ve copied a little data from view all_objects, with the intention of creating an index on object_name. After running my counting query I’ve then created the index and checked its clustering_factor to show that I’ve got a match.


rem
rem     Script: clustering_factor_est.sql
rem     Author: J.P.Lewis
rem     Dated:  July 2018
rem

create table t1
as
select  *
from    all_objects
where   rownum <= 10000
;

prompt  ====================
prompt  Tablescan with lag()
prompt  ====================

select
        sum(count_chg)
from    (
        select
                case
                        when substrb(rowid,1,15) <> lag(substrb(rowid,1,15),1,'000000000') over(order by object_name, rowid)
                                then 1
                                else 0
                end     count_chg
        from
                t1
        where
                object_name is not null
        )
;

prompt  ======================
prompt  create index and check
prompt  ======================

create index t1_i1 on t1(object_name);

select  clustering_factor
from    user_indexes 
where   table_name = 'T1'
and     index_name = 'T1_I1'
;

Pasting the output from running the above:


Table created.

====================
Tablescan with lag()
====================

SUM(COUNT_CHG)
--------------
          3901

1 row selected.

======================
create index and check
======================

Index created.


CLUSTERING_FACTOR
-----------------
             3901

1 row selected.


This will work for a global index on a partitioned table, but will give meaningless answers for globally partitioned indexes and local indexes. Furthermore it’s not a mechanism that lends itself to calculating the clustering_factor if you’ve set the table_cached_blocks preference to a value other than 1.

[1] Given the amount of work it would take to run the query to check the clustering_factor you might as well create the index (invisible, perhaps, and nologging if that doesn’t cause problems) – which also gives you the option of setting table_cached_blocks and gathering stats to see how the clustering_factor varies.
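As a concrete illustration of that footnote, here’s a minimal sketch – assuming a version where the table_cached_blocks preference is available (12.1, or 11.2.0.4) and re-using the demo objects above (drop the earlier t1_i1 first if you’re re-running the demo):

rem
rem     Sketch only: create the index invisible, set the preference,
rem     then gather index stats to see how clustering_factor changes.
rem

create index t1_i1 on t1(object_name) invisible nologging;

begin
        dbms_stats.set_table_prefs(user, 't1', 'table_cached_blocks', '16');
        dbms_stats.gather_index_stats(user, 't1_i1');
end;
/

select  clustering_factor
from    user_indexes
where   index_name = 'T1_I1'
;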

Update (shortly after posting)

It has occurred to me that if you have a “nosegment” index that you’ve been using to test whether or not the optimizer would use it IF you created it, then the one piece of information that is missing from the data dictionary for that index is its clustering_factor. (For a multi-column index you can get a valid value for distinct_keys by creating a column group on the set of columns – which would otherwise be faked when you create the nosegment index.) This code might allow you to write a clustering_factor to the index definition that would give you a better idea of whether the optimizer would use it if you created it.  (To be tested.)
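If that does work, the obvious mechanism would be dbms_stats.set_index_stats() – the following is an untested sketch (matching the “to be tested” caveat above), plugging in the figure the lag() query reported for the demo data:

begin
        dbms_stats.set_index_stats(
                ownname         => user,
                indname         => 'T1_I1',
                clstfct         => 3901,        -- value from the lag() query above
                no_invalidate   => false
        );
end;
/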

Update (a few days later)

I’ve modified the principal query in three ways:

  • changed the substr() to substrb()
  • changed the extract from the rowid to be (1,15) instead of (7,9) – so that I have the (data_object_id, file_id, block_id) and not just the (file_id, block_id); the sketch after this list shows what those characters hold
  • added a (redundant) predicate  “object_name is not null” to the inline query
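For anyone who wants to check what that 15-character prefix is carrying, dbms_rowid will decompose the rowid for you. A quick sanity-check sketch against the demo table (my query, not part of the original script):

select
        dbms_rowid.rowid_object(rowid)          data_object_id,
        dbms_rowid.rowid_relative_fno(rowid)    file_id,
        dbms_rowid.rowid_block_number(rowid)    block_id,
        substrb(rowid,1,15)                     block_address
from    t1
where   rownum <= 5
;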


Oracle on Azure- Options, Options, Options

I’ve been very busy allocating 60% of my time towards Oracle on Azure migrations.  The biggest challenge right now isn’t getting Oracle on Azure, but keeping my percentage of time allocated to only 60%.

I love Oracle technology – yes, a Microsoft employee who LOVES Microsoft and Azure cloud also LOVES Oracle.  I won’t apologize and I know I’m not the only one.   Every company I’ve worked for, even when I worked at Oracle, required my multi-platform skill set.  I’ve never worked anywhere that had ONLY Oracle, only Linux, only Microsoft, only SQL Server, etc.  I see it as an opportunity for all of us, and I really don’t have the time to worry about whether individuals are upset by it.

The More Things Change…

For a DBA, running Oracle in Azure VM environments isn’t that different from running Oracle on VMs on-premises.  The DBAs and developers that I work with still have their jobs, still work with their favorite tools, and also get the chance to learn new skills such as cloud administration.


Adding a viable cloud such as Azure brings infrastructure, features and applications together into a full ecosystem, removing demands such as maintaining a data center, paying for hardware that will only age, and dealing with scaling that is anything but easy.

When Complex, Automate

To help simplify the process, I am constantly authoring scripts to deploy from the Azure cloud.  I write all my scripts in BASH, so if a customer wants to deploy an Oracle VM, I make it really simple with one script that asks which version of Oracle is desired and then sets everything else up so the DBA can get to the important part of creating a database.

I love that I can work with customers and open up this new world for them.  I really, really do.  Many customers love their Oracle databases and if they want to keep them and have a desire to move to Azure, I can migrate their Oracle databases over with everything else.  I don’t feel pressure to migrate them off of Oracle.

I’ll be meeting with five customers in the next 24 hrs, in person, to present on everything we can support for Oracle on Azure, including the Oracle and Azure cloud partnership, Oracle on Azure VMs, options for Oracle applications and middleware, and architecture changes when moving to the cloud.  It’s going to be AWESOME.




Which Version Number Corresponds Which PeopleTools Object?

Recently somebody asked me "Our “CRM” values in PSLOCK and PSVERSION are growing tremendously and we don’t know why. We will be opening a case with Oracle but … do you know what that “CRM” value is related to? We are not using the CRM product in our PeopleSoft installation."
There is some documentation on the Oracle Support website, but it is not exhaustive and may not be completely up to date.

The page in the PTRef utility that describes the relationship of version numbers to PeopleTools tables is one of the few static pages in the tool.  I have now updated it with the information in the above Oracle support notes, but there are other version numbers.
In the previous blog post, I showed how to increment version numbers before updating PeopleTools objects.  I knew RDM (the Record Definition Manager) is the OBJECTTYPENAME for PSRECDEFN because I worked that out by tracing Application Designer while it saved a record change.  That remains the only completely reliable way to determine the relationship.
However, version number matching is also a useful technique; even when it does not provide a completely definitive answer, it dramatically cuts down the amount of investigation needed.
I have written a little PL/SQL code, delivered with PTRef, that extracts the maximum version number for each PeopleTools table in PeopleSoft (other than the PS%DEL tables) and stores it in a working storage table (I used PLAN_TABLE because it is always there on an Oracle database).  Then you can compare the version number on PSLOCK/PSVERSION with the maximum version on the PeopleTools object.
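The PTRef code is the thing to use, but the shape of the idea is roughly this – a hedged sketch, assuming you are connected as the PeopleSoft owner; the choice of PLAN_TABLE columns is mine:

declare
        l_max   number;
        l_cnt   number;
begin
        for r in (
                select  table_name
                from    user_tab_columns
                where   column_name = 'VERSION'
                and     table_name like 'PS%'
                and     table_name not like 'PS%DEL'
        ) loop
                execute immediate 'select max(version), count(*) from '||r.table_name
                        into l_max, l_cnt;

                insert into plan_table (object_name, position, cardinality)
                values (r.table_name, l_max, l_cnt);
        end loop;
        commit;
end;
/

rem     then list candidate matches against PSVERSION

select  v.objecttypename, p.object_name recname, v.version
from    psversion v, plan_table p
where   p.position = v.version
order by
        v.version
;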
If the version number is 1, you can’t do the match because the version number has never been incremented, at least since it was last reset by the VERSION application engine.
If the version is only updated occasionally you may have some trouble with duplicate matches. In my example, three tables have a maximum version number of 80, while only one version number on PSVERSION is 80.

RECNAME                   MAX        CNT
------------------ ---------- ----------
PSAEAPPLDEFN               80       3504
PSMSGNODEDEFN              80         78
PSOPRVERDFN                80       1468

I would guess that OPRVM matches PSOPRVERDFN, and the above support notes confirm this, but otherwise, you would have to check that manually with trace.

OBJECTTY    VERSION
-------- ----------
OPRVM            80

Higher version numbers are easier to match because they are less likely to have duplicate matches.
So to return to the original question, what is CRM?  In my sample system, version numbers CRM and SCATM are both 3.

OBJECTTY    VERSION
-------- ----------
CRM               3
SCATM             3

However, only PSCONTDEFN has a maximum version of 3.

RECNAME                   MAX        CNT
------------------ ---------- ----------
PSCONTDEFN                  3       7567

Again not a perfect match, but again Oracle Support Note 664848.1 confirms that CRM corresponds to PSCONTDEFN.  CRM stands for Content Registry Manager.
So the question now becomes what is updating the content definitions, and hence increasing the version number?  It turned out to be an entity registry synchronisation process that was run daily.
It is perfectly legitimate for many updated rows on a PeopleTools table to be given the same version number.  The version number does not need to be incremented again for each row being updated, but then the row-level locks on PSVERSION and PSLOCK created by their updates must be held until the update on the PeopleTools table is committed.  That can increase contention on the version number update.  The alternative is to commit after each update and then increment the version numbers again.  Many PeopleSoft processes do exactly that, and it can, in turn, lead to a massive increase in some version numbers.
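To make that trade-off concrete, here is a sketch of the kind of sequence involved – my reconstruction of the general pattern, with an illustrative (made-up) content name and filter column:

update psversion set version = version + 1 where objecttypename in ('SYS','CRM');
update pslock    set version = version + 1 where objecttypename in ('SYS','CRM');

update pscontdefn
set     version = (select version from psversion where objecttypename = 'CRM')
where   contname = 'EXAMPLE_CONTENT';          -- hypothetical content definition

commit;         -- the row locks on PSVERSION/PSLOCK are held until this point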