Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Monitor your lockdown profiles

Most administrators are now aware of the lockdown profile facility in the Oracle database (available since 12c). They allow a more granular level of privilege control of components and functionality within the database. I’m not going to spend more time than that on what they do – this post is just to highlight a small “gotcha” that you need to be aware of when farming out lockdown profiles to pluggable databases.

For example, perhaps one your pluggable databases should not be using UTL_HTTP and you’ve created a lockdown profile called “P1” to handle that. You would then set that profile at the PDB level via something such as:


SQL> conn / as sysdba
Connected.

SQL> alter session set container = pdb1;

Session altered.

SQL> alter system set pdb_lockdown='P1' scope=both;

System altered.

Now we have created a linkage between the existence of a database object (the lockdown profile in DBA_LOCKDOWN_PROFILES) and an initialization parameter in the spfile. What happens if that linkage is broken? Lets find out Smile.

I’ll jump back into the container database and drop the profile.


SQL>  conn / as sysdba
Connected.

SQL>  drop lockdown profile p1;

Lockdown Profile dropped.

Our pluggable database PDB1 now has an invalid initialisation parameter, so what will happen if you try to restart that pluggable database?


SQL> alter session set container = pdb1;

Session altered.

SQL> shutdown
Pluggable Database closed.
SQL> startup
Pluggable Database opened.

As you can see … nothing (bad) happens. Well, nothing immediately visible to you as a DBA that would make you take some reparative action. Administrators may be familiar with the warning message you might get when a pluggable database is opened with the “alter” command, ie


SQL> alter pluggable database MYPD open;

Warning: PDB altered with errors.

but if you try that approach in this instance, no such warning is returned.


SQL> alter pluggable database pdb1 close;

Pluggable database altered.

SQL> alter pluggable database pdb1 open;

Pluggable database altered.

So how do you even know if you are impacted by this? There are two places you can see this issue. First of these is the alert log


PDB1(3):***************************************************************
PDB1(3):WARNING: Pluggable Database PDB1 with pdb id - 3 is
PDB1(3):         altered with errors or warnings. Please look into
PDB1(3):         PDB_PLUG_IN_VIOLATIONS view for more details.
PDB1(3):***************************************************************
2020-01-13T15:49:19.449164+08:00
PDB1(3):Opening pdb with Resource Manager plan: DEFAULT_MAINTENANCE_PLAN
Pluggable database PDB1 opened read write
PDB1(3):Completed: ALTER PLUGGABLE DATABASE  OPEN 

which also tells us the other location that the warning will be logged, namely PDB_PLUG_IN_VIOLATIONS


SQL> select * from PDB_PLUG_IN_VIOLATIONS
  2  @pr
==============================
TIME                          : 13-JAN-20 03.49.17.957000 PM
NAME                          : PDB1
CAUSE                         : Lockdown Profile
TYPE                          : WARNING
ERROR_NUMBER                  : 0
LINE                          : 1
MESSAGE                       : Lockdown Profile P1 set for the PDB does not exist in CDB$ROOT.
STATUS                        : PENDING
ACTION                        : Set the pdb_lockdown parameter in the PDB to a valid lockdown profile 
                                or create a lockdown profile in CDB$ROOT container with the same name 
                                as pdb_lockdown parameter in the PDB.
CON_ID                        : 3

TL;DR

If you are using lockdown profiles for your pluggable database, make sure you keep an eye on PDB_PLUG_IN_VIOLATIONS or the alert log to make sure DBA_LOCKDOWN_PROFILES and the SPFILE entries are in sync.

Vagrant tips'n'tricks: changing /etc/hosts automatically for Oracle Universal Installer

Oracle Universal Installer, or OUI for short, doesn’t at all like it if the hostname resolves to an IP address in the 127.0.0.0/0 range. At best it complains, at worst it starts installing and configuring software only to abort and bury the real cause deep in the logs.

I am a great fan of HashiCorp’s Vagrant as you might have guessed reading some of the previous articles, and as such wanted a scripted solution to changing the hostname to something more sensible before I begin provisioning software. I should probably add that I’m using my own base boxes; the techniques in this post should equally apply to other boxes as well.

Each of the Vagrant VMs I’m creating is given a private network for communication with its peers. This is mainly done to prevent me from having to deal with port forwarding on the NAT device. If you haven’t used Vagrant before you might not know that by default, each Vagrant VM will come up with a single NIC that has to use NAT. The end goal for this post is to ensure that my VM’s hostname maps to the private network’s IP address, not 127.0.0.1 as it would normally do.

Setting the scene

By default, Vagrant doesn’t seem to mess with the hostname of the VM. This can be changed by using a configuration variable. Let’s start with the Vagrantfile for my Oracle Linux 7 box:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

Please ignore the fact that my Vagrantfile is slightly more complex than it needs to be. I do like having speaking names for my VMs, rather than “default” showing up in vagrant status. Using this terminology in the Vagrantfile also makes it easier to add more VMs to the configuration should I so need.

Apart from you just read the only remarkable thing to mention about this file is this line:

    u.vm.hostname = "ol7guest"

As per the Vagrant documentation, I can use this directive to set the hostname of the VM. And indeed, it does:

$ vagrant ssh ol7guest
Last login: Thu Jan 09 21:14:59 2020 from 10.0.2.2
[vagrant@ol7guest ~]$  

The hostname is set, however it resolves to 127.0.0.1 as per /etc/hosts:

[vagrant@ol7guest ~]$ cat /etc/hosts
127.0.0.1    ol7guest    ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 

Not quite what I had in mind, but apparently expected behaviour. So the next step is to change the first line in /etc/hosts to match the private IP address I assigned to the second NIC. As an Ansible fan I am naturally leaning towards using a playbook, but I also understand that not everyone has Ansible installed on the host and using the ansible_local provisioner might take longer than necessary unless your box has Ansible pre-installed.

The remainder of this post deals with an Ansible solution and the least common denominator, the shell provisioner.

Using an Ansible playbook

Many times I’m using Ansible playbooks to deploy software to Vagrant VMs anyway, so embedding a little piece of code into my playbooks to change /etc/hosts isn’t a lot of work. The first step is to amend the Vagrantfile to reference the Ansible provisioner. One possible way to do this in the context of my example is this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "ansible" do |ansible|
      ansible.playbook = "change_etc_hosts.yml"
      ansible.verbose = "v"
    end

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end  

It is mostly the same file with the addition of the call to Ansible. As you can imagine the playbook is rather simple:

---
- hosts: ol7guest
  become: yes
  tasks:
  - name: change /etc/hosts
    lineinfile:
      path: '/etc/hosts'
      regexp: '.*ol7guest.*' 
      line: '192.168.56.204   ol7guest.example.com   ol7guest' 
      backup: yes

It uses the lineinfile module to find lines containing ol7guest and replaces that line with the “correct” IP address. The resulting hosts file is exactly what I need:

[vagrant@ol7guest ~]$ cat /etc/hosts
192.168.56.204   ol7guest.example.com   ol7guest
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
[vagrant@ol7guest ~]$ 

The first line of the original file has been replaced with the private IP which should enable OUI to progress past this potential stumbling block.

Using the shell provisioner

The second solution involves the shell provisioner, which – unlike Ansible – isn’t distribution agnostic and needs to be tailored to the target platform. On Oracle Linux, the following worked for me:

# -*- mode: ruby -*-
# vi: set ft=ruby :

$script = <<-SCRIPT
/usr/bin/cp /etc/hosts /root && \
/usr/bin/sed -ie '/ol7guest/d' /etc/hosts && \
/usr/bin/echo '192.168.56.204 ol7guest.example.com ol7guest' >> /etc/hosts
SCRIPT

Vagrant.configure("2") do |config|
  config.vm.define "ol7guest" do |u|
    # this is a base box I created and stored locally
    u.vm.box = "oracleLinux7Base"

    u.ssh.private_key_path = "/path/to/key"

    u.vm.hostname = "ol7guest"
    u.vm.network "private_network", ip: "192.168.56.204"

    u.vm.provision "shell", inline: $script

    u.vm.provider "virtualbox" do |v|
      v.memory = 2048
      v.name = "ol7guest"
      v.cpus = 1
    end
  end
end 

The script copies /etc/hosts to root’s home directory and then changes it to match my needs. At the end, the file is in exactly the shape I need it to be in.

Summary

Whether you go with the shell provisioner or embed the change to the hostname in an (existing) Ansible playbook doesn’t matter much. I would definitely argue in support of having the code embedded in a playbook if that’s what will provision additional software anyways. If installing Ansible on the host isn’t an option, using the shell as a fallback mechanism is perfectly fine, too. Happy hacking!

Thank You

Regular readers will have noticed that for the last few weeks I’ve been adding a footnote (now deleted) to any new blog posts asking for donations to the Nepal Youth Foundation (UK), with an offer to match (up to a limit) any donations made by my readers.  The page I had set up on JustGiving to collect donations and describe the work of the foundation reached my limit last week, so I sent off my matching cheque on 6th Jan. I got a very prompt email telling me the cheque had arrived the following day, and thanking me for the effort; then yesterday a follow-up thank you letter arrived in the morning post.

I thought I’d share this with you so that you could see that your generosity has been appreciated by an organisation that really cares about the change it can make to the lives of the people who most need their help:

https://jonathanlewis.files.wordpress.com/2020/01/nyf.jpg?w=2048&h=2894 2048w, https://jonathanlewis.files.wordpress.com/2020/01/nyf.jpg?w=106&h=150 106w, https://jonathanlewis.files.wordpress.com/2020/01/nyf.jpg?w=212&h=300 212w, https://jonathanlewis.files.wordpress.com/2020/01/nyf.jpg?w=768&h=1086 768w, https://jonathanlewis.files.wordpress.com/2020/01/nyf.jpg?w=724&h=1024 724w" sizes="(max-width: 1024px) 100vw, 1024px" />

My JustGiving page is still live, but if you feel at any time that you’d like to donate to a worthwhile cause the NYF(UK) has a JustGiving page of its own that you could donate to directly, and for further information about what they do, here’s a link to their main (UK) website. For other countries there’s a link to the US site with information about other international offices.

(this is an answer to Jeff Potter —  “3 Reasons I Hate Booleans In Databases”)

(this is an answer to Jeff Potter — “3 Reasons I Hate Booleans In Databases”)

I’ll start with the “benchmark” because I like facts.

Testcase

Here is your test which I ran on less number of rows (because it is not needed and easier to run and share from db<>fiddle) and I’ve run the queries once before in order to warm-up the cache. And I displayed the execution plan to get better understanding about the response time:

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=d6e7789dfca6e314cc741f16573b030c

What you did here by replacing the 2-state boolean with a N-state timestamp is that you completely confused the query planner heuristics. Look at the second execution plan: 489 rows estimated instead of 49940. And then the optimizer choose a different plan which is not optimal here (220.956 seconds instead of 27.429)

Now, run the same with an analyze so that the cost based optimizer has more info about the number of nulls in your column. You overload the statistics metadata with many unneeded timestamps but at least the estimation is ok:

https://dbfiddle.uk/?rdbms=postgres_12&fiddle=37849052cbf703e83ad53063a3db6c57

Now the estimation is fine and if you go to the db<>fiddle you can see why: the “null_frac” in “pg_stat” shows how many nulls you have. You can see the many “most_common_vals” that are now stored in the dictionary. And they will probably never be useful as your goal is to query on nulls or not nulls only.

Now that you have a correct execution plan you can see that it is exactly the same for your two queries: full table scan, which is the most efficient when reading 50% of the rows. No need to compare the response time: it is exactly the same amount of work done.

A better test would test on two different tables. And vacuum them as in real life you don’t expect 100% rows out of the visibility map. You will see an Index Only Scan and there, of course, very little difference. But anyway, this is not the model you expect in real life. You will probably never create a full index on one boolean column only. Either you want quick access to the few flagged rows, and that’s a partial index. Or you will just combine this boolean column in addition to other columns where you have a selective predicate.

lack critical information

Your physical data model has to store what you need, all what you need, and only what you need. There’s nothing like a generic domain model when going to platform-specific implementation. Your data model is designed for your use-cases. If you have a “Persons” table and you want to know who is married or not, you add this information as a boolean because that’s what you asked to your the user: “check the box if you are married”. You do not store their wedding date (which is actually the timestamp related to the state). And if you want to know when they entered this information, then you have probably a “last_modification” column for the whole record. And anyway, the database stores the state history (for recovery purpose) and can store it automatically for business purposes (triggers, temporal tables,…).

If you need this information, either you rely on what the database provides or you log/audit them. Like what you mention with “state transition logging”. But not for each column and each boolean! If you go that way, then what is the rationale behind storing a timestamp with “ User.is_email_confirmed” and not with “ User.email” to know when they changed their e-mail?

There is overhead everywhere by replacing a simple “True” by a timestamp. The optimizer statistics above was just an example. Think about CPU cycles needed to test a boolean vs. a datatype with calendar semantic. Think about the space it takes in a row, which can then cross the limit where data stays in cache or not (for all levels of cache).

By the way, a boolean can be nullable, which means that it can have 3 values. You may want to store the information that you don’t know yet if the value is true or false. By replacing it with a timestamp, you pervert the semantic of NULL: rather than indicating the absence of value, it now holds the value “False”.

poorly conceived state machines

Your third point is about the data model. Yes, from the relational theory point of view the need for a boolean datatype can be discussed. The boolean state should be implemented by the presence of a row in a fact table. Your first example about “User.is_email_confirmed” should probably go to a table that logs the confirmation (with a timestamp, maybe the IP address of the sender, …). But beyond the theory, let’s be pragmatic. One day, for legal reasons (like GDPR) you will have to remove this logged information and you will still need a boolean to replace what you removed. The boolean then is derived information required in the physical data model for implementation reasons.

Of course, if you need more values, like “Created -> Pending -> Approved -> Completed” in your example, you need another datatype. You suggest a NUMBER but you don’t actually need number semantic (like doing arithmetic on them). It can be a CHAR but you don’t need character semantic (like character set). The best solution depends on the database you use. PostgreSQL has an ENUM datatype. The most important if you use a CHAR or NUMBER is to have a check constraint so that the optimizer knows the valid values when estimating the cardinalities.

Finally

The funny thing is that I’m not advocating for boolean datatypes at all here. I’ve been working 20 years on Oracle which does not have boolean columns and I never have seen the need for it for a table column. A CHAR(1) NOT NULL CHECK IN(‘Y’,’N’) is ok for me. The problem comes with views and resultsets because you need to define the correspondence with the database client program. But Oracle provides PL/SQL to encapsulate database services in stored procedures, and this has booleans (and many non-relational data types).

And sorry for the long answer but I didn’t want to just add a “I disagree on all” without explanation ;)

Collection limitation

The ODC SQL and PL/SQL forum came up with an example a couple of days ago that highlighted an annoying limitation in the optimizer’s handling of table functions. The requirement was for a piece of SQL that would generate “installments” information from a table of contract agreements and insert into another table any installments that were not yet recorded there.

The mechanism to turn a single row of contract data into a set of installments was a (optionally pipelined) table function that involved some business logic that (presumably) dealt with the timing and size of the installments. The final SQL to create the data that needed to be inserted was reported as follows (though it had clearly been somewhat modified):

select 
        y.*,
        trunc(sysdate,'mm'),
        user
from 
        table_a a
join 
        table(
                my_function(
                        a.loan_acct_nbr, 
                        a.start_dt,
                        a.maturity_dt,
                        a.num_of_terms
                )
        ) y
on 
        a.loan_acct_nbr = y.loan_acct_nbr
where 
        a.yr_mnth_cd = to_char(add_months(sysdate,-1),'yyyymm')       -- last month
and     a.loan_typ   = 'ABC'
and     not exists ( 
                select  1 
                from
                        final_load_table l
                where
                        l.loan_acct_nbr = y.loan_acct_nbr
                and     l.yr_mnth_cd    = y.yr_mnth_cd 
        )
;

In this query, table_a is the set of contracts, final_load_table is the set of installments, and my_function() is the pipelined function returning a table of installments derived from the start date, maturity date, and number of installments for a contract. The query needs a “not exists” subquery to eliminate any installments that are already known to the database. Once this query is operating efficiently it could be used either to drive a PL/SQL loop or (generally the better strategy) to do a simple “insert as select”.

We were told that the function would return around 60 rows per contract; that the starting size of the final_load_table would be about 60M rows, and the size of the result set from the initial join would be about 60M or 70M rows (which suggests about 1M rows acquired from table_a).

The owner of this problem seemed to be saying that the query had returned no data after 4 or 5 hours – which suggests (a) the optimizer chose a bad plan and/or (b) the PL/SQL function is working very inefficiently and/or (c) the join had generated a vast amount of data but the effect of the subquery was to discard all of it .

Clearly we needed to see an execution plan (preferably from SQL Monitor) and be given have some idea of how much of the 60M/70M rows predicted for the join would be discarded because it already existed.

The owner did send us an execution plan – which included a very obvious threat and suggested a couple of optimizer errors – but supplied it as a picture rather than a flat text.

https://jonathanlewis.files.wordpress.com/2020/01/collection_plan.png?w=... 150w, https://jonathanlewis.files.wordpress.com/2020/01/collection_plan.png?w=... 300w, https://jonathanlewis.files.wordpress.com/2020/01/collection_plan.png?w=... 768w" sizes="(max-width: 922px) 100vw, 922px" />

You’ll notice, of course, that the tables and columns have changed from the original statement. More significantly, though, there are serious problems with the numbers – the estimated row count for the basic join is only 5,326 rather than 50 Million which, at first sight, is probably why the optimizer has decided that a filter subquery (operation 1) to do an index-only probe (operation 5) is a good way of handling the subquery. Perhaps if the estimates had been a little more accurate (e.g. through the addition of a couple of column groups or, if necessary, by opt_estimate() or cardinality() hints) the subquery would have been unnested and turned into a hash anti-join.

I suggested a quick test of a suitable cardinality() hint – but ran up a little model to check that I’d got the hint right – and found that I had but it wasn’t going to help. So I decided to write up the model (and a possible solution for the owner of the problem) in this blog note.

Here’s the code to create the set of objects I’m going to work with. The naming follows the naming in the original statement of the problem suggested by the owner:


rem
rem     Script:         table_function_plan.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jan 2020
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem

create table table_a (
        loan_acct_nbr   number,
        yr_mnth_cd      varchar2(6),
        start_dt        date,
        maturity_dt     date    ,
        number_of_terms number,
        constraint ta_pk primary key(loan_acct_nbr, yr_mnth_cd)
)
;

execute dbms_random.seed(0)

insert /*+ append */
into    table_a
with generator as (
        select rownum id
        from    dual
        connect by
                level <= 4000 -- > comment to avoid wordpress issue
)
select
        trunc(dbms_random.value(1e9, 2e9)),
        to_char(sysdate-(365-mod(rownum,365)),'yyyymm'),
        sysdate-(365-mod(rownum,365)),
        sysdate+(1500+mod(rownum,365)),
        60
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create table final_load_table_l(
        loan_acct_nbr   number,
        yr_mnth_cd      varchar2(6),
        v1              varchar2(10),
        padding         varchar2(200),
        constraint lt_pk primary key(loan_acct_nbr, yr_mnth_cd)
)
;

insert /*+ append */ into final_load_table_l
with generator as (
        select rownum id
        from    dual
        connect by
                level <= 4000 -- > comment to avoid wordpress issue
)
select
        trunc(dbms_random.value(1e9, 2e9)),
        to_char(sysdate-(365-mod(rownum,365)),'yyyymm'),
        lpad(rownum,10),
        lpad('x',200,'x')
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;


begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'table_a',
                method_opt  => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'final_load_table_l',
                method_opt  => 'for all columns size 1'
        );
end;
/

create type my_row_type as object (
        loan_acct_nbr   number,
        yr_mnth_cd      varchar2(6),
        v1              varchar2(10),
        padding         varchar2(200)
);
/

create type my_table_type as table of my_row_type;
/

create  or replace function my_function (
        i_loan_acct_nbr         in      number,
        i_yr_mnth_cd            in      varchar2,
        i_start_dt              in      date,
        i_maturity_dt           in      date,
        i_number_of_terms       in      number
)       return  my_table_type pipelined
as
begin
        for i in 1..i_number_of_terms loop
                pipe row (
                        my_row_type(
                                i_loan_acct_nbr,
                                to_char(i_start_dt+32*i,'yyyymm'),
                                i,
                                lpad('x',200,'x')
                        )
                );
        end loop;
        return;
end;
/

I was planning to create some large tables – hence the option to generate 16M rows from my generator CTEs – but I discovered the critical problem almost as soon as I had some data and code in place, so I didn’t need to go large.

I’ve had to create an object type and table type in order to create a pipelined function that returns the table type by piping rows of the object type. The data I’ve created, and the way the function generates data probably doesn’t bear much resemblance to the real system of course, but I don’t think it needs to be very close to make the crucial point.

Here’s the text of the select statement the OP wants to run, with the execution plan I got from my data set after running the query and pulling the plan from memory:

alter session set statistics_level = all;

select 
        /*+ find this 1 */
        y.* 
from 
        table_a a, 
        table(my_function(
                a.loan_acct_nbr,
                a.yr_mnth_cd,
                a.start_dt,
                a.maturity_dt,
                a.number_of_terms
        )) y
where 
        a.yr_mnth_cd = '202001'
and     not exists (
                select  
                        /*+ unnest */
                        null
                from    final_load_table_l l
                where   l.loan_acct_nbr = y.loan_acct_nbr
                and     l.yr_mnth_cd    = y.yr_mnth_cd
        )
;

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |      1 |        |  5059K(100)|  14580 |00:00:00.15 |   16330 |
|*  1 |  FILTER                             |             |      1 |        |            |  14580 |00:00:00.15 |   16330 |
|   2 |   NESTED LOOPS                      |             |      1 |   6283K| 10421   (8)|  14580 |00:00:00.11 |     335 |
|*  3 |    TABLE ACCESS FULL                | TABLE_A     |      1 |    769 |    10  (10)|    243 |00:00:00.01 |     297 |
|   4 |    COLLECTION ITERATOR PICKLER FETCH| MY_FUNCTION |    243 |   8168 |    14   (8)|  14580 |00:00:00.10 |      38 |
|*  5 |   INDEX UNIQUE SCAN                 | LT_PK       |  14580 |      1 |     1   (0)|      0 |00:00:00.02 |   15995 |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NULL)
   3 - filter("A"."YR_MNTH_CD"='202001')
   5 - access("L"."LOAN_ACCT_NBR"=:B1 AND "L"."YR_MNTH_CD"=:B2)


I’ve put in a hint to tell the optimizer to unnest the subquery – and it didn’t. Oracle does not ignore hints (unless they’re illegal, or out of context, or the optimizer never got to them, or you’ve found a bug) so why did Oracle appear to ignore this hint? There’s a really nice feature in 19.3 execution plans – you can request a hint report for a plan, and here’s the hint report for this query (ignore the bits about “find this” being an error):


Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (U - Unused (1), E - Syntax error (2))
---------------------------------------------------------------------------
   0 -  SEL$1
         E -  find
         E -  this

   5 -  SEL$3
         U -  unnest / Invalid correlated predicates


I put in an /*+ unnest */ hint to unnest the subquery, and I’ve been told that the predicates are not valid. The only thing about them that could be invalid is that they come from a pipelined function that has returned an object type. The pipelined function does not behave exactly like a table. But this gives me a clue about forcing the unnest to happen – hide the pipelined function inside a non-mergeable view.


select
        /*+ find this 2 */
        v.*
from    (
        select  /*+ no_merge */
                y.* 
        from 
                table_a a, 
                table(my_function(
                        a.loan_acct_nbr,
                        a.yr_mnth_cd,
                        a.start_dt,
                        a.maturity_dt,
                        a.number_of_terms
                )) y
        where 
                a.yr_mnth_cd = '202001'
        )       v
where   not exists (
                select
                        /*+ unnest */
                        null
                from    final_load_table_l l
                where   l.loan_acct_nbr = v.loan_acct_nbr
                and     l.yr_mnth_cd    = v.yr_mnth_cd
        )
/


------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name        | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |             |      1 |        | 10628 (100)|  14580 |00:00:00.12 |     387 |       |       |          |
|*  1 |  HASH JOIN RIGHT ANTI                |             |      1 |   6283K| 10628  (10)|  14580 |00:00:00.12 |     387 |  1878K|  1878K| 2156K (0)|
|   2 |   INDEX FAST FULL SCAN               | LT_PK       |      1 |  10000 |     6  (17)|  10000 |00:00:00.01 |      91 |       |       |          |
|   3 |   VIEW                               |             |      1 |   6283K| 10371   (8)|  14580 |00:00:00.11 |     296 |       |       |          |
|   4 |    NESTED LOOPS                      |             |      1 |   6283K| 10371   (8)|  14580 |00:00:00.10 |     296 |       |       |          |
|*  5 |     TABLE ACCESS FULL                | TABLE_A     |      1 |    769 |    10  (10)|    243 |00:00:00.01 |     296 |       |       |          |
|   6 |     COLLECTION ITERATOR PICKLER FETCH| MY_FUNCTION |    243 |   8168 |    13   (0)|  14580 |00:00:00.10 |       0 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("L"."LOAN_ACCT_NBR"="V"."LOAN_ACCT_NBR" AND "L"."YR_MNTH_CD"="V"."YR_MNTH_CD")
   5 - filter("A"."YR_MNTH_CD"='202001')

I’ve wrapped the basic join between table_a and the table function in an inline view called (unimaginatively) v, and I’ve added a /*+ no_merge */ hint to that inline view. So the main query becomes a select from a non-mergeable view with a “not exists” subquery applied to a couple of view columns. And Oracle thinks that that’s okay – and my unnest hint encouraged the optimizer to use a hash anti-join.

So here’s an option for the owner of the problem – but with a few outstanding questions: will a rewrite of their query in this form do the same, will the performance of the select be good enough, and will the subsequent “insert as select” keep the same driving plan.

There’s one significant detail to worry about: the build table in this hash (anti-)join is going to be constructed from 50M (load_acct_bfr, yr_mnth_cd) pairs – which means the workarea requirement is likely to be about 1.2GB for an optimial (i.e. in-memory) hash join; otherwise the join may spill to disc and do a lot of I/O – probably as a one-pass hash join.

(Did you notice,by the way, that the word storage appeared at operation 3 in the original plan?  That suggests a nice big Exadata box; however, there’s no storage predicate in the Predicate Information section for that operation and you would have thought that lease_type = ‘US_SSP’ would be pushed to storage, so maybe this is a ZFS Pillar backing a less exotic server.)

Conclusion

Some (if not all) types of correlated subqueries behave badly if the correlation predicates involve table functions. But you may be able to work around the issue by hiding part of the query, including the table function, inside a non-mergeable inline view before applying the subquery to the view.

Footnote

When I realised that the table function was causing a problem unnesting I remembered that I had written about a similar problem a few years ago – after searching for a little while I discovered a draft note that I had started in September 2015 but had not quite finished; so I’ll be finishing it off and publishing it some time in the next few days. (Update: now published)

 

 

Automating SQL and PL/SQL Deployments using Liquibase

https://oracle-base.com/blog/wp-content/uploads/2019/12/sqlcl-295x300.png 295w, https://oracle-base.com/blog/wp-content/uploads/2019/12/sqlcl-57x57.png 57w" sizes="(max-width: 189px) 85vw, 189px" />

You’ll have heard me barking on about automation, but one subject that’s been conspicuous by its absence is the automation of SQL and PL/SQL deployments…

I had heard of some products that might work for me, like Flyway and Liquibase, but couldn’t really make up my mind or find the time to start learning them. Next thing I knew, SQLcl got Liquibase built in, so I figured that was probably the decision made for me in terms of product. This also coincided with discussions about making a deployment pipeline for APEX applications, which kind-of focused me. It’s sometimes hard to find the time to learn something when there is not a pressing demand for it…

Despite thinking I would probably be using the SQLcl implentation, I started playing with the regular Liquibase client first. Kind of like starting at grass roots. If you are working in a mixed environment, you might prefer to use the regular client, as it will work with multiple engines.

Once I had found my feet with that, I essentially rewrote the article to use the SQLcl implementation of Liquibase. If you are focused on Oracle, I think this is better than using the standard client.

Both these articles were written more than 3 months ago, but I was holding them back on publishing them for a couple of reasons.

  1. I’m pretty new to this, and I realise some of the ways I’m suggesting to use them do not fall in line with the way I guess many Liquibase users would want to use them. I’m not trying to make out I know better, but I do know what will suit me. I don’t like defining all objects as XML and the Formatted SQL Changelogs don’t look like a natural way to work. I want the developer to do their job in their normal way as much as possible. That means using DDL, DML and PL/SQL scripts.
  2. I thought there was a bug in one aspect of the SQLcl implementation, but thanks to Jeff Smith, I found out it was a problem between my keyboard and seat. <br />
</li></ol></div>

    	  	<div class=

The time has come …

… the walrus said, to speak of many things!

2019 was for me, in many ways, my “Annis Horribilis”. My mother passed away in early January, I suffered through a fair chunk of the year through chronic pain, and in mid December, my father also passed away.

Given all these things, I have made the relatively easy decision that I would retire, and effective January 6th, I left the workforce. I want to thank publicly my manager, Peter Underwood, who had held my position open for me to return to since I went on medical leave on December 5th, 2018.

So this is the last post that I will make on PeteWhoDidNotTweet. In the next week or so, I plan on removing the social media accounts I’ve been using. The website will remain up and running till the next time I’m supposed to renew it. I’ve set it to not renew, so at some stage, GoDaddy will no doubt expire it. If you need anything from the site, I suggest you copy it sooner rather than later.

I’ve had the most incredible time working with Oracle products, but now it is time for me to down tools and relax and enjoy my retirement.

Good luck with whatever you do in the future. It’s been a pleasure working with (and drinking with!) many of you. But for now, it’s goodbye from me, and it’s goodbye from him!

Pete

The post The time has come … appeared first on PeteWhoDidNotTweet.com.

The time has come …

… the walrus said, to speak of many things!

2019 was for me, in many ways, my “Annis Horribilis”. My mother passed away in early January, I suffered through a fair chunk of the year through chronic pain, and in mid December, my father also passed away.

Given all these things, I have made the relatively easy decision that I would retire, and effective January 6th, I left the workforce. I want to thank publicly my manager, Peter Underwood, who had held my position open for me to return to since I went on medical leave on December 5th, 2018.

So this is the last post that I will make on PeteWhoDidNotTweet. In the next week or so, I plan on removing the social media accounts I’ve been using. The website will remain up and running till the next time I’m supposed to renew it. I’ve set it to not renew, so at some stage, GoDaddy will no doubt expire it. If you need anything from the site, I suggest you copy it sooner rather than later.

I’ve had the most incredible time working with Oracle products, but now it is time for me to down tools and relax and enjoy my retirement.

Good luck with whatever you do in the future. It’s been a pleasure working with (and drinking with!) many of you. But for now, it’s goodbye from me, and it’s goodbye from him!

Pete

The post The time has come … appeared first on PeteWhoDidNotTweet.com.

Single Value Column Frequency Histogram Oracle 12c and later

It is hopefully in the meantime well known that Oracle has introduced in version 11g a new algorithm to gather statistics on a table that no longer requires sorting for determining the critical Number Of Distinct Values (NDV) figure - it instead uses a clever "approximate NDV" algorithm which always reads 100% of the table data and therefore in principle generates very accurate statistics. This new algorithm gets used only when the ESTIMATE_PERCENT parameter to the DBMS_STATS.GATHER*STATS calls is left at default or explicitly passed as "DBMS_STATS.AUTO_SAMPLE_SIZE". This new algorithm is also required in case other new features like "Incremental Statistics" should be used.

In 12c Oracle improved this algorithm allowing the generation of Frequency and the new Top Frequency histogram types in a single pass. The new Hybrid histogram type still requires a separate pass.

11g always required a separate pass per histogram to be created - no matter what type of histogram (in 11g there was only Frequency and Height-Balanced) - which resulted in a quite aggressive sampling used for that purpose to minimize the time and resource usage for those separate passes, typically just using 5,500 rows and only sized up in case there were many NULL values (which is a very small sample size for larger tables). Note that this aggressive sampling only applies to the new "approximate NDV" code path - if you specify any explicit ESTIMATE_PERCENT Oracle uses the old code (which requires sorting for determining the NDV figure) and therefore the separate passes required to generate histograms are based on the same sampling percentage as used for the basic table and column statistics - actually Oracle can create a Global Temporary Table in a separate pass covering the required data from several columns in this case to avoid repeatedly scanning the table again and again.

I've recently came across an edge case at a client that showed that the new code has a flaw in the special case of columns that only have a single value on table or (sub)partition level.

First of all in my opinion in this special case of a column having only a single value a (Frequency) histogram doesn't add any value - everything required can be determined from the basic column statistics anyway - low and high value are the same, NDV is 1 and the number of NULLs should be all that is needed for proper cardinality estimates on such columns.

Now the new code path seems to be quite happy to generate histograms on all kinds of columns with low number of distinct values, be it useful or not. Since starting with 12c these histograms should all be covered by the main pass - since they can be represented by Frequency histograms when using the default bucket size of 254 - arguably there isn't much overhead in creating them, so why bother.

However, there is a flaw in the code: When the column has just a single value, then the code for some (unknown) reason determines that it requires a separate pass to generate a histogram and doesn't make use of the information already gathered as part of the main pass - which should hold everything needed.

So Oracle runs a separate pass to gather information for this histogram. Usually this doesn't make much difference, but this separate pass is no longer using the 100% data but resorts to the aggressive sampling as outlined above - if applicable. So usually it might just take those 5,500 rows to create a Frequency histogram on this single value column.

But in the edge case of such a single valued column that is NULL for the majority rows, the code recognizes this and no longer uses the aggressive sampling. Instead - probably depending on the number of NULLs - it needs to read a larger proportion of the table to find some non-NULL data.

In the case of my client this was a very large table, had numerous of such special edge case columns (single valued, NULL for most of the rows) which resulted in dozens of non-sampled full table scans of this very large table taking several days to complete.

When enabling the DBMS_STATS specific tracing the behaviour can be reproduced on the latest available versions (19.3 in my case here) - I've used the following test case to test four different scenarios and how the code behaved:

#eeeeee; border: 1px dashed rgb(204 , 204 , 204); overflow: auto;">set echo on serveroutput on size unlimited lines 800 long 800

select * from v$version;

exec dbms_stats.set_global_prefs('TRACE', 1048575)

-- Testcase 1: Single distinct value, almost all values are NULL except three rows
-- This scenario triggers a separate query to generate a histogram using no (!!) sampling
drop table test_stats purge;

create table test_stats
as
select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
rownum as id,
case when rownum <= 3 then 1 else null end as almost_null
from
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
;

exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')

-- Testcase 2: Single distinct value, but only three rows are NULL
-- This scenario triggers a separate query to generate a histogram but using the usual aggressive sampling (5500 rows)
drop table test_stats purge;

create table test_stats
as
select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
rownum as id,
case when rownum > 3 then 1 else null end as almost_null
from
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
;

exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')

-- Testcase 3: 11 distinct values, but only three rows are NULL
-- This scenario extracts the values directly from the ROWIDs returned from the main query, no separate query
drop table test_stats purge;

create table test_stats
as
select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
rownum as id,
case when rownum > 3 then trunc(rownum / 100000) else null end as almost_null
from
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
;

exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')

-- Testcase 4: Three distinct values in three rows and all other rows have NULL
-- This scenario extracts the values directly from the ROWIDs returned from the main query, no separate query
-- Applies to two distinct values, too
-- So a single distinct value looks like a special case that triggers a separate query
-- If this is with combination of almost all rows having NULLs this query doesn't use sampling
-- ! Big threat if the table is big !
drop table test_stats purge;

create table test_stats
as
select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
rownum as id,
case when rownum <= 2 then rownum else null end as almost_null
from
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
(select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
;

exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')

And the output from 19.3 looks like this - the critical parts highlighted in red and bold:

#eeeeee; border: 1px dashed rgb(204 , 204 , 204); overflow: auto;">SQL*Plus: Release 11.2.0.1.0 Production on Mon Aug 19 17:07:42 2019

Copyright (c) 1982, 2010, Oracle. All rights reserved.


Verbunden mit:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production


USERNAME INST_NAME HOST_NAME SID SERIAL# VERSION STARTED SPID OPID CPID SADDR PADDR
-------------------- ------------ ------------------------- ----- -------- ---------- -------- --------------- ----- --------------- ---------------- ----------------
CBO_TEST orcl19c DELLXPS13 394 51553 19.0.0.0.0 20190819 4192 59 6232:10348 00007FFDE6AE6478 00007FFDE6573A48



SQL>
SQL> select * from v$version;

BANNER BANNER_FULL BANNER_LEGACY CON_ID
-------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ----------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production 0
Version 19.3.0.0.0


SQL>
SQL> exec dbms_stats.set_global_prefs('TRACE', 1048575)

PL/SQL-Prozedur erfolgreich abgeschlossen.

SQL>
SQL> -- Testcase 1: Single distinct value, almost all values are NULL except three rows
SQL> -- This scenario triggers a separate query to generate a histogram using no (!!) sampling
SQL> drop table test_stats purge;
drop table test_stats purge
*
FEHLER in Zeile 1:
ORA-00942: Tabelle oder View nicht vorhanden


SQL>
SQL> create table test_stats
2 as
3 select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
4 rownum as id,
5 case when rownum <= 3 then 1 else null end as almost_null
6 from
7 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
8 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
9 ;

Tabelle wurde erstellt.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')
DBMS_STATS: Record gather table stats operation on table : test_stats
DBMS_STATS: job name:
DBMS_STATS: |--> Operation id: 223
DBMS_STATS: gather_table_stats:
DBMS_STATS: Start gather table stats -- tabname: test_stats
DBMS_STATS: job name:
DBMS_STATS: Started table CBO_TEST.TEST_STATS. at 19.08.19 17:07:43,293000000 +02:00. Execution phase: SERIAL (NON-CONCURRENT) stattype: DATA Reporting Mode: FALSE
DBMS_STATS: Specified granularity = AUTO, Fixed granularity = GLOBAL AND PARTITION
DBMS_STATS: parameters ** pfix: (parent pfix: ) ownname: tabname: test_stats partname: estimate_percent: 101 block_sample: FALSE method_opt: for columns size 1 id for columns size 254 almost_null degree: 32766 granularity: Z gIdxGran: cascade: NULL stattab: statid: statown: no_invalidate: NULL flush_colu: TRUE fxt: stattype: DATA start_time: 08-19-2019 17:07:43 gathering_group_stats: FALSE cms_only: FALSE force: FALSE options: Z executionPhase: SERIAL (NON-CONCURRENT) incremental: FALSE concurrent: FALSE loc_degree: loc_method_opt: for columns size 1 id for columns size 254 almost_null fxt_typ: 0 tobjn: 72916
DBMS_STATS: Preferences for table CBO_TEST.TEST_STATS
DBMS_STATS: ================================================================================
DBMS_STATS: SKIP_TIME -
DBMS_STATS: STATS_RETENTION -
DBMS_STATS: MON_MODS_ALL_UPD_TIME -
DBMS_STATS: SNAPSHOT_UPD_TIME -
DBMS_STATS: TRACE - 0
DBMS_STATS: DEBUG - 0
DBMS_STATS: SYS_FLAGS - 0
DBMS_STATS: SPD_RETENTION_WEEKS - 53
DBMS_STATS: CASCADE - DBMS_STATS.AUTO_CASCADE
DBMS_STATS: ESTIMATE_PERCENT - DBMS_STATS.AUTO_SAMPLE_SIZE
DBMS_STATS: DEGREE - NULL
DBMS_STATS: METHOD_OPT - FOR ALL COLUMNS SIZE AUTO
DBMS_STATS: NO_INVALIDATE - DBMS_STATS.AUTO_INVALIDATE
DBMS_STATS: GRANULARITY - AUTO
DBMS_STATS: PUBLISH - TRUE
DBMS_STATS: STALE_PERCENT - 10
DBMS_STATS: APPROXIMATE_NDV - TRUE
DBMS_STATS: APPROXIMATE_NDV_ALGORITHM - REPEAT OR HYPERLOGLOG
DBMS_STATS: ANDV_ALGO_INTERNAL_OBSERVE - FALSE
DBMS_STATS: INCREMENTAL - FALSE
DBMS_STATS: INCREMENTAL_INTERNAL_CONTROL - TRUE
DBMS_STATS: AUTOSTATS_TARGET - AUTO
DBMS_STATS: CONCURRENT - OFF
DBMS_STATS: JOB_OVERHEAD_PERC - 1
DBMS_STATS: JOB_OVERHEAD - -1
DBMS_STATS: GLOBAL_TEMP_TABLE_STATS - SESSION
DBMS_STATS: ENABLE_TOP_FREQ_HISTOGRAMS - 3
DBMS_STATS: ENABLE_HYBRID_HISTOGRAMS - 3
DBMS_STATS: TABLE_CACHED_BLOCKS - 1
DBMS_STATS: INCREMENTAL_LEVEL - PARTITION
DBMS_STATS: INCREMENTAL_STALENESS - ALLOW_MIXED_FORMAT
DBMS_STATS: OPTIONS - GATHER
DBMS_STATS: GATHER_AUTO - AFTER_LOAD
DBMS_STATS: STAT_CATEGORY - OBJECT_STATS, REALTIME_STATS
DBMS_STATS: SCAN_RATE - 0
DBMS_STATS: GATHER_SCAN_RATE - HADOOP_ONLY
DBMS_STATS: PREFERENCE_OVERRIDES_PARAMETER - FALSE
DBMS_STATS: AUTO_STAT_EXTENSIONS - OFF
DBMS_STATS: WAIT_TIME_TO_UPDATE_STATS - 15
DBMS_STATS: ROOT_TRIGGER_PDB - FALSE
DBMS_STATS: COORDINATOR_TRIGGER_SHARD - FALSE
DBMS_STATS: MAINTAIN_STATISTICS_STATUS - FALSE
DBMS_STATS: AUTO_TASK_STATUS - OFF
DBMS_STATS: AUTO_TASK_MAX_RUN_TIME - 3600
DBMS_STATS: AUTO_TASK_INTERVAL - 900
DBMS_STATS: reporting_man_log_task: target: "CBO_TEST"."TEST_STATS" objn: 72916 auto_stats: FALSE status: IN PROGRESS ctx.batching_coeff: 0
DBMS_STATS: Start construct analyze using SQL .. Execution Phase: SERIAL (NON-CONCURRENT) granularity: GLOBAL AND PARTITION global_requested: NULL pfix:
DBMS_STATS: (Baseline)
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Start gather part -- ctx.conc_ctx.global_requested: NULL gran: GLOBAL AND PARTITION execution phase: 1
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Start gather_stats.. pfix: ownname: CBO_TEST tabname: TEST_STATS pname: spname: execution phase: 1
Specified DOP=1 blocks=1520 DOP used=1
Specified DOP=1 blocks=1520 DOP used=1
DBMS_STATS: Using approximate NDV pct=0
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y Y Y Y Y ID
DBMS_STATS: Y Y Y Y Y Y Y Y ALMOST_NULL
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Approximate NDV Options
DBMS_STATS: NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U
DBMS_STATS: Starting query at 19.08.19 17:07:43,386000000 +02:00
DBMS_STATS: select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */to_char(count("ID")),substrb(dump(min("ID"),16,0,64),1,240),substrb(dump(max("ID"),16,0,64),1,240),to_char(count("ALMOST_NULL")),substrb(dump(min("ALMOST_NULL"),16,0,64),1,240),substrb(dump(max("ALMOST_NULL"),16,0,64),1,240),count(rowidtochar(rowid)) from "CBO_TEST"."TEST_STATS" t /* NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U*/
DBMS_STATS: Ending query at 19.08.19 17:07:43,589000000 +02:00
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Approximate NDV Result
DBMS_STATS:

0
1000000
1000000
6
1001984
1000000
3979802


1
Typ=2 Len=2: c1,2


2
Typ=2 Len=2: c4,2


3
3
0
1
3
6
1
AAARzUAAPAAAACBAAA,3,


4
Typ=2 Len=2: c1,2



5
Typ=2 Len=2: c1,2



DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min:
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min:
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 6 3 NULL NULL NULL 1 NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: start processing top n values for column "ALMOST_NULL"
DBMS_STATS: >> frequency histograms is not feasible
(dbms_lob.getlength(topn_values) <=
length(
)), skip!
DBMS_STATS: no histogram: setting density to 1/ndv (,000001)
DBMS_STATS: no histogram: setting density to 1/ndv (1)
DBMS_STATS: Iteration: 1 numhist: 1
DBMS_STATS: Iteration 1, percentage 100 nblks: 1520
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y ID
DBMS_STATS: Y Y Y Y ALMOST_NULL
Specified DOP=1 blocks=1520 DOP used=1
DBMS_STATS: Building Histogram for ALMOST_NULL
DBMS_STATS: bktnum=254, nnv=3, snnv=3, sndv=1, est_ndv=1, mnb=254
DBMS_STATS: Trying frequency histogram
DBMS_STATS: Starting query at 19.08.19 17:07:43,652000000 +02:00
DBMS_STATS: select substrb(dump(val,16,0,64),1,240) ep, cnt from (select /*+ no_expand_table(t) index_rs(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */"ALMOST_NULL" val,count(*) cnt from "CBO_TEST"."TEST_STATS" t where "ALMOST_NULL" is not null group by "ALMOST_NULL") order by val
DBMS_STATS: Evaluating frequency histogram for col: "ALMOST_NULL"
DBMS_STATS: number of values = 1, max # of buckects = 254, pct = 100, ssize = 3
DBMS_STATS: csr.hreq: 1 Histogram gathering flags: 1031
DBMS_STATS: Start fill_cstats - hybrid_enabled: TRUE
DBMS_STATS: ====================================================================================================
DBMS_STATS: Statistics from clist for CBO_TEST.TEST_STATS:
DBMS_STATS: ====================================================================================================
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 4,979802 1000000 0 1000000 1000000 1000000 ,000001 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 1,000006 3 999997 3 1 1 ,16666666 1030 1 1
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: Histograms:
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: BVAL RPCNT EAVAL ENVAL EDVAL
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: 3 C102 1 Typ=2 Len=2: c1,2
DBMS_STATS: Need Actual Values (DSC_EAVS)
DBMS_STATS: Histogram Type: FREQUENCY Data Type: 2
DBMS_STATS: Histogram Flags: 4100 Histogram Gathering Flags: 1030
DBMS_STATS: Incremental: FALSE Fix Control 13583722: 1
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: Number of rows in the table = 1000000, blocks = , average row length = 6, chain count = , scan rate = 0, sample size = 1000000, cstats.count = 3, cind = 3
DBMS_STATS: prepare reporting structures...
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ID part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ALMOST_NULL part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: End of construct analyze using sql.
DBMS_STATS: Finished table CBO_TEST.TEST_STATS. at 19.08.19 17:07:43,761000000 +02:00
DBMS_STATS: reporting_man_update_task: objn: 72916 auto_stats: FALSE status: COMPLETED ctx.batching_coeff: 0

PL/SQL-Prozedur erfolgreich abgeschlossen.

SQL>
SQL> -- Testcase 2: Single distinct value, but only three rows are NULL
SQL> -- This scenario triggers a separate query to generate a histogram but using the usual aggressive sampling (5500 rows)
SQL> drop table test_stats purge;

Tabelle wurde geloscht.

SQL>
SQL> create table test_stats
2 as
3 select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
4 rownum as id,
5 case when rownum > 3 then 1 else null end as almost_null
6 from
7 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
8 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
9 ;

Tabelle wurde erstellt.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')
DBMS_STATS: Record gather table stats operation on table : test_stats
DBMS_STATS: job name:
DBMS_STATS: |--> Operation id: 224
DBMS_STATS: gather_table_stats:
DBMS_STATS: Start gather table stats -- tabname: test_stats
DBMS_STATS: job name:
DBMS_STATS: Started table CBO_TEST.TEST_STATS. at 19.08.19 17:07:44,511000000 +02:00. Execution phase: SERIAL (NON-CONCURRENT) stattype: DATA Reporting Mode: FALSE
DBMS_STATS: Specified granularity = AUTO, Fixed granularity = GLOBAL AND PARTITION
DBMS_STATS: parameters ** pfix: (parent pfix: ) ownname: tabname: test_stats partname: estimate_percent: 101 block_sample: FALSE method_opt: for columns size 1 id for columns size 254 almost_null degree: 32766 granularity: Z gIdxGran: cascade: NULL stattab: statid: statown: no_invalidate: NULL flush_colu: TRUE fxt: stattype: DATA start_time: 08-19-2019 17:07:44 gathering_group_stats: FALSE cms_only: FALSE force: FALSE options: Z executionPhase: SERIAL (NON-CONCURRENT) incremental: FALSE concurrent: FALSE loc_degree: loc_method_opt: for columns size 1 id for columns size 254 almost_null fxt_typ: 0 tobjn: 72920
DBMS_STATS: Preferences for table CBO_TEST.TEST_STATS
DBMS_STATS: ================================================================================
DBMS_STATS: SKIP_TIME -
DBMS_STATS: STATS_RETENTION -
DBMS_STATS: MON_MODS_ALL_UPD_TIME -
DBMS_STATS: SNAPSHOT_UPD_TIME -
DBMS_STATS: TRACE - 0
DBMS_STATS: DEBUG - 0
DBMS_STATS: SYS_FLAGS - 0
DBMS_STATS: SPD_RETENTION_WEEKS - 53
DBMS_STATS: CASCADE - DBMS_STATS.AUTO_CASCADE
DBMS_STATS: ESTIMATE_PERCENT - DBMS_STATS.AUTO_SAMPLE_SIZE
DBMS_STATS: DEGREE - NULL
DBMS_STATS: METHOD_OPT - FOR ALL COLUMNS SIZE AUTO
DBMS_STATS: NO_INVALIDATE - DBMS_STATS.AUTO_INVALIDATE
DBMS_STATS: GRANULARITY - AUTO
DBMS_STATS: PUBLISH - TRUE
DBMS_STATS: STALE_PERCENT - 10
DBMS_STATS: APPROXIMATE_NDV - TRUE
DBMS_STATS: APPROXIMATE_NDV_ALGORITHM - REPEAT OR HYPERLOGLOG
DBMS_STATS: ANDV_ALGO_INTERNAL_OBSERVE - FALSE
DBMS_STATS: INCREMENTAL - FALSE
DBMS_STATS: INCREMENTAL_INTERNAL_CONTROL - TRUE
DBMS_STATS: AUTOSTATS_TARGET - AUTO
DBMS_STATS: CONCURRENT - OFF
DBMS_STATS: JOB_OVERHEAD_PERC - 1
DBMS_STATS: JOB_OVERHEAD - -1
DBMS_STATS: GLOBAL_TEMP_TABLE_STATS - SESSION
DBMS_STATS: ENABLE_TOP_FREQ_HISTOGRAMS - 3
DBMS_STATS: ENABLE_HYBRID_HISTOGRAMS - 3
DBMS_STATS: TABLE_CACHED_BLOCKS - 1
DBMS_STATS: INCREMENTAL_LEVEL - PARTITION
DBMS_STATS: INCREMENTAL_STALENESS - ALLOW_MIXED_FORMAT
DBMS_STATS: OPTIONS - GATHER
DBMS_STATS: GATHER_AUTO - AFTER_LOAD
DBMS_STATS: STAT_CATEGORY - OBJECT_STATS, REALTIME_STATS
DBMS_STATS: SCAN_RATE - 0
DBMS_STATS: GATHER_SCAN_RATE - HADOOP_ONLY
DBMS_STATS: PREFERENCE_OVERRIDES_PARAMETER - FALSE
DBMS_STATS: AUTO_STAT_EXTENSIONS - OFF
DBMS_STATS: WAIT_TIME_TO_UPDATE_STATS - 15
DBMS_STATS: ROOT_TRIGGER_PDB - FALSE
DBMS_STATS: COORDINATOR_TRIGGER_SHARD - FALSE
DBMS_STATS: MAINTAIN_STATISTICS_STATUS - FALSE
DBMS_STATS: AUTO_TASK_STATUS - OFF
DBMS_STATS: AUTO_TASK_MAX_RUN_TIME - 3600
DBMS_STATS: AUTO_TASK_INTERVAL - 900
DBMS_STATS: reporting_man_log_task: target: "CBO_TEST"."TEST_STATS" objn: 72920 auto_stats: FALSE status: IN PROGRESS ctx.batching_coeff: 0
DBMS_STATS: Start construct analyze using SQL .. Execution Phase: SERIAL (NON-CONCURRENT) granularity: GLOBAL AND PARTITION global_requested: NULL pfix:
DBMS_STATS: (Baseline)
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Start gather part -- ctx.conc_ctx.global_requested: NULL gran: GLOBAL AND PARTITION execution phase: 1
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Start gather_stats.. pfix: ownname: CBO_TEST tabname: TEST_STATS pname: spname: execution phase: 1
Specified DOP=1 blocks=1794 DOP used=1
Specified DOP=1 blocks=1794 DOP used=1
DBMS_STATS: Using approximate NDV pct=0
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y Y Y Y Y ID
DBMS_STATS: Y Y Y Y Y Y Y Y ALMOST_NULL
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Approximate NDV Options
DBMS_STATS: NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U
DBMS_STATS: Starting query at 19.08.19 17:07:44,527000000 +02:00
DBMS_STATS: select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */to_char(count("ID")),substrb(dump(min("ID"),16,0,64),1,240),substrb(dump(max("ID"),16,0,64),1,240),to_char(count("ALMOST_NULL")),substrb(dump(min("ALMOST_NULL"),16,0,64),1,240),substrb(dump(max("ALMOST_NULL"),16,0,64),1,240),count(rowidtochar(rowid)) from "CBO_TEST"."TEST_STATS" t /* NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U*/
DBMS_STATS: Ending query at 19.08.19 17:07:44,730000000 +02:00
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Approximate NDV Result
DBMS_STATS:

0
1000000
1000000
6
1001984
1000000
3979802


1
Typ=2 Len=2: c1,2


2
Typ=2 Len=2: c4,2


3
999997
0
1
999997
1999994
1
AAARzYAAPAAAACBAAD,999997,


4
Typ=2 Len=2: c1,2



5
Typ=2 Len=2: c1,2



DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min:
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min:
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 1999994 999997 NULL NULL NULL 1 NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: start processing top n values for column "ALMOST_NULL"
DBMS_STATS: >> frequency histograms is not feasible
(dbms_lob.getlength(topn_values) <=
length(
)), skip!
DBMS_STATS: no histogram: setting density to 1/ndv (,000001)
DBMS_STATS: no histogram: setting density to 1/ndv (1)
DBMS_STATS: Iteration: 1 numhist: 1
DBMS_STATS: Iteration 1, percentage ,5500016500049500148500445501336504009512 nblks: 1794
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y ID
DBMS_STATS: Y Y Y Y ALMOST_NULL
Specified DOP=1 blocks=1794 DOP used=1
DBMS_STATS: Building Histogram for ALMOST_NULL
DBMS_STATS: bktnum=254, nnv=999997, snnv=5500, sndv=1, est_ndv=1, mnb=254
DBMS_STATS: Trying frequency histogram
DBMS_STATS: Starting query at 19.08.19 17:07:44,730000000 +02:00
DBMS_STATS: select substrb(dump(val,16,0,64),1,240) ep, cnt from (select /*+ no_expand_table(t) index_rs(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */"ALMOST_NULL" val,count(*) cnt from "CBO_TEST"."TEST_STATS" sample ( .5500016500) t where "ALMOST_NULL" is not null group by "ALMOST_NULL") order by val
DBMS_STATS: Evaluating frequency histogram for col: "ALMOST_NULL"
DBMS_STATS: number of values = 1, max # of buckects = 254, pct = ,5500016500049500148500445501336504009512, ssize = 5499
DBMS_STATS: csr.hreq: 1 Histogram gathering flags: 1031
DBMS_STATS: Start fill_cstats - hybrid_enabled: TRUE
DBMS_STATS: ====================================================================================================
DBMS_STATS: Statistics from clist for CBO_TEST.TEST_STATS:
DBMS_STATS: ====================================================================================================
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 4,979802 1000000 0 1000000 1000000 1000000 ,000001 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 2,999994 999997 3 5499 1 1 ,00000050 1030 1 1
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,2
DBMS_STATS: Histograms:
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: BVAL RPCNT EAVAL ENVAL EDVAL
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: 5499 C102 1 Typ=2 Len=2: c1,2
DBMS_STATS: Need Actual Values (DSC_EAVS)
DBMS_STATS: Histogram Type: FREQUENCY Data Type: 2
DBMS_STATS: Histogram Flags: 4100 Histogram Gathering Flags: 1030
DBMS_STATS: Incremental: FALSE Fix Control 13583722: 1
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: Number of rows in the table = 1000000, blocks = , average row length = 8, chain count = , scan rate = 0, sample size = 1000000, cstats.count = 3, cind = 3
DBMS_STATS: prepare reporting structures...
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ID part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ALMOST_NULL part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: End of construct analyze using sql.
DBMS_STATS: Finished table CBO_TEST.TEST_STATS. at 19.08.19 17:07:44,761000000 +02:00
DBMS_STATS: reporting_man_update_task: objn: 72920 auto_stats: FALSE status: COMPLETED ctx.batching_coeff: 0

PL/SQL-Prozedur erfolgreich abgeschlossen.

SQL>
SQL> -- Testcase 3: 11 distinct values, but only three rows are NULL
SQL> -- This scenario extracts the values directly from the ROWIDs returned from the main query, no separate query
SQL> drop table test_stats purge;

Tabelle wurde geloscht.

SQL>
SQL> create table test_stats
2 as
3 select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
4 rownum as id,
5 case when rownum > 3 then trunc(rownum / 100000) else null end as almost_null
6 from
7 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
8 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
9 ;

Tabelle wurde erstellt.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')
DBMS_STATS: Record gather table stats operation on table : test_stats
DBMS_STATS: job name:
DBMS_STATS: |--> Operation id: 225
DBMS_STATS: gather_table_stats:
DBMS_STATS: Start gather table stats -- tabname: test_stats
DBMS_STATS: job name:
DBMS_STATS: Started table CBO_TEST.TEST_STATS. at 19.08.19 17:07:45,448000000 +02:00. Execution phase: SERIAL (NON-CONCURRENT) stattype: DATA Reporting Mode: FALSE
DBMS_STATS: Specified granularity = AUTO, Fixed granularity = GLOBAL AND PARTITION
DBMS_STATS: parameters ** pfix: (parent pfix: ) ownname: tabname: test_stats partname: estimate_percent: 101 block_sample: FALSE method_opt: for columns size 1 id for columns size 254 almost_null degree: 32766 granularity: Z gIdxGran: cascade: NULL stattab: statid: statown: no_invalidate: NULL flush_colu: TRUE fxt: stattype: DATA start_time: 08-19-2019 17:07:45 gathering_group_stats: FALSE cms_only: FALSE force: FALSE options: Z executionPhase: SERIAL (NON-CONCURRENT) incremental: FALSE concurrent: FALSE loc_degree: loc_method_opt: for columns size 1 id for columns size 254 almost_null fxt_typ: 0 tobjn: 72921
DBMS_STATS: Preferences for table CBO_TEST.TEST_STATS
DBMS_STATS: ================================================================================
DBMS_STATS: SKIP_TIME -
DBMS_STATS: STATS_RETENTION -
DBMS_STATS: MON_MODS_ALL_UPD_TIME -
DBMS_STATS: SNAPSHOT_UPD_TIME -
DBMS_STATS: TRACE - 0
DBMS_STATS: DEBUG - 0
DBMS_STATS: SYS_FLAGS - 0
DBMS_STATS: SPD_RETENTION_WEEKS - 53
DBMS_STATS: CASCADE - DBMS_STATS.AUTO_CASCADE
DBMS_STATS: ESTIMATE_PERCENT - DBMS_STATS.AUTO_SAMPLE_SIZE
DBMS_STATS: DEGREE - NULL
DBMS_STATS: METHOD_OPT - FOR ALL COLUMNS SIZE AUTO
DBMS_STATS: NO_INVALIDATE - DBMS_STATS.AUTO_INVALIDATE
DBMS_STATS: GRANULARITY - AUTO
DBMS_STATS: PUBLISH - TRUE
DBMS_STATS: STALE_PERCENT - 10
DBMS_STATS: APPROXIMATE_NDV - TRUE
DBMS_STATS: APPROXIMATE_NDV_ALGORITHM - REPEAT OR HYPERLOGLOG
DBMS_STATS: ANDV_ALGO_INTERNAL_OBSERVE - FALSE
DBMS_STATS: INCREMENTAL - FALSE
DBMS_STATS: INCREMENTAL_INTERNAL_CONTROL - TRUE
DBMS_STATS: AUTOSTATS_TARGET - AUTO
DBMS_STATS: CONCURRENT - OFF
DBMS_STATS: JOB_OVERHEAD_PERC - 1
DBMS_STATS: JOB_OVERHEAD - -1
DBMS_STATS: GLOBAL_TEMP_TABLE_STATS - SESSION
DBMS_STATS: ENABLE_TOP_FREQ_HISTOGRAMS - 3
DBMS_STATS: ENABLE_HYBRID_HISTOGRAMS - 3
DBMS_STATS: TABLE_CACHED_BLOCKS - 1
DBMS_STATS: INCREMENTAL_LEVEL - PARTITION
DBMS_STATS: INCREMENTAL_STALENESS - ALLOW_MIXED_FORMAT
DBMS_STATS: OPTIONS - GATHER
DBMS_STATS: GATHER_AUTO - AFTER_LOAD
DBMS_STATS: STAT_CATEGORY - OBJECT_STATS, REALTIME_STATS
DBMS_STATS: SCAN_RATE - 0
DBMS_STATS: GATHER_SCAN_RATE - HADOOP_ONLY
DBMS_STATS: PREFERENCE_OVERRIDES_PARAMETER - FALSE
DBMS_STATS: AUTO_STAT_EXTENSIONS - OFF
DBMS_STATS: WAIT_TIME_TO_UPDATE_STATS - 15
DBMS_STATS: ROOT_TRIGGER_PDB - FALSE
DBMS_STATS: COORDINATOR_TRIGGER_SHARD - FALSE
DBMS_STATS: MAINTAIN_STATISTICS_STATUS - FALSE
DBMS_STATS: AUTO_TASK_STATUS - OFF
DBMS_STATS: AUTO_TASK_MAX_RUN_TIME - 3600
DBMS_STATS: AUTO_TASK_INTERVAL - 900
DBMS_STATS: reporting_man_log_task: target: "CBO_TEST"."TEST_STATS" objn: 72921 auto_stats: FALSE status: IN PROGRESS ctx.batching_coeff: 0
DBMS_STATS: Start construct analyze using SQL .. Execution Phase: SERIAL (NON-CONCURRENT) granularity: GLOBAL AND PARTITION global_requested: NULL pfix:
DBMS_STATS: (Baseline)
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Start gather part -- ctx.conc_ctx.global_requested: NULL gran: GLOBAL AND PARTITION execution phase: 1
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Start gather_stats.. pfix: ownname: CBO_TEST tabname: TEST_STATS pname: spname: execution phase: 1
Specified DOP=1 blocks=1781 DOP used=1
Specified DOP=1 blocks=1781 DOP used=1
DBMS_STATS: Using approximate NDV pct=0
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y Y Y Y Y ID
DBMS_STATS: Y Y Y Y Y Y Y Y ALMOST_NULL
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Approximate NDV Options
DBMS_STATS: NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U
DBMS_STATS: Starting query at 19.08.19 17:07:45,448000000 +02:00
DBMS_STATS: select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */to_char(count("ID")),substrb(dump(min("ID"),16,0,64),1,240),substrb(dump(max("ID"),16,0,64),1,240),to_char(count("ALMOST_NULL")),substrb(dump(min("ALMOST_NULL"),16,0,64),1,240),substrb(dump(max("ALMOST_NULL"),16,0,64),1,240),count(rowidtochar(rowid)) from "CBO_TEST"."TEST_STATS" t /* NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U*/
DBMS_STATS: Ending query at 19.08.19 17:07:45,651000000 +02:00
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Approximate NDV Result
DBMS_STATS:

0
1000000
1000000
6
1001984
1000000
3979802


1
Typ=2 Len=2: c1,2


2
Typ=2 Len=2: c4,2


3
999997
0
11
999997
1899998
11

AAARzZAAPAAAAVaAFp,100000,AAARzZAAPAAAANAAAb,100000,AAARzZAAPAAAAKMAEf,100000,AAARzZAAPAAAAHYAIj,100000,AAARzZAAPAAAASnABA,100000,AAARzZAAPAAAAbBAGO,100000,AAARzZAAPAAAAYOABl,100000,AAARzZAAPAAAAElAEP,100000,AAARzZAAPAAAAPzAFE,100000,AAARzZAAPAAAACBAAD,99996,AAARzZAAPAAAAd1ACK,1,


4
Typ=2 Len=1: 80


5
Typ=2 Len=2: c1,b



DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min:
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min:
DBMS_STATS: max: Typ=2 Len=2: c1,b
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=1: 80
DBMS_STATS: max: Typ=2 Len=2: c1,b
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 1899998 999997 NULL NULL NULL 11 NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=1: 80
DBMS_STATS: max: Typ=2 Len=2: c1,b
DBMS_STATS: start processing top n values for column "ALMOST_NULL"
DBMS_STATS: Parsing topn values..
DBMS_STATS: Extracted 11 rowid-freq pairs.
DBMS_STATS: topn sql (len: 744):
DBMS_STATS: +++ select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */ substrb(dump("ALMOST_NULL",16,0,64),1,240) d_1773304655, rowidtochar(rowid) rwid from "CBO_TEST"."TEST_STATS" t where rowid in (chartorowid('AAARzZAAPAAAACBAAD'),chartorowid('AAARzZAAPAAAAElAEP'),chartorowid('AAARzZAAPAAAAHYAIj'),chartorowid('AAARzZAAPAAAAKMAEf'),chartorowid('AAARzZAAPAAAANAAAb'),chartorowid('AAARzZAAPAAAAPzAFE'),chartorowid('AAARzZAAPAAAASnABA'),chartorowid('AAARzZAAPAAAAVaAFp'),chartorowid('AAARzZAAPAAAAYOABl'),chartorowid('AAARzZAAPAAAAbBAGO'),chartorowid('AAARzZAAPAAAAd1ACK')) order by "ALMOST_NULL"
DBMS_STATS: start executing top n value query..
DBMS_STATS: Starting query at 19.08.19 17:07:45,667000000 +02:00
DBMS_STATS: select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */ substrb(dump("ALMOST_NULL",16,0,64),1,240) d_1773304655, rowidtochar(rowid) rwid from "CBO_TEST"."TEST_STATS" t where rowid in (chartorowid('AAARzZAAPAAAACBAAD'),chartorowid('AAARzZAAPAAAAElAEP'),chartorowid('AAARzZAAPAAAAHYAIj'),chartorowid('AAARzZAAPAAAAKMAEf'),chartorowid('AAARzZAAPAAAANAAAb'),chartorowid('AAARzZAAPAAAAPzAFE'),chartorowid('AAARzZAAPAAAASnABA'),chartorowid('AAARzZAAPAAAAVaAFp'),chartorowid('AAARzZAAPAAAAYOABl'),chartorowid('AAARzZAAPAAAAbBAGO'),chartorowid('AAARzZAAPAAAAd1ACK')) order by "ALMOST_NULL"
DBMS_STATS: removal_count: -243 total_nonnull_rows: 999997 mnb: 254
DBMS_STATS: hist_type in exec_get_topn: 1024 ndv:11 mnb:254
DBMS_STATS: Evaluating frequency histogram for col: "ALMOST_NULL"
DBMS_STATS: number of values = 11, max # of buckects = 254, pct = 100, ssize = 999997
DBMS_STATS: csr.hreq: 1 Histogram gathering flags: 1031
DBMS_STATS: done_hist in process_topn: TRUE csr.ccnt: 1
DBMS_STATS: Mark column "ALMOST_NULL" as top N computed
DBMS_STATS: no histogram: setting density to 1/ndv (,000001)
DBMS_STATS: Skip topn computed column "ALMOST_NULL" numhist: 0
DBMS_STATS: Start fill_cstats - hybrid_enabled: TRUE
DBMS_STATS: ====================================================================================================
DBMS_STATS: Statistics from clist for CBO_TEST.TEST_STATS:
DBMS_STATS: ====================================================================================================
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 4,979802 1000000 0 1000000 1000000 1000000 ,000001 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 2,899998 999997 3 999997 11 11 ,00000050 1286 1 1
DBMS_STATS: min: Typ=2 Len=1: 80
DBMS_STATS: max: Typ=2 Len=2: c1,b
DBMS_STATS: Histograms:
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: BVAL RPCNT EAVAL ENVAL EDVAL
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: 99996 80 0 Typ=2 Len=1: 80
DBMS_STATS: 199996 C102 1 Typ=2 Len=2: c1,2
DBMS_STATS: 299996 C103 2 Typ=2 Len=2: c1,3
DBMS_STATS: 399996 C104 3 Typ=2 Len=2: c1,4
DBMS_STATS: 499996 C105 4 Typ=2 Len=2: c1,5
DBMS_STATS: 599996 C106 5 Typ=2 Len=2: c1,6
DBMS_STATS: 699996 C107 6 Typ=2 Len=2: c1,7
DBMS_STATS: 799996 C108 7 Typ=2 Len=2: c1,8
DBMS_STATS: 899996 C109 8 Typ=2 Len=2: c1,9
DBMS_STATS: 999996 C10A 9 Typ=2 Len=2: c1,a
DBMS_STATS: 999997 C10B 10 Typ=2 Len=2: c1,b
DBMS_STATS: Need Actual Values (DSC_EAVS)
DBMS_STATS: Histogram Type: FREQUENCY Data Type: 2
DBMS_STATS: Histogram Flags: 4100 Histogram Gathering Flags: 1286
DBMS_STATS: Incremental: FALSE Fix Control 13583722: 1
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: Number of rows in the table = 1000000, blocks = , average row length = 8, chain count = , scan rate = 0, sample size = 1000000, cstats.count = 13, cind = 13
DBMS_STATS: prepare reporting structures...
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ID part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ALMOST_NULL part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: End of construct analyze using sql.
DBMS_STATS: Finished table CBO_TEST.TEST_STATS. at 19.08.19 17:07:45,683000000 +02:00
DBMS_STATS: reporting_man_update_task: objn: 72921 auto_stats: FALSE status: COMPLETED ctx.batching_coeff: 0

PL/SQL-Prozedur erfolgreich abgeschlossen.

SQL>
SQL> -- Testcase 4: Three distinct values in three rows and all other rows have NULL
SQL> -- This scenario extracts the values directly from the ROWIDs returned from the main query, no separate query
SQL> -- Applies to two distinct values, too
SQL> -- So a single distinct value looks like a special case that triggers a separate query
SQL> -- If this is with combination of almost all rows having NULLs this query doesn't use sampling
SQL> -- ! Big threat if the table is big !
SQL> drop table test_stats purge;

Tabelle wurde geloscht.

SQL>
SQL> create table test_stats
2 as
3 select /*+ NO_GATHER_OPTIMIZER_STATISTICS */
4 rownum as id,
5 case when rownum <= 2 then rownum else null end as almost_null
6 from
7 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) a,
8 (select /*+ cardinality(1000) */ null from dual connect by level <= 1000) b
9 ;

Tabelle wurde erstellt.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'test_stats', method_opt => 'for columns size 1 id for columns size 254 almost_null')
DBMS_STATS: Record gather table stats operation on table : test_stats
DBMS_STATS: job name:
DBMS_STATS: |--> Operation id: 226
DBMS_STATS: gather_table_stats:
DBMS_STATS: Start gather table stats -- tabname: test_stats
DBMS_STATS: job name:
DBMS_STATS: Started table CBO_TEST.TEST_STATS. at 19.08.19 17:07:46,276000000 +02:00. Execution phase: SERIAL (NON-CONCURRENT) stattype: DATA Reporting Mode: FALSE
DBMS_STATS: Specified granularity = AUTO, Fixed granularity = GLOBAL AND PARTITION
DBMS_STATS: parameters ** pfix: (parent pfix: ) ownname: tabname: test_stats partname: estimate_percent: 101 block_sample: FALSE method_opt: for columns size 1 id for columns size 254 almost_null degree: 32766 granularity: Z gIdxGran: cascade: NULL stattab: statid: statown: no_invalidate: NULL flush_colu: TRUE fxt: stattype: DATA start_time: 08-19-2019 17:07:46 gathering_group_stats: FALSE cms_only: FALSE force: FALSE options: Z executionPhase: SERIAL (NON-CONCURRENT) incremental: FALSE concurrent: FALSE loc_degree: loc_method_opt: for columns size 1 id for columns size 254 almost_null fxt_typ: 0 tobjn: 72922
DBMS_STATS: Preferences for table CBO_TEST.TEST_STATS
DBMS_STATS: ================================================================================
DBMS_STATS: SKIP_TIME -
DBMS_STATS: STATS_RETENTION -
DBMS_STATS: MON_MODS_ALL_UPD_TIME -
DBMS_STATS: SNAPSHOT_UPD_TIME -
DBMS_STATS: TRACE - 0
DBMS_STATS: DEBUG - 0
DBMS_STATS: SYS_FLAGS - 0
DBMS_STATS: SPD_RETENTION_WEEKS - 53
DBMS_STATS: CASCADE - DBMS_STATS.AUTO_CASCADE
DBMS_STATS: ESTIMATE_PERCENT - DBMS_STATS.AUTO_SAMPLE_SIZE
DBMS_STATS: DEGREE - NULL
DBMS_STATS: METHOD_OPT - FOR ALL COLUMNS SIZE AUTO
DBMS_STATS: NO_INVALIDATE - DBMS_STATS.AUTO_INVALIDATE
DBMS_STATS: GRANULARITY - AUTO
DBMS_STATS: PUBLISH - TRUE
DBMS_STATS: STALE_PERCENT - 10
DBMS_STATS: APPROXIMATE_NDV - TRUE
DBMS_STATS: APPROXIMATE_NDV_ALGORITHM - REPEAT OR HYPERLOGLOG
DBMS_STATS: ANDV_ALGO_INTERNAL_OBSERVE - FALSE
DBMS_STATS: INCREMENTAL - FALSE
DBMS_STATS: INCREMENTAL_INTERNAL_CONTROL - TRUE
DBMS_STATS: AUTOSTATS_TARGET - AUTO
DBMS_STATS: CONCURRENT - OFF
DBMS_STATS: JOB_OVERHEAD_PERC - 1
DBMS_STATS: JOB_OVERHEAD - -1
DBMS_STATS: GLOBAL_TEMP_TABLE_STATS - SESSION
DBMS_STATS: ENABLE_TOP_FREQ_HISTOGRAMS - 3
DBMS_STATS: ENABLE_HYBRID_HISTOGRAMS - 3
DBMS_STATS: TABLE_CACHED_BLOCKS - 1
DBMS_STATS: INCREMENTAL_LEVEL - PARTITION
DBMS_STATS: INCREMENTAL_STALENESS - ALLOW_MIXED_FORMAT
DBMS_STATS: OPTIONS - GATHER
DBMS_STATS: GATHER_AUTO - AFTER_LOAD
DBMS_STATS: STAT_CATEGORY - OBJECT_STATS, REALTIME_STATS
DBMS_STATS: SCAN_RATE - 0
DBMS_STATS: GATHER_SCAN_RATE - HADOOP_ONLY
DBMS_STATS: PREFERENCE_OVERRIDES_PARAMETER - FALSE
DBMS_STATS: AUTO_STAT_EXTENSIONS - OFF
DBMS_STATS: WAIT_TIME_TO_UPDATE_STATS - 15
DBMS_STATS: ROOT_TRIGGER_PDB - FALSE
DBMS_STATS: COORDINATOR_TRIGGER_SHARD - FALSE
DBMS_STATS: MAINTAIN_STATISTICS_STATUS - FALSE
DBMS_STATS: AUTO_TASK_STATUS - OFF
DBMS_STATS: AUTO_TASK_MAX_RUN_TIME - 3600
DBMS_STATS: AUTO_TASK_INTERVAL - 900
DBMS_STATS: reporting_man_log_task: target: "CBO_TEST"."TEST_STATS" objn: 72922 auto_stats: FALSE status: IN PROGRESS ctx.batching_coeff: 0
DBMS_STATS: Start construct analyze using SQL .. Execution Phase: SERIAL (NON-CONCURRENT) granularity: GLOBAL AND PARTITION global_requested: NULL pfix:
DBMS_STATS: (Baseline)
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Start gather part -- ctx.conc_ctx.global_requested: NULL gran: GLOBAL AND PARTITION execution phase: 1
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Start gather_stats.. pfix: ownname: CBO_TEST tabname: TEST_STATS pname: spname: execution phase: 1
Specified DOP=1 blocks=1520 DOP used=1
Specified DOP=1 blocks=1520 DOP used=1
DBMS_STATS: Using approximate NDV pct=0
DBMS_STATS: NNV NDV AVG MMX HST EP RP NNNP IND CNDV HSTN HSTR COLNAME
DBMS_STATS: Y Y Y Y Y ID
DBMS_STATS: Y Y Y Y Y Y Y Y ALMOST_NULL
DBMS_STATS: APPROX_NDV_ALGORITHM chosen: AS in non-incremental
DBMS_STATS: Approximate NDV Options
DBMS_STATS: NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U
DBMS_STATS: Starting query at 19.08.19 17:07:46,292000000 +02:00
DBMS_STATS: select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */to_char(count("ID")),substrb(dump(min("ID"),16,0,64),1,240),substrb(dump(max("ID"),16,0,64),1,240),to_char(count("ALMOST_NULL")),substrb(dump(min("ALMOST_NULL"),16,0,64),1,240),substrb(dump(max("ALMOST_NULL"),16,0,64),1,240),count(rowidtochar(rowid)) from "CBO_TEST"."TEST_STATS" t /* NDV,NIL,NIL,TOPN,NIL,NIL,RWID,U1,U254U*/
DBMS_STATS: Ending query at 19.08.19 17:07:46,448000000 +02:00
DBMS_STATS: STATISTIC LAST_QUERY SESSION_TOTAL
DBMS_STATS: ------------------------------ ---------- -------------
DBMS_STATS: Queries Parallelized 0 0
DBMS_STATS: DML Parallelized 0 0
DBMS_STATS: DDL Parallelized 0 0
DBMS_STATS: DFO Trees 0 0
DBMS_STATS: Server Threads 0 0
DBMS_STATS: Allocation Height 0 0
DBMS_STATS: Allocation Width 0 0
DBMS_STATS: Local Msgs Sent 0 0
DBMS_STATS: Distr Msgs Sent 0 0
DBMS_STATS: Local Msgs Recv'd 0 0
DBMS_STATS: Distr Msgs Recv'd 0 0
DBMS_STATS: DOP 0 0
DBMS_STATS: Slave Sets 0 0
DBMS_STATS: Approximate NDV Result
DBMS_STATS:

0
1000000
1000000
6
1001984
1000000
3979802


1
Typ=2 Len=2: c1,2


2
Typ=2 Len=2: c4,2


3
2
0
2
2
4
2
AAARzaAAPAAAACBAAB,1,AAARzaAAPAAAACBAAA,1,


4
Typ=2 Len=2: c1,2



5
Typ=2 Len=2: c1,3



DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min:
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min:
DBMS_STATS: max: Typ=2 Len=2: c1,3
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL NULL NULL NULL NULL NULL NULL NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,3
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max:
DBMS_STATS: process_raw_crec() for 1
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 3979802 1000000 NULL NULL NULL 1001984 NULL 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: process_raw_crec() for 2
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 4 2 NULL NULL NULL 2 NULL 7 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,3
DBMS_STATS: start processing top n values for column "ALMOST_NULL"
DBMS_STATS: Parsing topn values..
DBMS_STATS: Extracted 2 rowid-freq pairs.
DBMS_STATS: topn sql (len: 438):
DBMS_STATS: +++ select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */ substrb(dump("ALMOST_NULL",16,0,64),1,240) d_1773304655, rowidtochar(rowid) rwid from "CBO_TEST"."TEST_STATS" t where rowid in (chartorowid('AAARzaAAPAAAACBAAA'),chartorowid('AAARzaAAPAAAACBAAB')) order by "ALMOST_NULL"
DBMS_STATS: start executing top n value query..
DBMS_STATS: Starting query at 19.08.19 17:07:46,464000000 +02:00
DBMS_STATS: select /*+ no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */ substrb(dump("ALMOST_NULL",16,0,64),1,240) d_1773304655, rowidtochar(rowid) rwid from "CBO_TEST"."TEST_STATS" t where rowid in (chartorowid('AAARzaAAPAAAACBAAA'),chartorowid('AAARzaAAPAAAACBAAB')) order by "ALMOST_NULL"
DBMS_STATS: removal_count: -252 total_nonnull_rows: 2 mnb: 254
DBMS_STATS: hist_type in exec_get_topn: 1024 ndv:2 mnb:254
DBMS_STATS: Evaluating frequency histogram for col: "ALMOST_NULL"
DBMS_STATS: number of values = 2, max # of buckects = 254, pct = 100, ssize = 2
DBMS_STATS: csr.hreq: 1 Histogram gathering flags: 1031
DBMS_STATS: done_hist in process_topn: TRUE csr.ccnt: 1
DBMS_STATS: Mark column "ALMOST_NULL" as top N computed
DBMS_STATS: no histogram: setting density to 1/ndv (,000001)
DBMS_STATS: Skip topn computed column "ALMOST_NULL" numhist: 0
DBMS_STATS: Start fill_cstats - hybrid_enabled: TRUE
DBMS_STATS: ====================================================================================================
DBMS_STATS: Statistics from clist for CBO_TEST.TEST_STATS:
DBMS_STATS: ====================================================================================================
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ID 4,979802 1000000 0 1000000 1000000 1000000 ,000001 0 0 0
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c4,2
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: COL_NAME ACL SNNV NV SNNVDV NDV SNDV DENS HGATH HIND CCNT
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: ALMOST_NULL 1,000004 2 999998 2 2 2 ,25 1286 1 1
DBMS_STATS: min: Typ=2 Len=2: c1,2
DBMS_STATS: max: Typ=2 Len=2: c1,3
DBMS_STATS: Histograms:
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: BVAL RPCNT EAVAL ENVAL EDVAL
DBMS_STATS: ---------------------------------------------------------------------------------------------------
DBMS_STATS: 1 C102 1 Typ=2 Len=2: c1,2
DBMS_STATS: 2 C103 2 Typ=2 Len=2: c1,3
DBMS_STATS: Need Actual Values (DSC_EAVS)
DBMS_STATS: Histogram Type: FREQUENCY Data Type: 2
DBMS_STATS: Histogram Flags: 4100 Histogram Gathering Flags: 1286
DBMS_STATS: Incremental: FALSE Fix Control 13583722: 1
DBMS_STATS: ----------------------------------------------------------------------------------------------------
DBMS_STATS: Number of rows in the table = 1000000, blocks = , average row length = 6, chain count = , scan rate = 0, sample size = 1000000, cstats.count = 4, cind = 4
DBMS_STATS: prepare reporting structures...
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ID part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: gather_ext_col_stats 1: tbl_owner=CBO_TEST tbl_name=TEST_STATS col_name=ALMOST_NULL part_name= granularity=GLOBAL AND PARTITION
DBMS_STATS: End of construct analyze using sql.
DBMS_STATS: Finished table CBO_TEST.TEST_STATS. at 19.08.19 17:07:46,464000000 +02:00
DBMS_STATS: reporting_man_update_task: objn: 72922 auto_stats: FALSE status: COMPLETED ctx.batching_coeff: 0

PL/SQL-Prozedur erfolgreich abgeschlossen.

SQL>

So this test case shows that in case of a column having more than a single value the code happily extracts the required information using the ROWIDs collected during the main pass and doesn't require a separate pass (Test cases 3 and 4), but in the special case of a single valued column it bails out and runs a separate query typically using sampling, except for the edge case of a single valued column having NULLs in most of the rows (Test cases 1 and 2).

I've discussed this issue with Nigel Bayliss, the optimizer product manager at Oracle, and he agreed that it looks like a bug - and created a corresponding bug (Bug 30205756 - FREQUENCY HISTOGRAM NOT CREATED USING FULL ROW SAMPLE, which is probably not marked as public), just in case you come across this issue and want to provide this reference to Oracle Support.

The curious case of enq: TX – row lock contention

So we had a doubtful pleasure of migrating a few databases from Oracle Exadata X3-2 to Oracle Exadata X7-2 Cloud at Customer.
Why doubtful? Well, this a material for a whole different story with a lot of beer – let me just say, that CC gen 1 was a bit rough around the edges </p />
</p></div>

    	  	<div class=