Oakies Blog Aggregator

Ansible tips’n’tricks: executing related tasks together

I have recently written an ansible playbook to apply one-off patches to an Oracle Home. While doing this, I hit a little snag that needed ironing out. Before continuing this post, it’s worth pointing out that I’m on:

$ ansible --version
ansible 2.6.5

And it’s Ansible on Fedora.

Most likely the wrong way to do this …

So after a little bit of coding my initial attempt looked similar to this:

$ cat main.yml 
---
- hosts: 127.0.0.1
  connection: local
  vars:
    - patchIDs:
        - 123
        - 234
        - 456

  tasks:
  - name: say hello
    debug: msg="hello world"

  - name: unzip patch 
    debug: 
      msg: unzipping patch {{ item }}
    loop: "{{ patchIDs }}"
  
  - name: check patch for conflicts with Oracle Home
    debug: 
      msg: checking patch {{ item }} for conflict with $ORACLE_HOME
    loop: "{{ patchIDs }}"
  
  - name: apply patch
    debug: 
      msg: applying patch {{ item }} to $ORACLE_HOME
    loop: "{{ patchIDs }}"
    failed_when: "item == 456"

This is a stub of course … I have stripped any non-essential code from the playbook but it should give you a first idea.

Can you spot the bug?

This looks ok-ish, but there’s a (not quite so hidden) bug in there. And no, I didn’t have a failed_when condition in my playbook that would always evaluate to true with this input :) Consider this output:

$ ansible-playbook main.yml 

PLAY [127.0.0.1] ***************************************************************************

TASK [Gathering Facts] *********************************************************************
ok: [127.0.0.1]

TASK [say hello] ***************************************************************************
ok: [127.0.0.1] => {}

MSG:

hello world


TASK [unzip patch] *************************************************************************
ok: [127.0.0.1] => (item=123) => {}

MSG:

unzipping patch 123

ok: [127.0.0.1] => (item=234) => {}

MSG:

unzipping patch 234

ok: [127.0.0.1] => (item=456) => {}

MSG:

unzipping patch 456


TASK [check patch for conflicts with Oracle Home] ******************************************
ok: [127.0.0.1] => (item=123) => {}

MSG:

checking patch 123 for conflict with $ORACLE_HOME

ok: [127.0.0.1] => (item=234) => {}

MSG:

checking patch 234 for conflict with $ORACLE_HOME

ok: [127.0.0.1] => (item=456) => {}

MSG:

checking patch 456 for conflict with $ORACLE_HOME


TASK [apply patch] *************************************************************************
ok: [127.0.0.1] => (item=123) => {}

MSG:

applying patch 123 to $ORACLE_HOME

ok: [127.0.0.1] => (item=234) => {}

MSG:

applying patch 234 to $ORACLE_HOME

failed: [127.0.0.1] (item=456) => {}

MSG:

applying patch 456 to $ORACLE_HOME

fatal: [127.0.0.1]: FAILED! => {}

MSG:

All items completed


PLAY RECAP *********************************************************************************
127.0.0.1                  : ok=4    changed=0    unreachable=0    failed=1

Whoops, that went wrong! As you would predict, the last task failed.

The simulated conflict check is performed for all patches first; only then does the next task start applying them. And this is where the bug in the code can hit you. Imagine you are on a recent PSU, let’s say 180717. The playbook:

  1. Checks patch 123 for incompatibilities with PSU 180717
  2. Followed by patch# 234 …
  3. and eventually 456.

No issues are detected. The next step is to apply patch 123 on top of our fictional PSU 180717, followed by 234 on top of 180717 plus 123, and so on. When it comes to patch 456, a conflict is detected: opatch tells you that you can’t apply 456 on top of 180717 + 234 … I have simulated this with the failed_when clause. The bug in this case is my playbook failing to detect a conflict before actually trying to apply a patch.

I need a procedure!

So what now? In a shell script, I would have defined a function, maybe called apply_patch. Its tasks would include unzipping the patch, checking pre-requisites, and eventually calling opatch apply. In the body of the script I would have looped over all patches and called apply_patch() for each patch to be applied. In other words, unzip/check prerequisites/apply are always performed for a given patch before the code advances to the next patch in sequence.
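
A rough sketch of that shell approach, just to illustrate the structure (the stage directory, file names and exact opatch invocations are placeholders, not taken from my real script):

#!/bin/bash
# hypothetical stage directory and patch list
STAGE_DIR=/u01/stage
PATCH_IDS="123 234 456"

apply_patch() {
    local patch_id=$1
    # unzip, check and apply one patch before moving on to the next one
    unzip -q "${STAGE_DIR}/p${patch_id}.zip" -d "${STAGE_DIR}" || return 1
    "${ORACLE_HOME}/OPatch/opatch" prereq CheckConflictAgainstOHWithDetail \
        -ph "${STAGE_DIR}/${patch_id}" || return 1
    ( cd "${STAGE_DIR}/${patch_id}" && "${ORACLE_HOME}/OPatch/opatch" apply -silent ) || return 1
}

for patch_id in ${PATCH_IDS}; do
    apply_patch "${patch_id}" || { echo "patching stopped at ${patch_id}" >&2; exit 1; }
done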

But how can this be done in Ansible? A little bit of research (and a conversation with @fritshoogland, who confirmed what I thought needed changing) later, I noticed that you can include tasks and pass variables to them. So what if I rewrote my code to take advantage of that feature? Here’s the end result:

$ cat main.yml 
---
- hosts: 127.0.0.1
  connection: local
  vars:
    - patchIDs:
        - 123
        - 234
        - 456

  tasks:
  - name: say hello
    debug: msg="hello world"

  - name: include a task
    include_tasks: includedtask.yml
    loop: "{{ patchIDs }}"

Note that I’m using the loop syntax recommended for Ansible 2.5 and later (see the section “Migrating from with_X to loop” in the documentation).
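
For reference only, the pre-2.5 equivalent of that loop would have used with_items (not what I use here, just shown for comparison):

  - name: include a task
    include_tasks: includedtask.yml
    with_items: "{{ patchIDs }}"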

The include file references the variable passed as {{ item }}.

$ cat includedtask.yml 
---
- name: unzip patch {{ item }}
  debug: 
    msg: unzipping patch {{ item }}

- name: check patch {{ item }} for conflicts with Oracle Home
  debug: 
    msg: checking patch {{ item }} for conflict with $ORACLE_HOME

- name: apply patch {{ item }}
  debug: 
    msg: applying patch {{ item }} to $ORACLE_HOME
  failed_when: "item == 456"

Now if I run this playbook, the three tasks (unzip/check prerequisites/apply) are executed for each individual patch before the next patch is processed.

$ ansible-playbook main.yml

PLAY [127.0.0.1] ***************************************************************************

TASK [Gathering Facts] *********************************************************************
ok: [127.0.0.1]

TASK [say hello] ***************************************************************************
ok: [127.0.0.1] => {}

MSG:

hello world


TASK [include a task] **********************************************************************
included: /home/martin/ansible/blogpost/including_task/better/includedtask.yml for 127.0.0.1
included: /home/martin/ansible/blogpost/including_task/better/includedtask.yml for 127.0.0.1
included: /home/martin/ansible/blogpost/including_task/better/includedtask.yml for 127.0.0.1

TASK [unzip patch 123] *********************************************************************
ok: [127.0.0.1] => {}

MSG:

unzipping patch 123


TASK [check patch 123 for conflicts with Oracle Home] **************************************
ok: [127.0.0.1] => {}

MSG:

checking patch 123 for conflict with $ORACLE_HOME


TASK [apply patch 123] *********************************************************************
ok: [127.0.0.1] => {}

MSG:

applying patch 123 to $ORACLE_HOME


TASK [unzip patch 234] *********************************************************************
ok: [127.0.0.1] => {}

MSG:

unzipping patch 234


TASK [check patch 234 for conflicts with Oracle Home] **************************************
ok: [127.0.0.1] => {}

MSG:

checking patch 234 for conflict with $ORACLE_HOME


TASK [apply patch 234] *********************************************************************
ok: [127.0.0.1] => {}

MSG:

applying patch 234 to $ORACLE_HOME


TASK [unzip patch 456] *********************************************************************
ok: [127.0.0.1] => {}

MSG:

unzipping patch 456


TASK [check patch 456 for conflicts with Oracle Home] **************************************
ok: [127.0.0.1] => {}

MSG:

checking patch 456 for conflict with $ORACLE_HOME


TASK [apply patch 456] *********************************************************************
fatal: [127.0.0.1]: FAILED! => {}

MSG:

applying patch 456 to $ORACLE_HOME


PLAY RECAP *********************************************************************************
127.0.0.1                  : ok=13   changed=0    unreachable=0    failed=1 

This way, I should be able to stop the playbook as soon as the pre-requisite/conflict check detects a problem, before any attempt is made to apply the patch.
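
To turn the simulated check into a real one, the check task in includedtask.yml could look something like the sketch below. The stage path, the oracle_home variable and the way a conflict is reported (exit code versus a message in the output) are assumptions you would have to verify against your opatch version:

- name: check patch {{ item }} for conflicts with Oracle Home
  # hypothetical stage location; the patch was unzipped there in the previous task
  command: "{{ oracle_home }}/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -ph /u01/stage/{{ item }}"
  register: conflict_check
  changed_when: false
  # adjust this condition to however your opatch version signals a conflict
  failed_when: conflict_check.rc != 0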

opatch versions

This blogpost is about oracle’s patching tool for the database and grid infrastructure: opatch. I personally have a love/hate relationship with opatch. In essence, opatch automates a lot of things that would be very error prone if it were to be done by hand, which is a good thing. However, there are a lot of things surrounding opatch that I despise. An example of that is the version numbering of opatch itself.

Versions and more versions

To jump into the matter directly: versions. I don’t understand why this has to be this complicated. I would even go as far as saying that somebody needs to step in and clean this up.

All opatch versions are identified by patch number 6880880. Anyone working with patching oracle probably knows this patch number by heart. You can get to this patch very quickly in MOS by typing ‘patch 6880880’ into the search box at the right side of the screen, level with the tabs. A green rectangle will say ‘You have been directed to this PATCH based on an ID match’… So far the easy part.

On the right side below that, there is a rectangle, for which the first dropdown box is ‘release’. The release here means the release for which you want to download opatch. There are a great number of releases here:
– OPatch 12.2.0.1.0
– OPatch 13.9.0.0.0 – EMCC
– OPatch 18.0.0.0.0
– OPatch for FMW 12c (OUI 13.9.x) – Obsolete FMW(?)
– OUI NextGen 13.1 – FMW
– OUI NextGen 13.2 – FMW
– Oracle 10.1.0.0.0
– Oracle 10.1.0.3 – Obsolete
– Oracle 10.2.0.0.0
– Oracle 11.1.0.0.0
– Oracle 11.2.0.0.0
– Oracle 12.1.0.1.0
– Oracle 12.1.0.1.1 – Obsolete
– Oracle Database 12.2.0.1 – Obsolete

The entries marked ‘Obsolete’ are obsolete and not downloadable. IMO, they should not be there.
The EMCC and FMW entries (OPatch 13.9.0.0.0 and the OUI NextGen ones) are non-database opatch releases. I would beg oracle to publish non-database opatch releases under their own patch number.

This eliminates 7 of the 14 choices. The leftover entries seem to make sense: they show the database versions for which you can download a version of opatch. Let’s look at the patch descriptions of the leftover versions:

– OPatch 12.2.0.1.0    – Patch 6880880: OPatch 12.2.0.1.14 for DB 12.2 releases (JUL 2018)
– OPatch 18.0.0.0.0    – Patch 6880880: OPatch 12.2.0.1.14 for DB 18.x releases (JUL 2018)
– Oracle 10.1.0.0.0     – Patch 6880880: OPatch 9i, 10.1
– Oracle 10.2.0.0.0     – Patch 6880880: OPatch 10.2
– Oracle 11.1.0.0.0     – Patch 6880880: OPatch patch of version 11.1.0.12.9 for Oracle software releases 11.1.0.x (OCT 2015)
– Oracle 11.2.0.0.0     – Patch 6880880: OPatch patch of version 11.2.0.3.19 for Oracle software releases 11.2.0.x and 18.x (APR 2018)
– Oracle 12.1.0.1.0     – Patch 6880880: OPatch 12.2.0.1.14 for DB 12.x and 18.x releases (JUL 2018)

If we follow the releases from the lowest to the highest version:
– 10.1.0.0.0     -> ‘OPatch 9i, 10.1’ This is opatch for database versions 9i and 10.1.
– 10.2.0.0.0     -> ‘OPatch 10.2’ This is opatch for database version 10.2.
– 11.1.0.0.0     -> ‘OPatch patch of version 11.1.0.12.9 for Oracle software releases 11.1.0.x (OCT 2015)’ This is opatch for database version 11.1.0.x.
– 11.2.0.0.0     -> ‘OPatch patch of version 11.2.0.3.19 for Oracle software releases 11.2.0.x and 18.x (APR 2018)’ This is opatch for database version 11.2.0.x. I have no idea what ‘and 18.x’ means.
– 12.1.0.1.0     -> ‘OPatch 12.2.0.1.14 for DB 12.x and 18.x releases (JUL 2018)’ This is opatch for database version 12.1.x. NOT 12.x as the patch text indicates, it’s 12.1.x, and again I have no idea what ‘and 18.x’ means. I also don’t understand why the 4th number is getting used all of a sudden.
– 12.2.0.1.0    -> ‘OPatch 12.2.0.1.14 for DB 12.2 releases (JUL 2018)’ This is opatch for database version 12.2. The description now aptly describes the database version and does not confuse.
– 18.0.0.0.0    -> ‘OPatch 12.2.0.1.14 for DB 18.x releases (JUL 2018)’ This is opatch for database version 18. The description again now aptly describes the version.

We are not there yet. Now that the patch release that should be used for every version of the database has been identified, let me throw in another version. This is the actual version that opatch reports when it is queried with ‘opatch version’. Let me list the opatch versions:
– 11.1.0.0.0 -> opatch version 11.1.0.12.9
– 11.2.0.0.0 -> opatch version 11.2.0.3.19
– 12.1.0.1.0 -> opatch version 12.2.0.1.14
– 12.2.0.1.0 -> opatch version 12.2.0.1.14
– 18.0.0.0.0 -> opatch version 12.2.0.1.14
Yes, this is again some weirdness. The opatch version used to follow the database version, but -apparently- from database version 12.1 up to version 18 there is now a single opatch version: 12.2.

That begs the question whether the 12.1.0.1.0, 12.2.0.1.0 and 18.0.0.0.0 opatch releases are actually the same opatch version. I downloaded all three, with the opatch version included in the file name:

$ md5sum p6880880*
1ee44d25f5e858eb67424b69b89b8a25  p6880880_121010_Linux-x86-64-12.2.0.1.14.zip
1ee44d25f5e858eb67424b69b89b8a25  p6880880_122010_Linux-x86-64-12.2.0.1.14.zip
1ee44d25f5e858eb67424b69b89b8a25  p6880880_180000_Linux-x86-64-12.2.0.1.14.zip

Okay… so these are actually 100% identical copies that just have different names. I have no idea why the exact same file is offered under different names. It does explain the ‘and 18.x’ addition in the patch descriptions: that opatch version can also be used with version 18 of the database.

To be honest, the original intention of this blog article was to describe some specific usage of opatch, but I already have enough content for a post, so I will keep the originally intended content for a next blogpost.

The question you might be having at this point is: which version should I download now? For databases up to version 11.2 it is simple: there is only one choice for each version. For database versions from 12.1 onwards you -currently- have a choice. I would advise simply downloading the “correct” opatch version for every database version, which means the exact oracle version as indicated in the release pulldown menu on the download page (‘Oracle 12.1.0.1.0’, ‘OPatch 12.2.0.1.0’ or ‘OPatch 18.0.0.0.0’), despite the fact that these are *currently* the same opatch versions.

I suspect that once 12.1 goes into extended support, the opatch version will freeze, whilst the opatch version for version 12.2 and 18 will be improved and increase in version. So the only way to download the correct opatch version is still by choosing the actual database version it is intended for.

This brings me to another important thing to realise: the actual opatch version that is downloaded. At the time of writing the current and only available opatch release for Oracle database versions 12.1 to 18 is 12.2.0.1.14. Once oracle brings out a newer opatch version, the previous version will not be available anywhere anymore (as far as I know). To me this means that if you patch databases per tier (from test, development and so on up to production), you have to stage opatch in order to be able to use the same opatch version consistently in the future. Of course Oracle advises to use the latest opatch version, but the patch will check for a minimal opatch version, and if you tested your database software version, opatch version and patch to be working correctly together, in my opinion the most important thing is consistency.
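
Staging a pinned opatch version and rolling it out to a home is straightforward, since the zip file simply contains an OPatch directory that replaces $ORACLE_HOME/OPatch. Something along these lines (the paths are just examples):

$ cd $ORACLE_HOME
$ mv OPatch OPatch.$(date +%Y%m%d)
$ unzip -q /u01/stage/p6880880_122010_Linux-x86-64-12.2.0.1.14.zip
$ OPatch/opatch version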

Oracle LIKE predicate and cardinality estimations

A new blog post on the Databases at CERN blog about the optimizer estimations with a LIKE predicate: https://db-blog.web.cern.ch/blog/franck-pachot/2018-11-oracle-predicate-and-cardinality-estimations.

Or why the cardinality estimation here is 1346 objects like ‘DBA_OBJECTS’ rather than 2:

SQL> select count(*) from OBJ$ where name LIKE 'DBA_OBJECTS';
COUNT(*)
----------
2
Execution Plan
--------------------------------------------------------
Plan hash value: 2699154286
--------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19 |
| 1 | SORT AGGREGATE | | 1 | 19 |
|* 2 | INDEX FAST FULL SCAN| I_OBJ2 | 1346 | 25574 |
--------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("NAME" LIKE 'DBA_OBJECTS')

In summary: because ‘_’ is a wildcard, LIKE ‘DBA_OBJECTS’ is treated by the optimizer as LIKE ‘DBA%’ for cardinality estimation.
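
If you really mean a literal underscore, one workaround is to escape it, so that the optimizer no longer has to treat ‘_’ as a single-character wildcard when estimating cardinality (a quick illustration, not part of the original post):

SQL> select count(*) from OBJ$ where name like 'DBA\_OBJECTS' escape '\';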

Column And Table Redefinition With Minimal Locking

TLDR: Note to future self… (1) Read this before you modify a table on a live PostgreSQL database. If you do it wrong then your app might totally hang. There is a right way to do it which avoids that. (2) Especially remember the lock_timeout step. Many blog posts around the ‘net are missing this and it’s very important.

Recently I was chatting with some PostgreSQL users (who, BTW, were doing rather large-scale cool stuff in PG) and they asked a question about making schema changes with minimal impact to the running application. They were specifically curious about changing a primary key from INT to BIGINT.  (Oh, you are making all your new PK fields BIGINT right?)

And then, lo and behold, I discovered a chat today on the very same topic. Seemed useful enough to file away on my blog so that I can find it later. BTW I got permission from Jim Nasby, Jim F and Robins Tharakan to blame them for this…  ;)

The most useful part of the chat was how to think about doing table definition changes in PostgreSQL with minimal application impact due to locking:

  1. Use lock_timeout.
    1. Can be set at the session level.
  2. For changes that do more than just a quick metadata update, work with copies.
    1. Create a new column & drop old column instead of modifying.
    2. Or create a new table & drop old table.
    3. Use triggers to keep data in sync.
    4. Carefully leverage transactional DDL (PostgreSQL rocks here!) to make changes with no windows for missing data.

We can follow this line of thought even for a primary key – creating a unique index on the new column, using the existing index to update the table constraints, then dropping the old column.

One of the important points here is making sure that operations which require locks are metadata-only. That is, they don’t need to actually modify any data (while holding said lock), for example by rewriting or scanning the table. We want these ops to run very, very fast, and even time out if they still can’t run fast enough.
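
To make the idea concrete, here is a rough sketch of an INT to BIGINT primary key change along those lines. Table and column names are invented, and some steps (notably the NOT NULL validation behind the new primary key) still scan the table, so test the whole sequence on a copy of your data first:

-- give up quickly instead of queueing behind (and blocking) application traffic
SET lock_timeout = '2s';

-- 1. new column: nullable and without a default, so this is a metadata-only change
ALTER TABLE accounts ADD COLUMN id_new bigint;

-- 2. keep it in sync for new and updated rows
CREATE FUNCTION accounts_copy_id() RETURNS trigger AS $$
BEGIN
  NEW.id_new := NEW.id;
  RETURN NEW;
END $$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_copy_id
  BEFORE INSERT OR UPDATE ON accounts
  FOR EACH ROW EXECUTE PROCEDURE accounts_copy_id();

-- 3. backfill existing rows in small batches (repeat until zero rows are updated)
UPDATE accounts SET id_new = id
 WHERE id IN (SELECT id FROM accounts WHERE id_new IS NULL LIMIT 10000);

-- 4. build the replacement index without blocking writes
CREATE UNIQUE INDEX CONCURRENTLY accounts_id_new_key ON accounts (id_new);

-- 5. the swap: short statements inside one transaction
BEGIN;
ALTER TABLE accounts DROP CONSTRAINT accounts_pkey;
ALTER TABLE accounts ADD CONSTRAINT accounts_pkey
  PRIMARY KEY USING INDEX accounts_id_new_key;   -- forces NOT NULL, which validates by scanning
DROP TRIGGER accounts_copy_id ON accounts;
DROP FUNCTION accounts_copy_id();
ALTER TABLE accounts DROP COLUMN id;
ALTER TABLE accounts RENAME COLUMN id_new TO id;
COMMIT;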

A few minutes on google yields proof that Jim Nasby was right: lots of people have already written up some really good advice about this topic.  Note that (as always) you should be careful about dates and versions in stuff you find yourself.  Anything pre-2014 should be scrutinized very carefully (PostgreSQL has changed a lot since then); and for the record, PostgreSQL 11 changes this specific list again (and none of these articles seem to be updated for pg11 yet). And should go without saying, but test test test…

  • This article from BrainTree is my favorite of what I saw this morning. Concise yet clear list of green-light and red-light scenarios, with workaround for all the red lights.
    • Add a new column, Drop a column, Add an index concurrently, Drop a constraint (for example, non-nullable), Add a default value to an existing column, Add an index, Change the type of a column, Add a column with a default, Add a column that is non-nullable, Add a column with a unique constraint, VACUUM FULL
  • Citus has a practical tips article that’s linked pretty widely.
    • adding a column with a default value, using lock timeouts, Create indexes, Taking aggressive locks, Adding a primary key, VACUUM FULL, ordering commands
  • Alexey Vasiliev assembled a list in 2016 which is worth reviewing.
    • Add a new column, Add a column with a default, Add a column that is non-nullable, Drop a column, Change the type of a column, Add a default value to an existing column, Add an index, Add a column with a unique constraint, Drop a constraint, VACUUM FULL, ALTER TABLE SET TABLESPACE
  • Joshua Kehn put together a good article in late 2017 that especially illustrates the importance of using lock_timeout (though he doesn’t mention it in the article)
    • Default values for new columns, Adding a default value on an existing column, Concurrent index creation, ALTER TABLE, importance of typical transaction length

For fun and posterity, here’s the original chat (which has a little more detail) where they gave me these silly ideas:

[11/08/18 09:01] Colleague1: I have a question with regard to APG. How can we make DDL modifications to a table with minimalistic locking (downtime)?
[11/08/18 09:31] Jim N: It depends on the modification you're trying to make. Many forms of ALTER TABLE are very fast. Some don't even require an exclusive lock.
[11/08/18 09:32] Jim N: What you have to be careful of are alters that will force a rewrite of the entire table. Common examples of that are adding a new column that has a default value, or altering the type of an existing column.
[11/08/18 09:33] Jim N: What I've done in the past for those scenarios is to create a new field (that's null), put a before insert or update trigger on the table to maintain that field.
[11/08/18 09:33] Jim N: Then run a "backfill" that processes a few hundred / thousand rows per transaction, with a delay between each batch.
[11/08/18 09:34] Jim N: Once I know that all rows in the table have been properly updated, drop the old row, and maybe mark the new row as NOT NULL.
[11/08/18 09:43] Jim N: btw, I know there's been a talk about this at a conference in the last year or two...
[11/08/18 09:49] Jim F: What happens at the page level if the default value of an ALTER TABLE ADD COLUMN is null? Once upon a time when I worked at [a commercialized fork of PostgreSQL], which was built on a version of PostgreSQL circa 2000, I recall that the table would be versioned. This was a pure metadata change, but the added columns would be created for older-version rows on read, and probably updated on write. Is that how it currently works?
[11/08/18 09:55] Jim N: Jim F in essence, yes.
[11/08/18 09:56] Jim N: Though I wouldn't describe it as being "versioned"
[11/08/18 09:57] Jim N: But because columns are always added to the end of the tuple (and we never delete from pg_attribute), heap_deform_tuple can detect if a tuple is "missing" columns at the end of the tuple and just treat them as being null.
[11/08/18 09:57] Jim N: At least I'm pretty sure that's what's going on, without actually re-reading the code right now.

Where / Having

There’s a very old mantra about the use of the “having” clause that tells us that if it’s valid (i.e. will always give the same results) then any predicate that could be moved from the having clause to the where clause should be moved. In recent versions of Oracle the optimizer will do this for itself in some cases but (for reasons that I’m not going to mention) I came across a silly example recently where a little manual editing produced a massive performance improvement.

Here’s a quick demo:


rem
rem     Script:         where_having.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2018
rem     Purpose:
rem
rem     Last tested
rem             18.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1
as
select * 
from all_objects 
where rownum <= 50000   -- > comment to avoid WordPress format issue
;

spool where_having.lst

set serveroutput off

select /*+ gather_plan_statistics */ 
        object_type, count(*) 
from    t1 
group by 
        object_type 
having  count(*) > 0 
and     1 = 2
;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
;

The big question is: will Oracle do a full tablescan of t1, or will it apply a “null is not null” filter early to bypass that part of the plan. Here’s the plan pulled from memory, with run-time statistics (all versions from 11g to 18c):


--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |      0 |00:00:00.02 |     957 |    955 |       |       |          |
|*  1 |  FILTER             |      |      1 |        |      0 |00:00:00.02 |     957 |    955 |       |       |          |
|   2 |   HASH GROUP BY     |      |      1 |      1 |     27 |00:00:00.02 |     957 |    955 |  1186K|  1186K| 1397K (0)|
|   3 |    TABLE ACCESS FULL| T1   |      1 |  50000 |  50000 |00:00:00.01 |     957 |    955 |       |       |          |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter((COUNT(*)>0 AND 1=2))


As you can see, the filter at operation 1 includes the contradiction “1=2”, but Oracle tests this only after doing the full tablescan and aggregation. If you move the “1=2” into the where clause the tablescan doesn’t happen.
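
For completeness, the manually rewritten statement is simply:

select /*+ gather_plan_statistics */
        object_type, count(*)
from    t1
where   1 = 2
group by
        object_type
having  count(*) > 0
;

and with the contradiction in the where clause the plan includes a “NULL IS NOT NULL” filter, with the tablescan reporting Starts = 0 – the same effect you see in the in-line view example below.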

Interestingly, if you write the query with an in-line view and trailing where clause:


select /*+ gather_plan_statistics */
        *
from    (
        select
                object_type, count(*)
        from    t1
        group by
                object_type
        having  count(*) > 0
        )
where
        1 = 2
;

The optimizer is clever enough to push the final predicate inside the view (where you might expect it to become part of the having clause) and push it all the way down into a where clause on the base table.


-----------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      0 |00:00:00.01 |
|*  1 |  FILTER              |      |      1 |        |      0 |00:00:00.01 |
|   2 |   HASH GROUP BY      |      |      1 |      1 |      0 |00:00:00.01 |
|*  3 |    FILTER            |      |      1 |        |      0 |00:00:00.01 |
|   4 |     TABLE ACCESS FULL| T1   |      0 |  50000 |      0 |00:00:00.01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(COUNT(*)>0)
   3 - filter(NULL IS NOT NULL)



A quirky case of the optimizer handling the (apparently) more complex query better than it handles the simpler one.

Add ORDER BY to make ANY query faster

Yes it’s SCBT day here in Perth!

SCBT = Silly Click Bait Title :)

This post is just a cautionary tale that it is easy to get caught up judging SQL performance solely on a few metrics rather than taking a more common sense approach of assessing performance based on the true requirements of the relevant component of the application.  I say “true requirements” because it may vary depending on what is important to the application for a particular component.

For the majority of the time, response time is most probably the thing you will care about most. Certainly for customer-facing components of an application, response time is likely to be the dominant factor in defining application “success”. But it does not necessarily always need to be response time. You might have a SQL statement that gets executed millions (or billions) of times, and hence perhaps parse time CPU or overall latching activity might be a critical factor.

My point is this: Not all SQL statements can be simply judged on a rule of “less consistent gets = a better SQL”. Here’s an example that came up on AskTOM recently, hence the click bait title. I’ll start with 100 copies of DBA_OBJECTS to be my source table.


SQL> create table t pctfree 0
  2  as select owner, object_id, subobject_name
  3  from dba_objects,
  4       ( select 1 from dual connect by level <= 100 );

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats('','t');

PL/SQL procedure successfully completed.

We’ll connect to SQL Plus and just run a simple SELECT to retrieve all of the rows.


SQL> set autotrace traceonly statistics
SQL> select * from t;

8350600 rows selected.


Statistics
------------------------------------------------------
         13  recursive calls
         17  db block gets
     849818  consistent gets
          1  physical reads
       3024  redo size
  294152578  bytes sent via SQL*Net to client
    9186257  bytes received via SQL*Net from client
     835061  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    8350600  rows processed

Wow… that’s a lot of consistent gets. Here’s my simple solution to reduce that: we’ll just add an ORDER BY clause!


SQL> select * from t order by object_id;

8350600 rows selected.


Statistics
-------------------------------------------------------
        170  recursive calls
          4  db block gets
      16469  consistent gets
      21597  physical reads
          0  redo size
  201757283  bytes sent via SQL*Net to client
    9186257  bytes received via SQL*Net from client
     835061  SQL*Net roundtrips to/from client
          0  sorts (memory)
          1  sorts (disk)
    8350600  rows processed

Woo hoo! Look how much I’ve dropped the consistent gets. Time for me to scream from the rooftops “I am a SQL tuning genius!” :)

In reality, I don’t need an ORDER BY to improve the “consistent gets” number from the SELECT statement.  One of the governing factors for how many consistent gets we will need to do is the fetch size, and my original execution was performed with a fetch size (arraysize in SQL Plus) of just 10. Let’s bump that up and re-run the query:


SQL> set arraysize 1000
SQL> select * from t;

8350600 rows selected.


Statistics
------------------------------------------------------
          0  recursive calls
          0  db block gets
      24805  consistent gets
          0  physical reads
          0  redo size
  133771032  bytes sent via SQL*Net to client
      92458  bytes received via SQL*Net from client
       8352  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    8350600  rows processed
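
Those ~24,800 gets also line up with a rough rule of thumb (approximate figures, ignoring recursive SQL): for a serial full scan each fetch call revisits the block it stopped on, so consistent gets come out at roughly the number of table blocks plus the number of fetch calls:

arraysize 1000 :  ~16,500 blocks +   8,352 fetches  ≈  25,000 consistent gets
arraysize   10 :  ~16,500 blocks + 835,061 fetches  ≈ 850,000 consistent gets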

Voila! An easy “fix” to consistent gets.  In reality, don’t forget that consistent gets is just one piece of the performance puzzle. If I throw in some response time metrics for the above queries, then (as an application user) I know which query I would prefer to have running!


SQL> set autotrace off
SQL> set timing on
SQL> set feedback only
SQL> select * from t;

8350600 rows selected.

Elapsed: 00:00:02.08
SQL> select * from t order by object_id;

8350600 rows selected.

Elapsed: 00:00:09.70
SQL>

So, click bait titles aside, make sure you don’t get too carried away just focussing on particular metrics (eg consistent gets) when assessing and tuning the performance of your SQL statements.

Interval versus Range partitions

One of the nice things about partitioning in the database is that partition pruning can quickly eliminate the requirement to read large amounts of data when the partitioning key(s) are appropriately provided in your queries (and DML). This also extends to queries where the values provided do not map to any partitions.

Here’s a simple example of a range partitioned table which only covers the values from (less than) 1 to a ceiling of 6.



SQL> create table t
  2  partition by range(x)
  3  ( partition p0 values less than (1),
  4    partition p1 values less than (2),
  5    partition p2 values less than (3),
  6    partition p3 values less than (4),
  7    partition p4 values less than (5),
  8    partition p5 values less than (6)
  9  )
 10  as
 11  select rownum x from dual connect by level <= 5;

Table created.

Let’s see what happens when I run a query for a value that by definition cannot possibly exist in the table.



SQL> select * from t where x = 1235;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID  auv890vfz95vu, child number 0
-------------------------------------
select * from t where x = 1235

Plan hash value: 4294331813

----------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |     2 (100)|          |       |       |
|   1 |  PARTITION RANGE EMPTY|      |     1 |     3 |     2   (0)| 00:00:01 |INVALID|INVALID|
|*  2 |   TABLE ACCESS FULL   | T    |     1 |     3 |     2   (0)| 00:00:01 |INVALID|INVALID|
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=1235)

That’s pretty cool. You can see the terms “EMPTY” and “INVALID” in the plan. The database knows from the dictionary definitions of the partitions that there is no possible data for X=1235 and hence the table partitions do not need to be accessed at all.

I’ll now repeat the example, but using an INTERVAL partitioned table. I only need to define the first partition, but since the interval size is 1, it will create partitions identical to the range example above.



SQL> create table t
  2  partition by range(x) interval(1)
  3  ( partition p0 values less than (1) )
  4  as
  5  select rownum x from dual connect by level <= 5;

Table created.

SQL>
SQL> select * from t where x = 1235;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------
SQL_ID  auv890vfz95vu, child number 0
-------------------------------------
select * from t where x = 1235

Plan hash value: 2931986080

-----------------------------------------------------------------------------------------------
| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |      |       |       |     6 (100)|          |       |       |
|   1 |  PARTITION RANGE SINGLE|      |     1 |     3 |     6   (0)| 00:00:01 |  1236 |  1236 |
|*  2 |   TABLE ACCESS FULL    | T    |     1 |     3 |     6   (0)| 00:00:01 |  1236 |  1236 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=1235)

Notice that the results here are slightly different. Even though no partition exists for X=1235, for an interval partitioned table the optimizer suggests that it will scan the (fictitious) partition 1236. Is this a bug?

To explain this, I like to think of interval partitioning as an extension of the deferred segment implementation we already have in the database. The database can defer creating a physical segment when a table (or table partition) is created; the segment only needs to come into existence when the first row is inserted into that table. The way I interpret interval partitioning is that it takes deferring segments to another level, namely deferring the dictionary metadata as well until it is needed. You can think of interval partition table DDL as creating all of the partitions in the table immediately (ie, they all exist), but the dictionary metadata to support them is deferred until required. Hence partition 1236 in the example does indeed exist – we just have not created the dictionary entries for it, and we won’t do so until the first row is created.
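
You can actually watch that metadata appear by checking the dictionary before and after inserting a row into a previously untouched interval (just a sketch – the system generated partition names will differ on your system):

SQL> select partition_name, high_value from user_tab_partitions where table_name = 'T';
-- only P0 and the intervals that already contain rows are listed

SQL> insert into t values (1235);
SQL> commit;

SQL> select partition_name, high_value from user_tab_partitions where table_name = 'T';
-- a new system named partition (SYS_Pnnnn) now appears, covering the interval for 1235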

Based on this concept, you can still see EMPTY/INVALID partitions for a query on an interval partitioned table if you nominate a value that would yield a non-existent partition. You may be thinking “How can we have a non-existent partition for an interval partitioned table?”. Don’t forget that there is an upper limit on the number of partitions you can have for any partitioned table, so all I need to do is nominate a value for X that would yield a partition above that upper bound.



SQL> select * from t where x = 1235123123;

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
SQL_ID  dfthyzy792fah, child number 0
-------------------------------------
select * from t where x = 1235123123

Plan hash value: 4294331813

----------------------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |       |       |     2 (100)|          |       |       |
|   1 |  PARTITION RANGE EMPTY|      |     1 |     3 |     2   (0)| 00:00:01 |INVALID|INVALID|
|*  2 |   TABLE ACCESS FULL   | T    |     1 |     3 |     2   (0)| 00:00:01 |INVALID|INVALID|
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=1235123123)

FIRST_ROWS_10 CBO Is Hopeless, It’s Using The Wrong Index !! (Weeping Wall)

There’s an organisation I had been dealing with on and off over the years who were having all sorts of issues with their Siebel System and who were totally convinced their performance issues were due directly to being forced to use the FIRST_ROWS_10 optimizer. I’ve attempted on a number of occasions to explain that their […]