Oakies Blog Aggregator

Bait and Switch

Just what you need after a long hike in the Lake District:
[Image: https://jonathanlewis.files.wordpress.com/2018/11/baitnswitch1.jpg]

You need 10 minutes! Not 5, not 15, not 7

What follows is just my opinion… Well duh, it’s a blog, of course it’s just my opinion, but I’ll try to back up my rantings and pontificating with some reasoned arguments. Whether, at the end of it, you still call my claims total BS is of course entirely within your rights and your opinion.

Terraforming the Oracle Cloud: choosing and using an image family

A few times now I have presented about “cloud deployments done the cloud way”, sharing lessons learned in the changing world I find myself in. It’s a lot of fun, and so far I have been far too busy to blog about things I learned by trial and error. Working with Terraform turned out to be a very good source for blog posts, so I’ll put a few of these up in the hope of saving you a few minutes.

This blog post is all about creating Ubuntu images in Oracle Cloud Infrastructure (OCI) using Terraform. The technique is equally applicable to other Linux image types though. In case you find this post later using a search engine, here is some version information that might put everything into context:

$ ./terraform version
Terraform v0.11.10
+ provider.null v1.0.0
+ provider.oci v3.7.0

I used the “null” provider to run a few post-installation commands, as shown in various Terraform examples for OCI. Right now I’m trying to work out whether I can do the same in a different way. If I am successful you can expect a blog post to follow…
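For context, here is a minimal sketch of that pattern; the resource name, the variable holding the key path and the package commands are my own placeholders, not values from any official example:

# Minimal sketch (hypothetical names and commands): use the null provider
# to run post-installation steps over SSH once the instance exists.
resource "null_resource" "post_install" {
  # re-run the provisioner if the instance is re-created
  triggers = {
    instance_id = "${oci_core_instance.docker_tf_instance.id}"
  }

  provisioner "remote-exec" {
    connection {
      type        = "ssh"
      host        = "${oci_core_instance.docker_tf_instance.public_ip}"
      user        = "ubuntu"
      private_key = "${file(var.ssh_private_key_path)}"   # assumed variable
    }

    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y docker.io",
    ]
  }
}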

Creating a Ubuntu 18.04 LTS image in OCI

To create the Ubuntu image (or any other image for that matter), I need information about the image family. Documentation about image families in OCI can be found at https://docs.cloud.oracle.com/iaas/images/.

Scrolling down and selecting the entry from the left-hand side, I found the link to the Ubuntu 18.04 LTS image family. Each supported image has its own documentation link, containing crucial data: an OCID per location. At the time of writing, the latest Ubuntu image was named Canonical-Ubuntu-18.04-2018.11.17-0 and had an image OCID of ocid1.image.oc1.eu-frankfurt-1.aaaa...i57q7bfsa. OCID is short for Oracle Cloud Identifier, and OCIDs are used in many places in OCI. The image has a different OCID in each region; the (shortened) OCID I just showed you is for Frankfurt.
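As an aside, and not how I proceeded here: the OCI provider also offers an oci_core_images data source that can look the OCID up at plan time, avoiding the hard-coded value. A sketch, assuming the compartment OCID is held in a variable:

# Hypothetical alternative: query the image OCID instead of copying it
# from the documentation page.
data "oci_core_images" "ubuntu_1804" {
  compartment_id           = "${var.compartment_ocid}"
  operating_system         = "Canonical Ubuntu"
  operating_system_version = "18.04"
}

# If I read the documentation correctly, the list is returned newest first,
# so the latest image would be:
# "${data.oci_core_images.ubuntu_1804.images.0.id}"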

With the OCID at hand, I can open my favourite code editor and start putting the terraform script together. I create instances in OCI using the oci_core_instance type, documented at the terraform website.

Be careful: many of the references and code examples I found about oci_core_instance were written for older versions of the Terraform provider, and some attributes used in those examples are deprecated. It is worth comparing any source code examples against the current documentation.

Part of the definition of an oci_core_instance requires the specification of the operating system in the source_details {} section. To create the Ubuntu VM in the Frankfurt region, I have to specify – amongst other things of course – this:

resource "oci_core_instance" "docker_tf_instance" {
...
    source_details {
        source_type = "image"
        source_id   = "ocid1.image.oc1.eu-frankfurt-1.aaaa..."
...
    }
...

The actual OCID is far longer; the example above is shortened for the sake of readability, because I didn’t like it wrapping around the text box and destroying my layout. Make sure you use the correct OCID ;)
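For completeness, here is a sketch of what the full resource can look like. Treat it as exactly that, a sketch: every OCID, the availability domain, shape and key path below are placeholders rather than values from my actual configuration:

resource "oci_core_instance" "docker_tf_instance" {
  availability_domain = "${var.ad}"               # placeholder
  compartment_id      = "${var.compartment_ocid}" # placeholder
  display_name        = "docker-tf-instance"
  shape               = "VM.Standard2.1"          # pick a shape you have quota for

  create_vnic_details {
    subnet_id        = "${var.subnet_ocid}"       # placeholder
    assign_public_ip = true
  }

  source_details {
    source_type = "image"
    source_id   = "ocid1.image.oc1.eu-frankfurt-1.aaaa..."  # full image OCID goes here
  }

  metadata {
    ssh_authorized_keys = "${file(var.ssh_public_key_path)}"  # assumed variable
  }
}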

With the information at hand I can create the Ubuntu VM and connect to it using the specified SSH key. Have fun!

#DOAG2018 Impressions

As every year in November, the database circus moved to Nuremberg for the annual DOAG conference. As you may know, this conference sets very high standards in terms of turnout and top-notch speakers, and it met them once again: it was a great event!

It started with the welcome keynote, where Stefan Kinnen told us that the conference attracted over 2,000 attendees and more than 400 speakers from all over the world. That should make it the largest database conference in Europe, I suppose!

[Image: https://uhesse.files.wordpress.com/2018/11/doag_keynote.jpeg]

DOAG welcome keynote (Stefan Kinnen speaking)

I went on to attend Tirthankar Lahiri, who presented on the In-Memory column store.

To me, the In-Memory expressions and In-Memory external tables sounded particularly useful here.

The next talk was given by Martin Widlake, about performance tuning:

[Image: https://uhesse.files.wordpress.com/2018/11/martin.jpg]

Martin Widlake walking on stage

I liked his statement that, in order to solve performance problems, the key skill you need is not technical in the first place but more of a social nature: you need to thoroughly understand the problem and listen carefully. Secondly, some technical expertise is helpful, of course.

Happy Thanksgiving!

Just a quick blog post from Wellington, New Zealand, where we have just wrapped up the 2018 APAC Groundbreakers tour. A small but enthusiastic crowd was a great way to finish the event.


Obviously our American counterparts in the Oracle community have other things on their mind this weekend, celebrating Thanksgiving. And without trying to be tooooo corny and cheesy about it, the thanks I give is for 18c XE! A free database for all – the perfect springboard for developers and DBAs everywhere to explore, learn and experiment with the Oracle database.


This is why I mentioned the APAC tour at the start of this post. Whilst many will use 18c XE for a fully fledged database implementation for their free applications, another benefit I see of XE is that it can be a catalyst for all of us to build and learn, and then share our experiences with the community, whether via blog posts, videos, or content for your local user group.

So how about making this a pledge for Thanksgiving, or perhaps a New Year’s resolution? Namely: download 18c XE, learn something new about the Oracle Database, and then share that with the community. It’s free, it’s easy, and we all benefit!

(Thanks to Sandesh for the NZ pic)

AUSOUG “Oracle Master” Award

  I was very flattered and honoured to be named one of the first 4 “Oracle Masters” by the Australian Oracle User Group at the recent AUSOUG Connect 2018 conference in Melbourne. Thank you to all the AUSOUG members involved for this award. As always, it’s a pleasure to help fellow Oracle Database professionals in […]

18c and the ignoring of hints

 

One of the new features in 18c is the ability to ignore any optimizer hints in a session or across the entire database. A motivation for this feature is obviously our own Autonomous Data Warehouse, where we want to optimize queries without the potential “baggage” of user nominated hints strewn throughout the code.
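For reference, the parameter can be set at either scope; the session-level form is the one used in the tests below:

alter session set optimizer_ignore_hints = true;  -- this session only
alter system  set optimizer_ignore_hints = true;  -- the entire database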

This would seem a fairly easy function to implement: as we parse the SQL, simply rip out anything that is a comment structured as a hint. At the Perth Oracle User Group conference yesterday, I had an interesting question from an attendee: if all optimizer hints are being ignored, does that mean every hint will be ignored? In particular, what about the (very useful) QB_NAME hint? If we are just stripping out anything in hint format, we will lose those as well.

So it’s time for a test!

I’ll start with the default of optimizer hints being respected.


SQL> show parameter optimizer_ignore_hints

NAME                                 TYPE        VALUE
------------------------------------ ----------- -----
optimizer_ignore_hints               boolean     FALSE

Using the standard SCOTT.EMP table, we’ll run a query that, by default, will use an indexed access path (after all, we’re doing a simple primary key lookup).


SQL> select * from scott.emp s
  2  where empno = 123;

no rows selected

To prove that an index was used, we’ll examine the just executed query via DBMS_XPLAN.DISPLAY_CURSOR.


SQL> select * from dbms_xplan.display_cursor();

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  353khf08wdpq1, child number 0
-------------------------------------
select * from scott.emp s where empno = 123

Plan hash value: 4024650034

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     1 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    37 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK |     1 |       |     0   (0)|          |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=123)


19 rows selected.

The next test is to validate that I’m correctly specifying any hints, so I’ll run the same SQL with a FULL hint and double check that the hint is being respected by the optimizer.


SQL> select /*+ FULL(s) */ * from scott.emp s
  2  where empno = 123;

no rows selected

SQL>
SQL> select * from dbms_xplan.display_cursor();

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID  fynghufd48tkr, child number 0
-------------------------------------
select /*+ FULL(s) */ * from scott.emp s where empno = 123

Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     3 (100)|          |
|*  1 |  TABLE ACCESS FULL| EMP  |     1 |    37 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("EMPNO"=123)


18 rows selected.

So far so good. Now to see the effect of the new 18c parameter ‘optimizer_ignore_hints’. I’ll set it to true for this session, and then re-execute the previous query that had the FULL hint specified.


SQL> alter session set optimizer_ignore_hints = true;

Session altered.

SQL>
SQL> select /*+ FULL(s) */ * from scott.emp s
  2  where empno = 123;

no rows selected

SQL>
SQL> select * from dbms_xplan.display_cursor();

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
SQL_ID  fynghufd48tkr, child number 1
-------------------------------------
select /*+ FULL(s) */ * from scott.emp s where empno = 123

Plan hash value: 4024650034

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     1 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    37 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK |     1 |       |     0   (0)|          |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=123)


19 rows selected.

You can see that the FULL hint was ignored, and the original index access path has been restored.

Which brings us to the final test that motivated this blog post. What happens when the hint text contains a QB_NAME hint (which we would like to be preserved) and a FULL hint (which we would like to be ignored)?


SQL>
SQL> select /*+ QB_NAME(MY_QUERY_NAME) FULL(s) */ * from scott.emp s
  2  where empno = 123;

no rows selected

SQL>
SQL> select * from dbms_xplan.display_cursor(format=>'ALL');

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  9kg1866kgdmhs, child number 0
-------------------------------------
select /*+ QB_NAME(MY_QUERY_NAME) FULL(s) */ * from scott.emp s where
empno = 123

Plan hash value: 4024650034

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |       |       |     1 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| EMP    |     1 |    37 |     1   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | EMP_PK |     1 |       |     0   (0)|          |
--------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

   1 - MY_QUERY_NAME / S@MY_QUERY_NAME
   2 - MY_QUERY_NAME / S@MY_QUERY_NAME

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("EMPNO"=123)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

   1 - "EMPNO"[NUMBER,22], "S"."ENAME"[VARCHAR2,10], "S"."JOB"[VARCHAR2,9],
       "S"."MGR"[NUMBER,22], "S"."HIREDATE"[DATE,7], "S"."SAL"[NUMBER,22],
       "S"."COMM"[NUMBER,22], "S"."DEPTNO"[NUMBER,22]
   2 - "S".ROWID[ROWID,10], "EMPNO"[NUMBER,22]


34 rows selected.

Using the extended format of “ALL”, you can see that the QB_NAME information was preserved even though the FULL hint was ignored as desired.

Thus it is not just a simple “rip out all of the hint text” implementation.
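(If you’re following along in your own session, remember to switch the parameter back to its default afterwards.)

SQL> alter session set optimizer_ignore_hints = false;

Session altered.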

Oracle Privilege Analysis Now Free in EE from 18c and back ported to all 12c

Wow!! I just got an email from someone in Oracle letting me know that the licensing of the Privilege Analysis feature of Database Vault changed this week: it is now free as part of Enterprise Edition....[Read More]

Posted by Pete On 20/11/18 At 10:06 PM

Table order

Over the last few days I’ve highlighted on Twitter a couple of older posts showing how a change in the order that tables appear in the from clause could affect the execution plan of a query. In one case the note was purely theoretical, describing a feature of the way the optimizer works with simple query blocks; in the other the note was about an anomaly with table elimination that could appear with both “ANSI” and “traditional” Oracle syntax.

Here’s another note that might be more generally useful – an example of an odd side effect of ordering and “ANSI” syntax, with a suggestion for a pattern for writing ANSI SQL. It’s based on a test I wrote to play around with a problem that showed up on the Oracle database forum more than six years ago and shows a strange inconsistency. The setup is a little long-winded as the example involves 4 tables, so I’ll leave the script to create, load and index the tables to the end of the note. Here’s the query that introduced the problem; it’s a fairly straightforward 4 table join with two (left) outer joins:


select
        episode.episode_id , episode.cross_ref_id , episode.date_required ,
        product.number_required,
        request.site_id
from
        episode
left join
        request
on      episode.cross_ref_id = request.cross_ref_id
join
        product
ON      episode.episode_id = product.episode_id
left join
        product_sub_type
ON      product.prod_sub_type_id = product_sub_type.prod_sub_type_id
where
        episode.department_id = 2
and     product.status = 'I'
order by
        episode.date_required
;

And here’s the execution plan:


----------------------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         | 33333 |  1725K|       | 17135   (4)| 00:00:01 |
|   1 |  SORT ORDER BY       |         | 33333 |  1725K|  2112K| 17135   (4)| 00:00:01 |
|*  2 |   HASH JOIN OUTER    |         | 33333 |  1725K|  1632K| 16742   (4)| 00:00:01 |
|*  3 |    HASH JOIN         |         | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL| PRODUCT | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| EPISODE |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    TABLE ACCESS FULL | REQUEST |  4000K|    57M|       | 13542   (3)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The first thing you’ll notice, of course, is that the plan reports a three table join. Thanks to various referential integrity constraints, the absence of the table in the final select list, and the nature of the join to that table, the optimizer has determined that the product_sub_type table could be eliminated from the join without changing the result set.

What you can’t tell from the plan is that there’s an index on the request table that holds all the columns needed to satisfy the query, and an index fast full scan on the index would be significantly more efficient than the tablescan that appears at operation 6.
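As a quick sanity check of that claim (my own addition, not part of the original note), you could compare the two access paths in isolation, since ix4_request covers cross_ref_id and site_id – the only columns the query needs from request:

select /*+ full(request) */ count(site_id) from request;

select /*+ index_ffs(request ix4_request) */ count(site_id) from request;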

Having noticed from the plan that product_sub_type is redundant, the obvious thing to do before investigating further is to rewrite the statement to remove the table. Here’s the execution plan for the rewritten query:

----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("EPISODE"."CROSS_REF_ID"="REQUEST"."CROSS_REF_ID"(+))
   3 - access("EPISODE"."EPISODE_ID"="PRODUCT"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

So: when the optimizer removes product_sub_type from the query, the plan reports a tablescan of request; when we remove product_sub_type ourselves, the plan reports an index fast full scan of an appropriate index, at roughly one seventh (1,932/13,542) of the cost of the tablescan. It’s a little surprising that the optimizer didn’t get it right by itself – but “ANSI” style SQL often displays quirky little side effects because of the way the optimizer transforms it into traditional Oracle style.

We could stop at that point, of course, but then you’d wonder about the significance of the title of the post. So let’s play around with the join order of the original query, without removing the product_sub_type table.

As a general strategy (though not an absolute rule) I tend to arrange code so that outer joins don’t appear before “inner” joins. In this example that means I would have written the original statement as follows:


select
        episode.episode_id, episode.cross_ref_id, episode.date_required,
        product.number_required,
        request.site_id
from
        episode
join
        product
ON      product.episode_id = episode.episode_id
left join
        product_sub_type
ON      product_sub_type.prod_sub_type_id = product.prod_sub_type_id
left join
        request
on      request.cross_ref_id = episode.cross_ref_id
where
        episode.department_id = 2
and     product.status        = 'I'
order by
        episode.date_required
;

All I’ve done is move the join between episode and product up the SQL, following it with the outer join to product_sub_type, finally closing with the outer join between episode and request. Here’s the execution plan – which you might expect to look exactly like the original plan:


----------------------------------------------------------------------------------------------
| Id  | Operation              | Name        | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |             | 33333 |  1725K|       |  5525   (6)| 00:00:01 |
|   1 |  SORT ORDER BY         |             | 33333 |  1725K|  2112K|  5525   (6)| 00:00:01 |
|*  2 |   HASH JOIN OUTER      |             | 33333 |  1725K|  1632K|  5132   (7)| 00:00:01 |
|*  3 |    HASH JOIN           |             | 33333 |  1236K|       |   436   (8)| 00:00:01 |
|*  4 |     TABLE ACCESS FULL  | PRODUCT     | 33333 |   325K|       |    54  (12)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL  | EPISODE     |   300K|  8203K|       |   375   (6)| 00:00:01 |
|   6 |    INDEX FAST FULL SCAN| IX4_REQUEST |  4000K|    57M|       |  1932   (7)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("REQUEST"."CROSS_REF_ID"(+)="EPISODE"."CROSS_REF_ID")
   3 - access("PRODUCT"."EPISODE_ID"="EPISODE"."EPISODE_ID")
   4 - filter("PRODUCT"."STATUS"='I')
   5 - filter("EPISODE"."DEPARTMENT_ID"=2)

The product_sub_type table has been eliminated and we’re doing an index fast full scan of the ix4_request index instead of a tablescan of the much larger request table.

tl;dr

Changing the order of the tables in an ANSI join – especially when there are outer joins involved – could make a significant difference to the way the query is transformed and optimised. While it is nice to write the table ordering so that “chains” of joins are easily visible, bear in mind that re-ordering the join to postpone outer joins may be enough to help the optimizer produce a better execution plan.

Footnote

If you want to play around with the example, here’s the code to create and load the tables. The code doesn’t follow my usual style as most of it is cut-n-pasted from the Oracle forum thread:


rem
rem     script:         Ansi_outer_5.sql
rem     Dated:          July 2012
rem     Author:         Jonathan Lewis
rem
rem     Last tested
rem             18.3.0.0        iffs still not used by default
rem             12.2.0.1        iffs still not used by default
rem

create table episode (
        episode_id number (*,0),
        department_id number (*,0),
        date_required date,
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        constraint pk_episode primary key (episode_id)
)
;

create table product_sub_type (
        prod_sub_type_id number (*,0),
        sub_type_name varchar2 (20),
        units varchar2 (20),
        padding varchar2 (80),
        constraint pk_product_sub_type primary key (prod_sub_type_id)
)
;

create table product (
        product_id number (*,0),
        prod_type_id number (*,0),
        prod_sub_type_id number (*,0),
        episode_id number (*,0),
        status varchar2 (1),
        number_required number (*,0),
        padding varchar2 (80),
        constraint pk_product primary key (product_id),
        constraint nn_product_episode check (episode_id is not null) 
)
;

alter table product add constraint fk_product 
        foreign key (episode_id) references episode (episode_id)
;

alter table product add constraint fk_prod_sub_type
        foreign key (prod_sub_type_id) references product_sub_type (prod_sub_type_id)
;

create table request (
        request_id number (*,0),
        department_id number (*,0),
        site_id number (*,0),
        cross_ref_id varchar2 (11),
        padding varchar2 (80),
        padding2 varchar2 (80),
        constraint pk_request primary key (request_id),
        constraint nn_request_department check (department_id is not null),
        constraint nn_request_site_id check (site_id is not null)
)
;

prompt  ===================
prompt  Loading episode ...
prompt  ===================

insert /*+ append */ into episode
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 2,
    sysdate + mod (r, 14),
    to_char (r, '0000000000'),
    'ABCDEFGHIJKLMNOPQRSTUVWXYZ' || to_char (r, '000000')
  from generator g
where g.r <= 3e5
/ 

commit;

prompt  ============================
prompt  Loading product_sub_type ...
prompt  ============================

insert /*+ append */ into product_sub_type
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, 
       to_char (r, '000000'),
       to_char (mod (r, 3), '000000'),
       'ABCDE' || to_char (r, '000000')
  from generator g
where g.r <= 15
/ 

commit;

prompt  ===================
prompt  Loading product ...
prompt  ===================

insert /*+ append */ into product
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e6
       ) 
select r, mod (r, 12) + 1, mod (r, 15) + 1, mod (r, 300000) + 1,
       decode (mod (r, 3), 0, 'I', 1, 'C', 2, 'X', 'U'),
       dbms_random.value (1, 100), NULL
  from generator g
where g.r <= 1e5
/ 

commit;

prompt  ===================
prompt  Loading request ...
prompt  ===================

insert /*+ append */ into request
with generator as 
(select rownum r
          from (select rownum r from dual connect by rownum <= 1000) a,
               (select rownum r from dual connect by rownum <= 1000) b,
               (select rownum r from dual connect by rownum <= 1000) c
         where rownum <= 1e7
       ) 
select 
        r, mod (r, 4) + 1, 1, to_char (r, '0000000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01234567890123456789' || to_char (r, '000000'),
        'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789012345678' || to_char (r, '000000')
  from generator g
where g.r <= 4e6
/ 

commit;

create index ix1_episode_cross_ref on episode (cross_ref_id);

create index ix1_product_episode on product (episode_id);
create index ix2_product_type on product (prod_type_id);

create index ix1_request_site on request (site_id);
create index ix2_request_dept on request (department_id);
create index ix3_request_cross_ref on request (cross_ref_id);
create index ix4_request on request (cross_ref_id, site_id);

exec dbms_stats.gather_schema_stats ('test_user')

Note that there is a call to gather_schema_stats() at the end, rather than a set of 4 calls to gather_table_stats(); you may want to change this. The entire data set, including indexes, will need about 1.5GB of free space.
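For reference, the per-table version would look something like this (same schema owner, table by table):

exec dbms_stats.gather_table_stats('test_user', 'episode')
exec dbms_stats.gather_table_stats('test_user', 'product_sub_type')
exec dbms_stats.gather_table_stats('test_user', 'product')
exec dbms_stats.gather_table_stats('test_user', 'request')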

 

How Not to Build A(n Autonomous) Data Warehouse

My day job involves investigating and resolving performance problems, so I get to see a lot of bad stuff.  Often, these problems have their roots in poor design.  It is not surprising, but is nonetheless disappointing, that when I point this out I am told that the system is either delivered this way by the vendor, or that it has already been built and it is too late to change.
In the last couple of years, I have worked on several data warehouse applications that have provided the inspiration for a new presentation that I am giving at the DOAG and UKOUG conferences this year.
The presentation and this series of related blogs have several objectives:

  • Partly, it is an attempt to get some advice out in the hope that some of these poor design decisions are not made in the first place. 
  • I get to have a bit of a rant about how to design a data warehouse properly, and I will feel better at the end of it.
  • I get to attend two really good conferences where I will learn (or be reminded of) much.

This particular blog post is an index to the various topics within the presentation that I have turned into separate blog posts.

Oracle Sales History (SH) Sample Schema

I cannot illustrate issues with actual examples from client systems, no matter how well anonymised they are. So, I have recreated some of the most egregious mistakes using the Oracle Sales History example schema, SH.  Everything I am going to show you does have an origin in the real world!  The SH sample schema is a simple and very typical star schema.  It is well designed and implemented, and represents good practice. I encourage you to install and play with it for yourself.
My examples use a single fact table (SALES) that has five dimensions around it (CUSTOMERS, CHANNELS, PRODUCTS, PROMOTIONS and TIMES), although I will only use three. COUNTRIES is a dimension on the CUSTOMERS dimension, sometimes called a 'snowflake'.
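For anyone who hasn't met SH before, a typical star query against it looks something like this (my illustration, using column names from the standard schema, not one of the examples from the presentation):

select   t.calendar_year, p.prod_category,
         sum(s.amount_sold) as total_sold
from     sales s
join     times t    on t.time_id = s.time_id
join     products p on p.prod_id = s.prod_id
group by t.calendar_year, p.prod_category
order by t.calendar_year, p.prod_category;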

In some of my examples, I have deliberately broken the data model in exactly the same way that I have seen it broken in real life. So, if you find yourself saying "nobody would actually do that!", let me assure you that they did!

Common Mistakes in Data Warehouse Design and Build

As each blog post is published, I will add a link to it in this section.