Oakies Blog Aggregator

Friday Philosophy – Are Leaving Presentations A Quaint British Tradition?

A few days ago a friend, Robert Lockard, started a discussion on Facebook about bad bosses and the strange things they did. I mentioned how one of my first bosses had refused to do my leaving presentation, arguing with his boss (very loudly so all could hear, despite it being in “an office” – a plastic box in the corner) that he did not want to be nice to me and do the presentation. Neither did his boss!

Another friend of mine, Jeff Smith (oh he of SQL*Developer fame) replied to my comment saying “what is a ‘leaving presentation’ – they let you get up in front of everyone and invite them to kiss your a$$ goodbye? Because, that sounds pretty amazing”.
That took me by surprise; it had never occurred to me that “leaving presentations” were not universal. That’s one of the great things about the global reach of social media: it helps you realise that so many things you think are normal are, in fact, local to your region and are seen as bizarre in other cultures.

So that made me wonder: how common are “leaving presentations”? I could have asked on Twitter or mailed a few friends, but I have this blog thing I can use…

I suppose I should describe what a “leaving presentation” is, in case other cultures do the same but call it something else (or just do it without a title). So:-

For most of my working life in the UK, if you are a permanent member of staff and it is known you are leaving (so it’s not a case of you being sacked) and it is only you (so it’s not a case of the company slashing the workforce) then “Shirley from Accounts” will take charge and buy a card. I don’t know why, but it is nearly always a woman who gets the card, and the same woman organises most people’s leaving cards. The card is sent around the office in an envelope and people sign it (maybe adding some words like “begone foul demon”) and, depending on the organisation, there may be a collection made at the same time. It is beholden on you, the leaver, to pretend never to notice the card going around (or to check how the collection is going. I did know one guy who put money into his own collection to make himself seem more popular!).

Back in the 1980s/90s, the places I worked at did not have email – not even internal systems on the mainframe – so lots of envelopes would be going around with company memos or things you had to read and sign to say you had read them. So the odd card going around was easy to ignore. These days, of course, everything is email, so the last few times I noticed a card going around it stuck out like a sore thumb and you knew it was a leaving (or “congratulations” or “get well soon”) card.

Then on your last day your boss gets the team around, (s)he says you did not steal much and you did not piss off all the users, they give you the card and pretend to care what happens to you in the future. People then clap politely. This is not the US; there is no whooping or saying it is the saddest day of their lives. If a collection was made, they will have bought you something with it. It is always almost, but not quite totally, useless. You might have to make a small speech and then, the best bit, you take them down the pub and buy everyone a drink (it used to happen at lunch time but now it tends to be at the end of the day). The round generally costs you more than the collection they gathered for you. Sadly, this last bit seems to be dying out.

As a contractor/external consultant you tend to avoid the mild discomfort of it all as you are not around long enough to become part of the team and, well, it’s just not done for over-paid contractors.

Personally, I have always found the whole thing a bit weird and, if I am the leaver, mildly uncomfortable. I try to avoid the whole thing by keeping my exit quiet or stealing enough stationery that management do not feel I deserve a leaving presentation. Of course, in the case I cite above, I nearly avoided it just by making my bosses hate me. Ho hum.

But I do still try and do one bit, the “taking people to the pub” at the end of the day, even when I am a contractor and we are not supposed to get leaving presentations.

So what, if anything, do they do where you are? Is the leaver expected to do something (bring in cake, kiss everyone, do a dance)? Do you have a tradition that is eminently sensible and common in your country but, now you come to think of it, maybe a touch strange? Or do people just leave quietly and no one notices much – except for the scramble for their chair or higher-res screen?

Latest Delphix Blog Post and Upcoming Webinars

In my latest blog post on the Delphix site, I continue my conversation about why DevOps is the next step for DBAs and how DBAs can embrace this next step in their evolution.

This is an extensive series of blog posts (four so far), to be followed by an ebook, a podcast and two webinars. One, to be announced soon, is from Oracle, called “The DBA Diaries”, and the other will be from Delphix, titled “The Revolution: From Databases and DevOps to DataOps”.

The goal of all this is to ease the transition for the database community as the brutal shift to the cloud, now underway, changes our day-to-day lives. Development continues to move at an ever-accelerating pace, and yet the DBA is standing still, waiting for the data to catch up with it all. This is a concept that many refer to as “data gravity”.


The term was coined just a few years ago by Dave McCrory, a senior VP of platform engineering. It started as an open discussion aimed at understanding how data impacts the way technology changes when connected with network, software and compute.

He describes the basic limit that “the speed with which information can get from memory (where data is stored) to computing (where data is acted upon) is the limiting factor in computing speed”, known as the von Neumann bottleneck.

These are essential concepts that I believe all DBAs and developers should understand, as data gravity impacts all of us. It’s the reason for many enhancements to database, network and compute power. It’s the reason optimization specialists are in such demand. Other tasks such as backup, monitoring and error handling can be automated, but however much logic we drive into programs, nothing beats true optimization skill when it comes to eliminating data gravity issues. Less data, less weight – it’s as simple as that.

We all know the cloud discussions are coming, and with them even bigger challenges imposed by the gravity of data. Until then, let’s take a step back and recognize that we need some new goals and some new skills. If you’d like to learn more about data gravity but don’t have time to take it all in at once, consider following the topic on Twitter, where it is curated by Dave McCrory.

I’m off to Jacksonville, FL tomorrow to speak at SQL Saturday #649!


Copyright © DBA Kevlar [Latest Delphix Blog Post and Upcoming Webinars], All Rights Reserved. 2017.

The post Latest Delphix Blog Post and Upcoming Webinars appeared first on DBA Kevlar.

Installation of Rundeck with the Ansible plugin on CentOS 7

This post shows you how to install Rundeck with the Ansible plugin on CentOS 7. The installation uses nginx as the web server and SSL with a self-signed certificate. Please read the Ansible installation script and modify anything that should be different for your situation. You will be amazed how readable the installation script is!

Rundeck is a web-based user interface that allows you to run commands against a group of hosts. Rundeck has an Ansible plugin; using that plugin, it can provide functionality similar to Ansible Tower (a commercial product) or Semaphore (open source).

After a fresh installation of CentOS 7, do the following as root:

yum -y localinstall https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum -y install ansible git
git clone https://gitlab.com/FritsHoogland/install_rundeck.git

This adds the EPEL (Extra Packages for Enterprise Linux) repository to CentOS, which contains ansible and git. The next yum command installs ansible (the orchestration software) and git (distributed version control). The git clone command pulls the Ansible orchestration script from my install_rundeck git repository on GitLab.

It is important to open the install_rundeck/install_rundeck.yml script and modify anything that is different in your situation. The public hostname will most likely differ from the 192.168.66.16 in the script. You might want to change the locality information for the certificate generation (unless you live in Lutjebroek like me :-). For a corporate installation, you might want to skip this step altogether and get a certificate pair signed by your company’s CA.

Please mind that I ran into two issues with Ansible inventories:
– The hosts in the inventory need to have Ansible run against them to pick up their properties and become visible in Rundeck in the nodes tab. For Ansible to be able to run against the hosts in the inventory, the hosts need to exist in the SSH known_hosts file of the rundeck user; otherwise Ansible can’t run, and the hosts will not be visible in Rundeck. The solution is to log in as the rundeck user and log on to each host in the inventory once manually to accept the host key. From a security perspective it’s understandable that you need to carefully assess which hosts to trust, but from an automation standpoint this is annoying. Besides essentially filling out the known_hosts file as I described, there are several other workarounds; a sketch of one follows after this list.
– I created an Ansible inventory file in the Rundeck project directory and entered the hosts in it. Rundeck picked up the hosts (after solving the above point they became visible in the nodes tab); however, when executing something using Ansible via Rundeck, it would say ‘[WARNING]: provided hosts list is empty, only localhost is available’. This means Ansible was not pointed explicitly to an inventory, so it used the default one. In order to solve this, I symlinked my (Rundeck) project inventory to /etc/ansible/hosts to make it centrally available (see the sketch below). Apparently, using a central inventory for Ansible with the plugin is by design. I would rather have Rundeck generate an inventory per execution and point to it when the plugin executes Ansible.
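
A sketch of the known_hosts workaround from the first point, run as root. The host names are placeholders for the hosts in your inventory; note that ssh-keyscan blindly accepts whatever key a host presents, so only use this on a network you trust:

sudo -H -u rundeck sh -c 'mkdir -p ~/.ssh && ssh-keyscan -H host1 host2 >> ~/.ssh/known_hosts'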
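
And a sketch of the inventory symlink from the second point, assuming a project named myproject under a projects directory of /var/rundeck/projects (both paths are examples; adjust to your installation):

# keep a copy of any existing default inventory
mv /etc/ansible/hosts /etc/ansible/hosts.orig
# make the rundeck project inventory the central ansible inventory
ln -s /var/rundeck/projects/myproject/inventory /etc/ansible/hosts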

Now install rundeck:

ansible-playbook install_rundeck/install_rundeck.yml

Done!
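
As a quick sanity check afterwards (a sketch, assuming the playbook enables the standard rundeckd and nginx services, and using the example address from the script):

systemctl status rundeckd nginx
curl -k https://192.168.66.16/    # -k because the certificate is self-signed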

Tagged: automation, devops, linux, oracle, rundeck

Postgres vs. Oracle access paths VI – Index Scan

In the previous post my queries were still reading the indexed column only, from a table which had had no modifications since the last vacuum, and therefore didn’t need to read table pages: that was the Index Only Scan. However, we often need more columns than the ones that are in the index. Here is the Index Scan access path.

I’m continuing with the table that I created in the first post of the series. I’ve run VACUUM (the lazy one, not the full one) and made no modifications after that, as we have seen that Index Only Scan is efficient only when there are no modifications.

create table demo1 as select generate_series n , 1 a , lpad('x',1000,'x') x from generate_series(1,10000);
SELECT 10000
create unique index demo1_n on demo1(n);
CREATE INDEX
vacuum demo1;
VACUUM

I have 10000 rows, a unique column N with decimal numbers, indexed and another column A which is not indexed.

Index Only Scan

I’ll now query one row, the one with N=1000.

explain (analyze,verbose,costs,buffers) select n from demo1 where n=1000 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..4.30 rows=1 width=4) (actual time=0.123..0.124 rows=1 loops=1)
Output: n
Index Cond: (demo1.n = 1000)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.625 ms
Execution time: 0.137 ms

It seems that the query planner estimates that it will read one block:

  • The startup cost of 0.29 as we have seen before
  • Read one index page, cost=4 (random_page_cost=4)
  • 1 result row to process, estimated at cpu_tuple_cost=0.01
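
These components add up to the displayed total: 0.29 + 4 + 0.01 = 4.30.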

As the index is a B*Tree with 30 pages, I expect to read at least one branch in addition to the leaf block. The execution has actually read 3 blocks (Buffers: shared hit=3). Here it seems that Postgres decides to ignore the branches and count only the leaf blocks.
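
As a sketch, one way to check the index structure, assuming the pgstattuple extension is available and the database is named demo (both are assumptions):

psql demo <<'SQL'
-- inspect the B-tree structure of demo1_n (needs the pgstattuple extension)
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT tree_level, internal_pages, leaf_pages FROM pgstatindex('demo1_n');
SQL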

In Oracle, the estimation is cost=1 and the execution has read 2 blocks:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gusay436hpzck, child number 0
-------------------------------------
select /*+ */ n from demo1 where n=1000
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 (100)| 1 |00:00:00.01 | 2 |
|* 1 | INDEX UNIQUE SCAN| DEMO1_N | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("N"=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

Both Oracle and Postgres read only the index here. This is the fastest access to one indexed column: no need to read the table because the column is in the index. The use case is quite limited though: just testing the existence of a value. I will now select a column other than the one used in the where clause.

Select another column

I filter on N but now query the column A which is not in the index. The Index Only Scan changes to an Index Scan:

explain (analyze,verbose,costs,buffers) select a from demo1 where n=1000 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..8.30 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
Output: a
Index Cond: (demo1.n = 1000)
Buffers: shared hit=3
Planning time: 0.639 ms
Execution time: 0.030 ms

The cost is the same except that there is one additional page to read, which pushes it to cost=8.30:

  • The startup cost of 0.29 as we have seen before
  • Read one index page, and one table page: cost=8 (random_page_cost=4)
  • 1 result row to process, estimated at cpu_tuple_cost=0.01

In Oracle, the index operation is not different. We still have the INDEX UNIQUE SCAN, but in addition to it there is an extra operation to read the table: TABLE ACCESS BY INDEX ROWID. The index entry returns the ROWID (the physical address of the table block, equivalent to the Postgres TID), and then we have the detail of the cost and the execution buffer reads: one more block.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8q4tcxgk1n1vn, child number 0
-------------------------------------
select /*+ */ a from demo1 where n=1000
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 3 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEMO1 | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
|* 2 | INDEX UNIQUE SCAN | DEMO1_N | 1 | 1 | 1 (0)| 1 |00:00:00.01 | 2 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22]
2 - "DEMO1".ROWID[ROWID,10]

The important thing here is in the predicate information, where we see the part of the where clause which is not a filter applied after the scan, but which is used for optimal access by the index. It is displayed as access() in the Oracle execution plan:

access("N"=1000)

In the PostgreSQL execution plan, the same information is displayed as ‘Index Cond’:

Index Cond: (demo1.n = 1000)

Postgres Range Scan

That was retrieving only one row with an equality predicate on a unique index column. The index scan helps to get directly to the value because of the B*Tree structure. As the index is sorted, an inequality predicate can also use the index to find the rows in a range of values.

The Postgres plan looks the same, with Index Scan:

explain (analyze,verbose,costs,buffers) select a from demo1 where n<=1000 ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Index Scan using demo1_n on public.demo1 (cost=0.29..175.78 rows=1000 width=4) (actual time=0.029..0.780 rows=1000 loops=1)
Output: a
Index Cond: (demo1.n <= 1000)
Buffers: shared hit=147
Planning time: 1.019 ms
Execution time: 0.884 ms

Same plan, but of course we have more index blocks to scan and more rows to fetch from the table, which is why the cost is higher.

In order to understand the cost, I’ve changed the query planner constants one by one. Here is what I got (a sketch of the method follows the list):

  • (cost=0.29..33.78 rows=1000 width=4) when seq_page_cost=0 instead of 1, which means that it estimates (175.78-33.78)/1=142 sequential reads
  • (cost=0.29..159.78 rows=1000 width=4) when random_page_cost=0 instead of 4, which means that it estimates (175.78-159.78)/4=4 random reads
  • (cost=0.29..165.78 rows=1000 width=4) when cpu_tuple_cost=0 instead of 0.01, which means that it estimates (175.78-165.78)/0.01=1000 rows
  • (cost=0.29..170.78 rows=1000 width=4) when cpu_index_tuple_cost=0 instead of 0.005, which means that it estimates (175.78-170.78)/0.005=1000 index entries
  • (cost=0.00..173.00 rows=1000 width=4) when cpu_operator_cost=0 instead of 0.0025, which means that it estimates (175.78-173.00)/0.0025=1112 cpu operations (116 for initial cost + 996 to get all rows)
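
A minimal sketch of the method, assuming the table lives in a database named demo (each SET only affects the current session):

psql demo <<'SQL'
-- zero out one planner constant and compare the estimated cost
SET seq_page_cost = 0;
EXPLAIN (costs) SELECT a FROM demo1 WHERE n <= 1000;
-- restore the default before testing the next constant
RESET seq_page_cost;
SET random_page_cost = 0;
EXPLAIN (costs) SELECT a FROM demo1 WHERE n <= 1000;
RESET random_page_cost;
SQL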

I understand the 4 random reads from the index pages. However, I expected random reads, not sequential reads, to fetch the rows from the table. But this is a case where the clustering factor is very good: the rows have been inserted in the same order as the indexed column, which means that those table reads probably hit consecutive pages.

In order to validate this guess, I’ve traced the system calls on Linux.
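
As a sketch, one way to attach such a trace, assuming the backend PID (25734 here) has been identified first, for example with SELECT pg_backend_pid():

strace -e trace=open,lseek,read -p 25734 -o postgres_trace.log

The first calls open the table and the index files: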

25734 open("base/12924/42427", O_RDWR) = 42
25734 lseek(42, 0, SEEK_END) = 11706368
25734 open("base/12924/42433", O_RDWR) = 43
25734 lseek(43, 0, SEEK_END) = 245760

The file descriptor 42 is my table (demo1) and the descriptor 43 is the index (demo1_n). The file name is in the open() call and it includes the file id:

select relname,relfilenode from pg_class where relname='demo1';
-[ RECORD 1 ]--+------
relname | demo1
relfilenode | 42427
 
select relname,relfilenode from pg_class where relname='demo1_n';
-[ RECORD 1 ]--+--------
relname | demo1_n
relfilenode | 42433

Then we see some random reads from the index (branches and first leaf):

25734 lseek(43, 0, SEEK_SET) = 0
25734 read(43, "100036037360374 b152"..., 8192) = 8192
25734 lseek(43, 24576, SEEK_SET) = 24576
25734 read(43, "121000836360374 35023720330237 "..., 8192) = 8192
25734 lseek(43, 8192, SEEK_SET) = 8192
25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192

Then we see 53 reads from the table:

25734 lseek(42, 0, SEEK_SET) = 0
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

Only one lseek. The other reads are all single-block (8k) I/O calls, but without a seek, which means that they are sequential. When relying on filesystem prefetching, this may avoid the latency of each I/O call.

Then the next leaf block from the index is read, and then 52 reads from the table (no lseek):

25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

And again, one index block and 38 contiguous table blocks:

25734 lseek(43, 32768, SEEK_SET) = 32768
25734 read(43, "13245t360374 211 340237 "..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
25734 read(42, "40042203 4 36023330103402273010"..., 8192) = 8192
...

Here is the summary of the cost of 175.78:

  • The startup cost of 0.29 as we have seen before
  • Estimates 4 random reads (reading 1000 rows from a 30-page index which contains 10000 rows): cost=16 (random_page_cost=4)
  • Estimates 142 sequential reads: cost=142 (seq_page_cost=1)
  • 1000 index entries to process, estimated at cost=5 (cpu_index_tuple_cost=0.005)
  • 1000 result rows to process, estimated at cost=10 (cpu_tuple_cost=0.01)
  • about 1000 operators or functions, estimated at cpu_operator_cost=0.0025
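
A quick check of the arithmetic, using the 996 per-row operator estimate from the experiment above: 0.29 + 16 + 142 + 5 + 10 + 996 × 0.0025 = 175.78, which matches the total cost displayed in the plan.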

The very interesting thing here is that the query planner is fully aware of the clustering factor and uses a sequential read estimation.

Oracle Range Scan

Here is the same query on the similar table on Oracle:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID a3gqx19xs9wxq, child number 0
-------------------------------------
select /*+ */ a from demo1 where n<=1000
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 147 (100)| 1000 |00:00:00.01 | 148 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| DEMO1 | 1 | 1000 | 147 (0)| 1000 |00:00:00.01 | 148 |
|* 2 | INDEX RANGE SCAN | DEMO1_N | 1 | 1000 | 4 (0)| 1000 |00:00:00.01 | 4 |
----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N"<=1000)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "A"[NUMBER,22]
2 - "DEMO1".ROWID[ROWID,10]

The strace output shows calls to pread:

open("/u01/oradata/CDB1A/PDB/users01.dbf", O_RDWR|O_DSYNC) = 7
fcntl(7, F_SETFD, FD_CLOEXEC) = 0
fcntl(7, F_DUPFD, 256) = 258
fcntl(258, F_SETFD, FD_CLOEXEC) = 0
close(7) = 0
pread(258, "62422313G275"142532'!1?275""..., 8192, 2252800 ) = 8192
pread(258, "62422413C275"14x2432'!1?275""..., 8192, 2260992 ) = 8192
pread(258, "6242313v3362274"24b+1&!1354274""..., 8192, 24731648 ) = 8192
pread(258, "6242314v3362274"24e*1&!1354274""..., 8192, 24739840 ) = 8192
pread(258, "6242315v3362274"24d51&!1354274""..., 8192, 24748032 ) = 8192
pread(258, "6242316v3362274"24g41&!1354274""..., 8192, 24756224 ) = 8192
pread(258, "6242317v3362274"24f71&!1354274""..., 8192, 24764416 ) = 8192
pread(258, "6242320v3362274"24y71&!1354274""..., 8192, 24772608 ) = 8192

pread is similar to lseek()+read() here and, as far as I know, Linux detects when there is no need to seek, which allows prefetching as well. Oracle also has its own prefetching, but I’ll not go into the details here (read Timur Akhmadeev on the Pythian blog about this).

With Oracle, there is no need to run strace because all system calls are instrumented as ‘wait events’.
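
A sketch of how such a trace can be produced (the demo credentials are placeholders; event 10046 at level 8 includes wait events, and the trace file is written to the diag trace directory):

sqlplus demo/demo <<'SQL'
-- enable extended SQL trace, including wait events, for this session
alter session set events '10046 trace name context forever, level 8';
select /*+ */ a from demo1 where n<=1000;
exit
SQL

Here is the resulting trace: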

PARSE #140375247563104:c=2000,e=1872,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=187737470,tim=53267437268
EXEC #140375247563104:c=0,e=147,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=187737470,tim=53267437481
WAIT #140375247563104: nam='SQL*Net message to client' ela= 4 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267437532
WAIT #140375247563104: nam='db file sequential read' ela= 8 file#=12 block#=275 blocks=1 obj#=74023 tim=53267437679
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=276 blocks=1 obj#=74023 tim=53267437785
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=3019 blocks=1 obj#=74022 tim=53267437902
FETCH #140375247563104:c=0,e=368,p=3,cr=3,cu=0,mis=0,r=1,dep=0,og=1,plh=187737470,tim=53267437977
WAIT #140375247563104: nam='PGA memory operation' ela= 14 p1=0 p2=0 p3=0 obj#=74022 tim=53267438017
WAIT #140375247563104: nam='SQL*Net message from client' ela= 280 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267438385
WAIT #140375247563104: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=74022 tim=53267438419
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3020 blocks=1 obj#=74022 tim=53267438443
WAIT #140375247563104: nam='PGA memory operation' ela= 7 p1=1114112 p2=2 p3=0 obj#=74022 tim=53267438475
WAIT #140375247563104: nam='db file sequential read' ela= 5 file#=12 block#=3021 blocks=1 obj#=74022 tim=53267438504
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3022 blocks=1 obj#=74022 tim=53267438532
WAIT #140375247563104: nam='db file sequential read' ela= 2 file#=12 block#=3023 blocks=1 obj#=74022 tim=53267438552
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3024 blocks=1 obj#=74022 tim=53267438576
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3025 blocks=1 obj#=74022 tim=53267438603
WAIT #140375247563104: nam='db file sequential read' ela= 26 file#=12 block#=3026 blocks=1 obj#=74022 tim=53267438647
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3027 blocks=1 obj#=74022 tim=53267438680
WAIT #140375247563104: nam='db file sequential read' ela= 2 file#=12 block#=3028 blocks=1 obj#=74022 tim=53267438699
WAIT #140375247563104: nam='db file sequential read' ela= 4 file#=12 block#=3029 blocks=1 obj#=74022 tim=53267438781
WAIT #140375247563104: nam='db file sequential read' ela= 3 file#=12 block#=3030 blocks=1 obj#=74022 tim=53267438807
WAIT #140375247563104: nam='db file sequential read' ela= 28 file#=12 block#=3031 blocks=1 obj#=74022 tim=53267438878
...

The name ‘sequential read’ does not mean the same as the Postgres ‘sequential read’. It only means single-block reads that are done one after the other, but they are actually random reads. However, looking at the block# they appear to read contiguous blocks.

In the end, because I have an index with a good clustering factor, and because I’m using the defaults on Linux without direct reads and asynchronous I/O, the execution is very similar to the Postgres one: read the few index blocks and follow the pointers to the 140 blocks of the table.

The cost estimation looks similar (same numbers) between Postgres and Oracle, but it is not the same unit. Postgres estimates the cost in units of sequential reads, whereas Oracle estimates the cost in units of random reads. In addition, Postgres, with its default planner parameters, gives more weight than Oracle to CPU usage.

This is the good case of index access, where we have a good clustering/correlation factor between the physical order of the table and the logical order of the index. The random reads end up behaving as sequential reads because there is no seek() between them. You can imagine that in the next post I’ll try the same with a very bad clustering factor.

 

The article Postgres vs. Oracle access paths VI – Index Scan appeared first on Blog dbi services.

Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX

We have seen in the previous post how an index can help avoid a sorting operation. That avoids a blocking operation: the startup cost is minimal and the first rows can be returned immediately. This is often desired when displaying rows on the user’s screen. Here is more about the Postgres startup cost, Oracle first_rows costing, and fetching first rows only.

Here is the execution plan we had in Oracle to get the values of N sorted. The cost for Oracle is the cost to read the index leaves: estimated at 46 random reads:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dbck3rgnqbakg, child number 0
-------------------------------------
select /*+ */ n from demo1 where n is not null order by n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 46 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 46 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

In PostgreSQL, we have two costs (cost=0.29..295.29):

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null order by n ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.194..2.026 rows=10000 loops=1)
Output: n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=30
Planning time: 1.190 ms
Execution time: 2.966 ms

I explained where the total cost (295.29) comes from:

  • The index on the column N has 30 blocks, which is estimated at cost=120 (random_page_cost=4)
  • We have 10000 index entries to process, estimated at cost=50 (cpu_index_tuple_cost=0.005)
  • We have 10000 result rows to process, estimated at cost=100 (cpu_tuple_cost=0.01)
  • We have evaluated 10000 ‘is not null’ conditions, estimated at cost=25 (cpu_operator_cost=0.0025)
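
A quick check: 120 + 50 + 100 + 25 = 295, and adding the 0.29 startup cost gives the 295.29 total.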

But the Postgres EXPLAIN also shows the startup cost (0.29), which is the cost before returning the first rows (only a few cpu_operator_cost here).

From that, I can guess that fetching 1 row will have the following cost:

  • The startup cost of 0.29
  • Read the first index page, cost=4 (random_page_cost=4)
  • 1 index entry to process at cpu_index_tuple_cost=0.005
  • 1 result row to process, estimated at cpu_tuple_cost=0.01
  • 1 ‘is not null’ condition, estimated at cpu_operator_cost=0.0025

This should be approximately cost=4.3075 for one row, roughly the cost to read one index page. We will see later that the query planner does not count this first index page.

Oracle First Rows

In Oracle, we have only the total cost in the execution plan, but we can estimate the cost to retrieve 1 row with the FIRST_ROWS(1) hint:


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0fjk9vv4g1q1w, child number 0
-------------------------------------
select /*+ first_rows(1) */ n from demo1 where n is not null order by
n
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 10000 |00:00:00.01 | 48 |
| 1 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 2 (0)| 10000 |00:00:00.01 | 48 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]

The cost here is small, estimated at 2 random reads (1 B*Tree branch and 1 leaf), which is sufficient to get the first row. Of course, I estimated it for 1 row but finally retrieved all rows (A-Rows=10000), reading all blocks (Buffers=48). However, my execution plan is optimized for fetching one row.

Fetch first rows

I can run the previous query and simply fetch only one row, but I can also explicitly limit the result to one row. In older versions of Oracle, you may have used the ‘rownum’ way of limiting rows, which implicitly adds the first_rows hint. Here I’m using the FETCH FIRST syntax, and I need to explicitly add the FIRST_ROWS() hint to get the plan optimized for that.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 9bcm542sk64az, child number 0
-------------------------------------
select /*+ first_rows(1) */ n from demo1 where n is not null order by n fetch first 1 row only
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 3 |
|* 1 | VIEW | | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
|* 2 | WINDOW NOSORT STOPKEY| | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 3 |
| 3 | INDEX FULL SCAN | DEMO1_N | 1 | 10000 | 2 (0)| 2 |00:00:00.01 | 3 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=1)
2 - filter(ROW_NUMBER() OVER ( ORDER BY "N")<=1)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "from$_subquery$_002"."N"[NUMBER,22], "from$_subquery$_002"."rowlimit_$$_rownumber"[NUMBER,22]
2 - (#keys=1) "N"[NUMBER,22], "DEMO1".ROWID[ROWID,10], ROW_NUMBER() OVER ( ORDER BY "N")[22]
3 - "DEMO1".ROWID[ROWID,10], "N"[NUMBER,22]

The cost is the same, estimated at 2 random reads, but we see how Oracle implements FETCH FIRST: with window functions. And only one row has been fetched (A-Rows), reading 3 blocks (Buffers). Note that because the index is sorted, the window function is a NOSORT operation.

Postgres

I can run the same query on PostgreSQL and get the execution plan:

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null order by n fetch first 1 row only;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..0.31 rows=1 width=4) (actual time=0.124..0.124 rows=1 loops=1)
Output: n
Buffers: shared hit=3
-> Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.124..0.124 rows=1 loops=1)
Output: n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.576 ms
Execution time: 0.143 ms

Here, the total cost of the query is lower than the total cost of the Index Only Scan, because we know we will not read all index entries. The total cost of the query (0.31) is based on the startup cost (0.29) of the index access. I suppose there is 0.01 for the cpu_tuple_cost, but I expected to see the cost of getting the first page, because we cannot get a row without reading the whole page. My guess is that Postgres divides the total cost (295) by the number of rows (10000) and uses that as a per-row estimation. This makes sense for a lot of rows but underestimates the cost of getting the first page.

In order to validate my guess, I force a Seq Scan (which has a higher total cost) and fetch the first 5 rows:

explain (analyze,verbose,costs,buffers) select n from demo1 where n is not null fetch first 5 row only ;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.76 rows=5 width=4) (actual time=0.026..0.029 rows=5 loops=1)
Output: n
Buffers: shared hit=1
-> Seq Scan on public.demo1 (cost=0.00..1529.00 rows=10000 width=4) (actual time=0.022..0.024 rows=5 loops=1)
Output: n
Filter: (demo1.n IS NOT NULL)
Buffers: shared hit=1
Planning time: 1.958 ms
Execution time: 0.057 ms

Following my guess: (1529.00 / 10000) * 5 = 0.7645, which is exactly the cost estimated for the Limit operation. This approximation does not take page granularity into account.

MIN/MAX

The “order by n fetch first 1 row only” finally reads only one index entry, the first one, and returns the indexed value. We can get the same value with a “select min(N)”, and Oracle has a special operation for that:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 29bsqfg69nudp, child number 0
-------------------------------------
select /*+ */ min(n) from demo1
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 | 2 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 2 |
| 2 | INDEX FULL SCAN (MIN/MAX)| DEMO1_N | 1 | 1 | 2 (0)| 1 |00:00:00.01 | 2 |
-------------------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) MIN("N")[22]
2 - "N"[NUMBER,22]

This goes through the index branches (blevel=1 here in this small index, so the root is the first and only branch) to the first leaf, in order to get the value in the first entry. This has read 2 blocks here. The same can be done to get the last index entry in the case of a “select max(N)”.

Postgres does not show a special operation for it, but a plan which is very similar to the one we saw above when fetching the first row: an Index Only Scan with a Limit:


explain (analyze,verbose,costs,buffers) select min(n) from demo1 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.31..0.32 rows=1 width=4) (actual time=0.123..0.124 rows=1 loops=1)
Output: $0
Buffers: shared hit=3
InitPlan 1 (returns $0)
-> Limit (cost=0.29..0.31 rows=1 width=4) (actual time=0.121..0.121 rows=1 loops=1)
Output: demo1.n
Buffers: shared hit=3
-> Index Only Scan using demo1_n on public.demo1 (cost=0.29..295.29 rows=10000 width=4) (actual time=0.119..0.119 rows=1 loops=1)
Output: demo1.n
Index Cond: (demo1.n IS NOT NULL)
Heap Fetches: 0
Buffers: shared hit=3
Planning time: 0.415 ms
Execution time: 0.140 ms

If we look at the ‘Index Only Scan’ we see exactly what I had at the top of this post with “select n from demo1 where n is not null order by n”.

Above it, there’s the Limit, which is exactly the same as the one with the “fetch first 1 row only”, because the query planner understands that getting MIN(N) is the same as getting the first value from the index ordered on N.

This is processed as a non-correlated subquery (query block), also called an InitPlan. Its result ($0) is used by the Result node, with an additional cost of 0.01 for the cpu_tuple_cost of this additional step. I don’t really know the reason for this additional step here, but anyway, the cost is minimal. Basically, both Oracle and Postgres take advantage of the index structure to get the minimum – or first value – from the sorted index entries.

In this series, I’m running very simple queries in order to show how it works. In this post, we reached the minimum: one column and one row. The next post will finally select one additional column, which is not in the index.

 

The article Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX appeared first on Blog dbi services.

More Oracle Security Training Manuals for Sale

I advertised here some months ago a small number of printed manuals that I found in our company storage for some of my Oracle security classes. We had these printed over the years for various classes that I taught and....[Read More]

Posted by Pete On 08/08/17 At 01:57 PM

Words I Don’t Use, Part 4: “Expert”

The fourth “word I do not use” is expert.

When I was a young boy, my dad would sometimes drive me to school. It was 17 miles of country roads and two-lane highways, so it gave us time to talk.

At least once a year, and always on the first day of school, he would tell me, “Son, there are two answers to every test question. There’s the correct answer, and there’s the answer that the teacher expects. ...They’re not always the same.”

He would continue, “And I expect you to know them both.”

He wanted me to make perfect grades, but he expected me to understand my responsibility to know the difference between authority and truth. My dad thus taught me from a young age to be skeptical of experts.

The word expert always warns me of a potentially dangerous type of thinking. The word is used to confer authority upon the person it describes. But it’s ideas that are right or wrong, not people. You should evaluate an idea on its own merit, not on the merits of the person who conveys it. For every expert, there is an equal and opposite expert; but for every fact, there is not necessarily an equal and opposite fact.

A big problem with expert is corruption—when self-congratulators hijack the label to confer authority upon themselves. But of course, misusing the word erodes the word. After too much abuse within a community, expert makes sense only with finger quotes. It becomes a word that critical thinkers use only ironically, to describe people they want to avoid.

New Oracle Security On-Line Training Dates Added

We have finally added new on-line training dates for some of our classes; the very popular two-day class "How to perform a security audit of an Oracle Database" is first, followed by the one-day class "Hardening and Securing Oracle....[Read More]

Posted by Pete On 07/08/17 At 06:30 PM

Upgrading an Amazon EC2 Delphix Target, Part IV

It’s finally time to upgrade our Linux Target! OK, so we’re not going to upgrade the way a DBA would normally upgrade a database server, because we’re working with virtualization.

So far, we’ve completed:

  • 1. Updated our instances so that we’ll have a GUI interface if we need one.
  • 2. Installed Oracle on the Linux Source and upgraded our dSource database to 12c.

 

Now we’re done with our Linux Source and can move on to our Linux Target.

Install and Configure VNC and Oracle

We’ll run through and install and configure the VNC Viewer requirements just like we did in Part I and Part II. We’ll also install Oracle, but this time we perform a software-only installation.


We’ll install the Enterprise Edition and make sure to install it in the same path as on our Linux Source (/u01/app/oracle/product/12.1/db_1). We’re not installing multi-tenant, as we didn’t configure this on our source, either.

Once that is complete, it’s time to get our VDBs upgraded.

The first thing you need to remember is that the VDBs are simply virtual images of our dSource, which is already UPGRADED.

Add the New Oracle Home to the Linux Target

Log into the Delphix Admin Console and click on Environments.

Click on the Linux Target and then click on the refresh button:


Click on the Databases tab and you’ll now see that the DB12c Oracle home is present in the list:


Prep VDBs for the switch to the new home

Copy your environment profile from 11g.env to 12c.env. Update the Oracle home to point to the new 12c home and save the file; a sketch of this step follows.
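
A minimal sketch, assuming the profiles live in the delphix user’s home directory and the old 11g home path is /u01/app/oracle/product/11.2.0.4/db_1 (both are assumptions; adjust to your installation):

cp ~/11g.env ~/12c.env
# point ORACLE_HOME (and anything derived from it) at the new 12c home
sed -i 's|/u01/app/oracle/product/11.2.0.4/db_1|/u01/app/oracle/product/12.1/db_1|' ~/12c.env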

Now I have three VDBs on this target:

[delphix@linuxtarget ~]$ ps -ef | grep pmon

delphix   7501     1  0 Jul12 ?        00:01:17 ora_pmon_devdb
delphix   8301     1  0 Jul06 ?        00:01:49 ora_pmon_VEmp6
delphix  16875     1  0 Jul05 ?        00:01:57 ora_pmon_qadb

Log into the Linux Target and, from the command line, set the environment, log into each database via SQL*Plus and shut it down:

. 11g.env

export ORACLE_SID=VEmp6f
sqlplus / as sysdba
shutdown immediate;
exit;

and so on and so forth….

Presentation – Working with Oracle Database in VB.Net with ODP.Net and Visual API Toolkit; Enhancing Visual Manufacturing 9.0.0

August 7, 2017 A few weeks ago I gave a presentation to a regional group of ERP users, some with an Oracle Database backend and some with a Microsoft SQL Server backend.  The original intention for the presentation was to demonstrate an application package that I created the previous year which maintains and manages production […]