Oakies Blog Aggregator

Display Data Guard configuration in SQL Developer

The latest version of SQL Developer, 17.2, released after Q2 of 2017, has a new item in the DBA view showing the Data Guard configuration. This is the occasion to show how you can cascade the log shipping in Oracle 12c.

A quick note about this new versioning: this is the release for 2017 Q2 and the version number has more digits to encode the exact build time. This version is labeled 17.2.0.188.1159 and we can see when it was built:

SQL> select to_date('17.x.0.188.1159','rr."x.0".ddd.hh24mi') build_time from dual;
 
BUILD_TIME
--------------------
07-JUL-2017 11:59:00

Non-Cascading Standby

Here is my configuration with two standby databases:

DGMGRL> show configuration
Configuration - orcl
 
Protection Mode: MaxPerformance
Members:
orcla - Primary database
orclb - Physical standby database
orclc - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS (status updated 9 seconds ago)

I have only the LogXptMode defined here, without any RedoRoutes:

DGMGRL> show database orcla LogXptMode
LogXptMode = 'SYNC'

With this configuration, the broker has set the following log destinations on orcla, orclb and orclc:

INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLA log_archive_dest_1 location=USE_DB_RECOVERY_FILE_DEST, valid_for=(ALL_LOGFILES, ALL_ROLES)
ORCLA log_archive_dest_2 service="ORCLB", SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300
db_unique_name="orclb" net_timeout=30, valid_for=(online_logfile,all_roles)
ORCLA log_archive_dest_3 service="ORCLC", SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300
db_unique_name="orclc" net_timeout=30, valid_for=(online_logfile,all_roles)
 
INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLB log_archive_dest_1 location=/u01/fast_recovery_area
 
INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLC log_archive_dest_1 location=/u01/fast_recovery_area
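As a side note, those destinations can be queried directly on each database. Here is a sketch of such a query (not necessarily the exact one used to produce the output above); v$parameter and v$instance are standard views, and the regexp filters out the log_archive_dest_state_n parameters:

select i.instance_name, p.name, p.value
from v$parameter p cross join v$instance i
where regexp_like(p.name,'^log_archive_dest_[0-9]+$')
and p.value is not null
order by p.name;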

In the latest SQL Developer you have the graphical representation of it from the DBA view / Dataguard / console:

[Screenshot: SQL Developer DBA view / Data Guard console showing the non-cascading configuration]

Cascading Standby

In 12c we can define a cascading standby: instead of the primary shipping the redo to all standby databases, you can have the primary ship to one standby only, and that one forwards the redo to another one. You define that with the RedoRoutes property:


DGMGRL> edit database orcla set property redoroutes = '(local:orclb) (orclb:orclc async)';
Property "redoroutes" updated
DGMGRL> edit database orclb set property redoroutes = '(orcla:orclc async) (local:orcla)';
Property "redoroutes" updated

The first route defined in each property is applied when orcla is the primary database:

  • on orcla, (local:orclb) means that orcla sends redo to orclb when it is the primary
  • on orclb, (orcla:orclc async) means that orclb forwards the redo received from orcla to orclc when orcla is the primary. LogXptMode is SYNC but is overridden here with ASYNC

The second route defined in each property is applied when orclb is the primary database:

  • on orcla, (orclb:orclc async) means that orcla forwards the redo received from orclb to orclc when orclb is the primary. LogXptMode is SYNC but is overridden here with ASYNC
  • on orclb, (local:orcla) means that orclb sends redo to orcla when it is the primary

With this configuration, and orcla still being the primary, the broker has set the following log destinations on orcla, orclb and orclc:


INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLA log_archive_dest_1 location=USE_DB_RECOVERY_FILE_DEST, valid_for=(ALL_LOGFILES, ALL_ROLES)
ORCLA log_archive_dest_2 service="ORCLB", SYNC AFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300
db_unique_name="orclb" net_timeout=30, valid_for=(online_logfile,all_roles)
 
INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLB log_archive_dest_1 location=/u01/fast_recovery_area
ORCLB log_archive_dest_2 service="ORCLC", ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=3
00 db_unique_name="orclc" net_timeout=30, valid_for=(standby_logfile,all_roles)
 
INSTANCE_NAME NAME VALUE
---------------- -------------------- -------------------------------------------------------------------------------------------------------------
ORCLC log_archive_dest_1 location=/u01/fast_recovery_area

The show configuration from DGMGRL displays them indented to see the cascading redo shipping:

DGMGRL> show configuration
Configuration - orcl
 
Protection Mode: MaxPerformance
Members:
orcla - Primary database
orclb - Physical standby database
orclc - Physical standby database (receiving current redo)
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS (status updated 27 seconds ago)

And SQL Developer Data Guard console shows:
[Screenshot: SQL Developer Data Guard console showing the cascading configuration]

Switchover

Now, the goal of defining several routes is to have all log destinations automatically changed when the database roles change.
I’m doing a switchover:


Connected to "orclb"
Connected as SYSDG.
DGMGRL> switchover to orclb;
Performing switchover NOW, please wait...
New primary database "orclb" is opening...
Operation requires start up of instance "ORCLA" on database "orcla"
Starting instance "ORCLA"...
ORACLE instance started.
Database mounted.
Database opened.
Connected to "orcla"
Switchover succeeded, new primary is "orclb"

Now it is orcla which cascades the orclb redo to orclc:

DGMGRL> show configuration;
Configuration - orcl
 
Protection Mode: MaxPerformance
Members:
orclb - Primary database
orcla - Physical standby database
orclc - Physical standby database (receiving current redo)
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS (status updated 74 seconds ago)

Here is how it is displayed from SQL Developer:

[Screenshot: SQL Developer Data Guard console after the switchover]

We have seen how the configuration is displayed from DGMGRL and graphically from SQL Developer. Of course, you can also query the Data Guard configuration:

SQL> select * from V$DATAGUARD_CONFIG;
 
DB_UNIQUE_NAME PARENT_DBUN DEST_ROLE CURRENT_SCN CON_ID
-------------- ----------- --------- ----------- ------
orcla orclb PHYSICAL STANDBY 3407900 0
orclc orcla PHYSICAL STANDBY 3408303 0
orclb NONE PRIMARY DATABASE 0 0

and the broker configuration:

SQL> select * from V$DG_BROKER_CONFIG;
 
DATABASE CONNECT_IDENTIFIER DATAGUARD_ROLE REDO_SOURCE ENABLED STATUS VERSION CON_ID
-------- ------------------ -------------- ----------- ------- ------ ------- ------
orcla ORCLA PHYSICAL STANDBY -UNKNOWN- TRUE 0 11.0 0
orclb ORCLB PRIMARY -N/A- TRUE 0 11.0 0
orclc ORCLC PHYSICAL STANDBY orcla TRUE 0 11.0 0

This is another reason to use the broker. Once the configuration is set up and tested, you have nothing else to think about when you do a switchover: the log archive destinations are automatically updated depending on the database roles.

 

The article Display Data Guard configuration in SQL Developer appeared first on Blog dbi services.

Postgres vs. Oracle access paths I – Seq Scan

Here is the first test I’ve done for my Postgres vs. Oracle access paths series; the first query does a sequential scan. It illustrates the first constant you find in the documentation for the query planner:
seq_page_cost (floating point)
Sets the planner’s estimate of the cost of a disk page fetch that is part of a series of sequential fetches. The default is 1.0.

Table creation

I start by creating a very simple table with 10000 rows and 3 columns. The first column (n) is indexed:

create table demo1 as select generate_series n , 1 a , lpad('x',1000,'x') x from generate_series(1,10000);
SELECT 10000
create unique index demo1_n on demo1(n);
CREATE INDEX
 
analyze verbose demo1;
INFO: analyzing "public.demo1"
INFO: "demo1": scanned 1429 of 1429 pages, containing 10000 live rows and 0 dead rows; 10000 rows in sample, 10000 estimated total rows
ANALYZE
select relkind,relname,reltuples,relpages from pg_class where relname='demo1';
relkind | relname | reltuples | relpages
---------+---------+-----------+----------
r | demo1 | 10000 | 1429
 
select relkind,relname,reltuples,relpages from pg_class where relname='demo1_n';
relkind | relname | reltuples | relpages
---------+---------+-----------+----------
i | demo1_n | 10000 | 30

I checked the table and index statistics that will be used by the optimizer: 10000 rows, all indexed, 1429 table blocks and 30 index blocks. Note that blocks are called pages, but that’s the same idea: the minimal size read and written to disk. They are also called buffers as they are read into a buffer and cached in the buffer cache.

Here is how I create a similar table in Oracle:

create table demo1 as select rownum n , 1 a , lpad('x',1000,'x') x from xmltable('1 to 10000');
Table created.
create unique index demo1_n on demo1(n);
Index created.
exec dbms_stats.gather_table_stats(user,'demo1');
PL/SQL procedure successfully completed.
 
select table_name,num_rows,blocks from user_tables where table_name='DEMO1';
 
TABLE_NAME NUM_ROWS BLOCKS
---------- ---------- ----------
DEMO1 10000 1461
 
select index_name,num_rows,leaf_blocks,blevel from user_indexes where table_name='DEMO1';
 
INDEX_NAME NUM_ROWS LEAF_BLOCKS BLEVEL
---------- ---------- ----------- ----------
DEMO1_N 10000 20 1

The same rows are stored in 1461 table blocks and the index entries in 20 blocks. Both databases use 8k blocks, but with different storage layouts and different defaults. This is about 7 rows per table block, for rows that are approximately 1k, and about 500 index entries per index block to store the value of column N plus the pointer to the table row (a few bytes, called TID in Postgres and ROWID in Oracle). I’ll not get into the details of those numbers here.
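To double-check the rows-per-block figures on the Postgres side, here is a quick computation from the statistics gathered above (a sketch; reltuples and relpages are the pg_class columns already queried earlier):

select relname, reltuples::int as num_rows, relpages,
       round(reltuples::numeric/relpages,1) as rows_per_page
from pg_class where relname in ('demo1','demo1_n');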

My goal is to detail the execution plans and the execution statistics.

Postgres Seq Scan

I start with a very simple query on my table: SELECT SUM(N) from DEMO1;


explain (analyze,verbose,costs,buffers) select sum(n) from demo1 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1554.00..1554.01 rows=1 width=8) (actual time=4.616..4.616 rows=1 loops=1)
Output: sum(n)
Buffers: shared hit=1429
-> Seq Scan on public.demo1 (cost=0.00..1529.00 rows=10000 width=4) (actual time=0.011..3.614 rows=10000 loops=1)
Output: n, a, x
Buffers: shared hit=1429
Planning time: 0.468 ms
Execution time: 4.661 ms

This query does a sequential scan (Seq Scan), which is the equivalent of Oracle Full Table Scan: read all rows from the table. You might tell me that it would be cheaper to scan the index because the index I’ve created holds all required columns. We will see that in the next post. Here, after having created the table as I did above, the query planner prefers to scan the table.

Here are the maths: my table has 1429 pages and each page access during a sequential scan has cost=1 as defined by:

show seq_page_cost;
seq_page_cost
---------------
1

Here, I see a cost estimated from 0 to 1529 for the Seq Scan operation.
The first number, 0.00 is the initialization cost estimating the work done before returning any rows. A Seq Scan has nothing to do before, and reading the first block can already return rows.
The second number is the cost to return all rows. We have seen that the scan itself costs 1429 but the rows (tuples) must be read and processed. This is evaluated using the following constant:

show cpu_tuple_cost;
cpu_tuple_cost
----------------
0.01

For 10000 rows, the cost to process them is 0.01*10000=100, which is added to the Seq Scan cost of 1429 to get 1529. This explains cost=0.00..1529.00.
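This arithmetic can be re-derived from the catalog and the planner settings. Here is a sketch (current_setting() simply reads the parameters shown above):

select relpages * current_setting('seq_page_cost')::numeric
     + reltuples::numeric * current_setting('cpu_tuple_cost')::numeric
       as seq_scan_total_cost
from pg_class where relname='demo1';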

Then there is a SUM operation applied to 10000 rows and there is a single parameter for the CPU cost of operators and functions:

show cpu_operator_cost;
cpu_operator_cost
-------------------
0.0025

[Screenshot: the Postgres execution plan with the Seq Scan and Aggregate costs]
The sum (Aggregate) operation adds 0.0025*10000=25 to the cost and then the cost is 1554. You can see this cost in the minimal cost for the query, the first number in cost=1554.00..1554.01, which is the cost before retrieving any rows. This makes sense because before retrieving the first row we need to read (Seq Scan) and process (Aggregate) all rows, which is exactly what the cost of 1554 is.

Then there is an additional cost when we retrieve all rows. It is only one row here because it is a sum without group by, and this adds the default cpu_tuple_cost=0.01 to the initial cost: 1554.01
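Adding the Aggregate and the final retrieved row, the same kind of sketch re-derives the full estimate of 1554.01:

select relpages * current_setting('seq_page_cost')::numeric
     + reltuples::numeric * ( current_setting('cpu_tuple_cost')::numeric
                            + current_setting('cpu_operator_cost')::numeric )
     + 1 * current_setting('cpu_tuple_cost')::numeric
       as query_total_cost
from pg_class where relname='demo1';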

In summary, the total cost of the query is cost=1554.00..1554.01 and we have seen that it depends on:
– number of pages in the table
– number of rows from the result of the scan (we have no where clause here)
– number of rows summed and retrieved
– the planner parameters seq_page_cost, cpu_tuple_cost, and cpu_operator_cost

Oracle Full Table Scan

When I run the same query on Oracle, the optimizer chooses an index fast full scan rather than a table full scan because all rows and columns are in the index that I’ve created:

  • all rows, because SUM(N) does not need the rows where N is null (and those are precisely the rows not stored in the index)
  • all columns, because I need nothing other than the values of N

We will see that in the next post. For the moment, in order to compare with Postgres, I force a full table scan with the FULL() hint:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID bhsjquhh6y08q, child number 0
-------------------------------------
select /*+ full(demo1) */ sum(n) from demo1
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 397 (100)| 1 |00:00:00.01 | 1449 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.01 | 1449 |
| 2 | TABLE ACCESS FULL| DEMO1 | 1 | 10000 | 397 (0)| 10000 |00:00:00.01 | 1449 |
---------------------------------------------------------------------------------------------------
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) SUM("N")[22]
2 - (rowset=256) "N"[NUMBER,22]

We have seen that Postgres cost=1 is for sequential scans (similar to what we call multiblock reads in Oracle) and that random reads (single block reads) have cost=4 by default, according to random_page_cost.

The Oracle cost unit is based on single block reads, and this is why the cost here (397) is lower than the number of blocks (1461): different units. Postgres counts cost=1 for sequential reads and a higher cost when a seek is involved. Oracle counts cost=1 for single block reads (including the seek) and a lower cost per block for larger I/O sizes.
[Screenshot: the Oracle execution plan with the Full Table Scan cost]
With the default system statistics, latency is estimated at 10 milliseconds and transfer at 4KB/ms. The single block read time is then estimated at 12 milliseconds (10 + 8192/4096).
Again with the default system statistics, where the optimizer estimates 8 blocks per multiblock read, the multiblock read time is estimated at 26 milliseconds (10 + 8*8192/4096), which is on average 26/8=3.25 milliseconds per block. This means that the per-block cost ratio of multiblock vs. single block reads is very similar for Oracle (3.25/12=0.27083333) and Postgres (seq_page_cost/random_page_cost=1/4=0.25) with default parameters.

Our table is stored in 1461 blocks and the full table scan involves reading all of them plus some segment header blocks. 1461*0.27083333≈396
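That arithmetic can be written as a query. This is only a sketch based on the default noworkload system statistics mentioned above (ioseektim=10 ms, iotfrspeed=4096 bytes/ms, and 8 blocks per multiblock read):

select 10 + 8192/4096 single_block_ms,
       10 + 8*8192/4096 multiblock_ms,
       (10 + 8*8192/4096)/8 per_block_ms,
       round( ((10 + 8*8192/4096)/8) / (10 + 8192/4096) , 8 ) ratio,
       round( 1461 * ((10 + 8*8192/4096)/8) / (10 + 8192/4096) ) full_scan_io_cost
from dual;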

There is also the costing of CPU (the equivalent of cpu_tuple_cost), which is included here, but I’ll not go into the details, which are more complex than in Postgres and depend on your processor frequency. The goal of those posts is about Postgres. For Oracle, all this is explained in the Jonathan Lewis and Chris Antognini books.

But basically, the idea is the same: Postgres Seq Scan and Oracle Full table Scan read the contiguous table blocks sequentially and the cost mainly depends on the size of the table (number of blocks) and the estimated time for sequential I/O (where bandwidth counts more than latency).

Buffers

In my tests, I’ve not only explained the query, but I executed it to get execution statistics. This is done with EXPLAIN ANALYZE in Postgres and DBMS_XPLAN.DISPLAY_CURSOR in Oracle. The statistics include the number of blocks read at each plan operation, with the BUFFERS option in Postgres and with STATISTICS_LEVEL=ALL in Oracle.


explain (analyze,buffers) select sum(n) from demo1 ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Aggregate (cost=1554.00..1554.01 rows=1 width=8) (actual time=3.622..3.622 rows=1 loops=1)
Buffers: shared hit=1429
-> Seq Scan on demo1 (cost=0.00..1529.00 rows=10000 width=4) (actual time=0.008..1.724 rows=10000 loops=1)
Buffers: shared hit=1429
Planning time: 0.468 ms
Execution time: 4.661 ms

‘Buffers’ displays the number of blocks that have been read by the Seq Scan and is exactly the number of pages in my table. ‘shared hit’ means that they come from the buffer cache.

Let’s run the same when the cache is empty:

explain (analyze,verbose,costs,buffers) select sum(n) from demo1 ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1554.00..1554.01 rows=1 width=8) (actual time=13.837..13.837 rows=1 loops=1)
Output: sum(n)
Buffers: shared read=1429
-> Seq Scan on public.demo1 (cost=0.00..1529.00 rows=10000 width=4) (actual time=0.042..12.506 rows=10000 loops=1)
Output: n, a, x
Buffers: shared read=1429
Planning time: 3.754 ms
Execution time: 13.906 ms

The buffers are now ‘shared read’ instead of ‘shared hit’. In Postgres, the number of logical reads, as we know them in Oracle, is the sum of hits and reads. In Oracle, all blocks are counted as logical reads, which includes the smaller set of physical reads.
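Postgres also accumulates those counters per table in pg_statio_user_tables, which is a quick way to see the hit/read split outside of an EXPLAIN. A sketch:

select heap_blks_read, heap_blks_hit
from pg_statio_user_tables
where relname='demo1';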

IO calls

Here is more about the reads when the block is not in the buffer cache. On Linux, we can trace the system calls to see how those sequential I/Os are implemented.

I get the ‘relfilenode':

postgres=# select relname,relnamespace,reltype,relowner,relfilenode,relpages,reltuples from pg_class where relname='demo1';
relname | relnamespace | reltype | relowner | relfilenode | relpages | reltuples
---------+--------------+---------+----------+-------------+----------+-----------
demo1 | 2200 | 42429 | 10 | 42427 | 1429 | 10000
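As a shortcut, the pg_relation_filepath() function returns the path (relative to the data directory) built from this relfilenode, which is handy to match the file name in the strace output below:

select pg_relation_filepath('demo1');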

I get the pid of my session process:

select pg_backend_pid();
-[ RECORD 1 ]--+------
pg_backend_pid | 30732

I can trace system calls:

strace -p 30732

And look at the trace concerning my file (identified with its ‘relfilenode’):

30732 open("base/12924/42427", O_RDWR) = 33
30732 lseek(33, 0, SEEK_END) = 11706368
30732 open("base/12924/42427_vm", O_RDWR) = 43
30732 lseek(33, 0, SEEK_END) = 11706368
30732 lseek(33, 0, SEEK_END) = 11706368
30732 lseek(33, 0, SEEK_SET) = 0
30732 read(33, "\4004\220\3 \4 \360\233\30\10\340\227\30\10"..., 8192) = 8192
30732 read(33, "\4004\220\3 \4 \360\233\30\10\340\227\30\10"..., 8192) = 8192
30732 read(33, "\4004\220\3 \4 \360\233\30\10\340\227\30\10"..., 8192) = 8192
... 1429 read(33) in total

We see two open() calls with the relfilenode of my table in the file name: one for the table itself and one for the visibility map.
The file descriptor for the table file is 33 and I’ve grepped only the related calls.
The lseek(33,0,SEEK_END) goes to the end of the file (11706368 bytes, which is 11706368/8192=1429 pages).
The lseek(33,0,SEEK_SET) goes back to the beginning of the file.
Subsequent read() calls read the whole file, page by page (8192 bytes), in sequential order.

This is how sequential reads are implemented in Postgres: one lseek() and then sequential read() calls. The I/O size is always the same (8k here). The benefit of a sequential scan is not larger I/O calls but simply the absence of seek() calls in between. The optimization is left to the underlying layers: the filesystem and its read-ahead.

This is very different from Oracle. Without going into the details, here are the kinds of system calls you see during a full table scan:

open("/u01/oradata/CDB1A/PDB/users01.dbf", O_RDWR|O_DSYNC) = 9
fcntl(9, F_SETFD, FD_CLOEXEC) = 0
fcntl(9, F_DUPFD, 256) = 258
...
pread(258, "\6\242\2\5\3\276\25%\2\4\24\270\1\313!\1x\25%"..., 1032192, 10502144) = 1032192
pread(258, "\6\242\202\5\3\300\25%\2\4\16\247\1\313!\1x\25%"..., 1032192, 11550720) = 1032192
pread(258, "\6\242\2\6\3\302\25%\2\4x\226\1\313!\1x\25%"..., 417792, 12599296) = 417792

Those are also sequential reads of contiguous blocks, but done with a larger I/O size (126 blocks here). So in addition to the absence of seek() calls, it is optimized to do fewer I/O calls, not relying on the underlying optimization at the OS level.

Oracle can also trace the system calls with wait events, which gives more information about the database calls:

WAIT #140315986764280: nam='db file scattered read' ela= 584 file#=12 block#=1282 blocks=126 obj#=74187 tim=91786554974
WAIT #140315986764280: nam='db file scattered read' ela= 485 file#=12 block#=1410 blocks=126 obj#=74187 tim=91786555877
WAIT #140315986764280: nam='db file scattered read' ela= 181 file#=12 block#=1538 blocks=51 obj#=74187 tim=91786556380

The name ‘scattered’ is misleading. ‘db file scattered read’ waits are actually multiblock reads: reading more than one block in one I/O call. Oracle does not rely on the Operating System read-ahead, and this is why we can (and should) use direct I/O and asynchronous I/O if the database buffer cache is correctly sized.

Output and Projection

I’ve run the EXPLAIN with the VERBOSE option which shows the ‘Output’ for each operation, and I’ve done the equivalent in Oracle by adding the ‘+projection’ format in DBMS_XPLAN.

In the Oracle execution plan, we see the columns remaining in the result of each operation, after the projection:

Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=0) SUM("N")[22]
2 - (rowset=256) "N"[NUMBER,22]

Operation 2, the Full Table Scan, reads all rows with all columns, but projects only the one we need: N.

In the Postgres equivalent, it seems that the Output mentions the columns available before the projection because we see all table columns here:

explain verbose select sum(n) from demo1 ;
QUERY PLAN
-------------------------------------------------------------------------
Aggregate (cost=1554.00..1554.01 rows=1 width=8)
Output: sum(n)
-> Seq Scan on public.demo1 (cost=0.00..1529.00 rows=10000 width=4)
Output: n, a, x

I prefer to see the columns after the projection and I use it a lot in Oracle to know which columns are needed from the table. A great optimization can be done when we have a covering index where all selected columns are present so that we don’t have to go to the table. But we will see that in the next post about Index Only Scan.

 

The article Postgres vs. Oracle access paths I – Seq Scan appeared first on Blog dbi services.

Postgres unique constraint

I’ll start a series on Postgres vs. Oracle access paths because I know Oracle and I’m learning Postgres. While preparing it, I came upon some surprises because I’m so used to Oracle that I take some behavior for granted in any SQL database. I recently posted a tweet about one of them, comparing the latest Postgres version to the earliest Oracle version I have on my laptop.
The goal of the tweet was exactly what I said above: show my surprise, using Oracle 7 as a reference because this is the version where I started to learn SQL. And there’s no judgment behind this surprise: I can’t compare software I’ve used for more than 20 years with software I’m just learning. I have a big admiration for the Oracle design and architecture choices, but I also have a big admiration for what the Postgres community is doing.

In my tweet I updated a primary key. I don’t think I’ve ever designed, in real life, a primary key that has to be updated later. For each table we need an immutable key to identify rows for referential integrity constraints or for replication. The value must be known from the first insert (which means the columns are declared not null) and is never updated. It makes sense to use a primary key for that as it is unique and not null.

Actually, a better case would be a simple unique constraint where we just exchange two rows. A real-life example is a list of items, having probably a surrogate key as the primary key, and a unique key including an item number. When the user wants to move up one item, we just run an update on two rows, exchanging their numbers. The unique constraint just ensures that we have only distinct values so that a select … order by will always return the values in the same order.

All similar cases have the same consequence: when the update is processed row by row, the uniqueness may be violated temporarily. But at the end of the statement, the constraint is still valid.

Here is the initial example with updating all rows:


create table demo as select generate_series n from generate_series(1,2);
SELECT 2
alter table demo add constraint demo_pk primary key(n);
ALTER TABLE
select * from demo;
n
---
1
2
(2 rows)
 
begin transaction;
BEGIN
update demo set n=n-1;
UPDATE 2
select * from demo;
n
---
0
1
(2 rows)

This works. I’ve inserted the rows in ascending order of n. Decreasing the value doesn’t violate the uniqueness at any time because it reads rows from the beginning to the end.

However, when we increase the value, we have a duplicate value until we process the next row. And by default, Postgres fails:

update demo set n=n+1;
ERROR: duplicate key value violates unique constraint "demo_pk"
DETAIL: Key (n)=(1) already exists.

The bad thing is that the behavior of the application depends on the physical order of the rows and the order in which they are processed. This violates the Codd rule about physical independence. In addition, a SQL statement should behave as set-based processing rather than low-level row-by-row processing.

But there is also a very good thing: because the constraint is validated row by row, you know which value violates the constraint (here: “DETAIL: Key (n)=(1) already exists” ).

So my statement failed, and in Postgres this seems to fail the whole transaction:

commit;
ROLLBACK

My second surprise is that the failure of one statement cancels the whole transaction. I see no error at commit, but it simply tells me that it has done a rollback instead of the commit.

deferrable

So, I compared with Oracle, where this statement is always successful, because temporary violations that are resolved later, within the same statement, do not violate the constraint. I compared it with the oldest version I have on my laptop (Oracle 7.3) to show that it is not something that ever arrived as a new feature, because I started with Oracle 7. And this kind of thing is the reason why I like SQL. Doing the same with a procedural language requires an intermediate update to be sure that there is no duplicate at any time.
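For reference, here is a minimal sketch of that Oracle behaviour (not the script from the tweet, and the exact output depends on the version):

create table demo (n number primary key);
insert into demo values (1);
insert into demo values (2);
-- uniqueness is checked at the end of the statement, so this succeeds
update demo set n = n + 1;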

The Postgres community is very responsive, especially when we may think that something works better in Oracle than Postgres (which was not the case here and which was not the goal of my tweet anyway – but tweets are short and may not express the tone properly).

Quickly, a solution was proposed: deferred constraints (an example is in this blog post).

I know deferred constraints in Oracle. They are similar in Postgres and here is the solution proposed:


alter table demo drop constraint demo_pk;
ALTER TABLE
alter table demo add constraint demo_pk primary key(n) deferrable initially deferred;
ALTER TABLE
begin transaction;
BEGIN
update demo set n=n-1;
UPDATE 2
select * from demo;
n
---
0
1
(2 rows)
 
update demo set n=n+1;
UPDATE 2

That seems good. Because the constraint validation is deferred, the update is successful.

However, this is not what I want. I want the previous statement to succeed, but I want the following statement to fail:

insert into demo values(1);
INSERT 0 1

Because the constraint is deferred, this statement is successful and it is only at commit that it fails:

commit;
ERROR: duplicate key value violates unique constraint "demo_pk"
DETAIL: Key (n)=(1) already exists.

Why do I think this is not the right solution? First, because I want the statement to fail as soon as possible. And in addition to that, I want the commit to be fast. Doing expensive things at commit should be avoided, if possible. It is the point where all work is supposed to be done and you just want to save it (make it durable and visible to others).

deferrable initially immediate

Actually, the solution is to declare the constraint as deferrable, but not deferred.

alter table demo drop constraint demo_pk;
ALTER TABLE
alter table demo add constraint demo_pk primary key(n) deferrable initially immediate;
ALTER TABLE

This says that it is deferrable, but not deferred (unless you decide to set the constraint deferred for your transaction). That way, it accepts temporary constraint violations as long as they are resolved at the end of the statement.
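As a side note, switching to deferred checking for one transaction is done with SET CONSTRAINTS. A sketch (it works only because the constraint was declared deferrable):

begin transaction;
set constraints demo_pk deferred;
update demo set n=n+1; -- now checked at commit for this transaction only
commit;

In the rest of this post I keep the default immediate checking.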

Now, my update statement is successful:

begin transaction;
BEGIN
update demo set n=n-1;
UPDATE 2
select * from demo;
n
---
0
1
(2 rows)
update demo set n=n+1;
UPDATE 2

Any other statement that violates the constraint fails immediately:

insert into demo values(1);
ERROR: duplicate key value violates unique constraint "demo_pk"
DETAIL: Key (n)=(1) already exists.
commit;
ROLLBACK

Documentation

The nice thing is that this is documented! I didn’t find it immediately because it is in the ‘Compatibility’ part of the ‘create table’ documentation. I’m not yet used to the Postgres documentation. I stopped at the ‘DEFERRED’ definition, which mentions: "A constraint that is not deferrable will be checked immediately after every command."

But later, the Compatibility section adds something more specific about unique constraints:

Non-deferred Uniqueness Constraints
When a UNIQUE or PRIMARY KEY constraint is not deferrable, PostgreSQL checks for uniqueness immediately whenever a row is inserted or modified. The SQL standard says that uniqueness should be enforced only at the end of the statement; this makes a difference when, for example, a single command updates multiple key values. To obtain standard-compliant behavior, declare the constraint as DEFERRABLE but not deferred (i.e., INITIALLY IMMEDIATE). Be aware that this can be significantly slower than immediate uniqueness checking.

That’s another good point. Postgres documentation is clear and gives the right solution. We just have to read it to the end.

A side note for my French-speaking readers: the Postgres documentation has been translated into French by Guillaume Lelarge, who also translated Markus Winand’s book and website. The translation is as good as the original in both cases.

Performance

The documentation mentions ‘significantly slower’. Here is a test on 100000 rows with a non-deferrable constraint:

create table demo as select generate_series n from generate_series(1,100000);
SELECT 100000
alter table demo add constraint demo_pk primary key(n);
ALTER TABLE
vacuum demo;
VACUUM
select * from pgstatindex('demo_pk');
version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
2 | 1 | 2260992 | 3 | 1 | 274 | 0 | 0 | 89.83 | 0
(1 row)

Here is the update n=n-1 where all rows are updated but none violates the constraint at any time:

explain (analyze,verbose,costs,buffers)update demo set n=n-1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Update on public.demo (cost=0.00..1693.00 rows=100000 width=10) (actual time=425.699..425.699 rows=0 loops=1)
Buffers: shared hit=578646 read=1202 dirtied=1267
-> Seq Scan on public.demo (cost=0.00..1693.00 rows=100000 width=10) (actual time=0.013..16.186 rows=100000 loops=1)
Output: (n - 1), ctid
Buffers: shared hit=443

This update has read 578646+1202=579848 buffers.

Now creating the deferrable constraint:

alter table demo drop constraint demo_pk;
ALTER TABLE
alter table demo add constraint demo_pk primary key(n) deferrable initially immediate;
ALTER TABLE
vacuum demo;
VACUUM
select * from pgstatindex('demo_pk');
version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
2 | 1 | 2260992 | 3 | 1 | 274 | 0 | 0 | 89.83 | 0

And do the n=n+1 update:

explain (analyze,verbose,costs,buffers)update demo set n=n+1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Update on public.demo (cost=0.00..2135.00 rows=100000 width=10) (actual time=481.868..481.868 rows=0 loops=1)
Buffers: shared hit=679405 read=760 dirtied=825
-> Seq Scan on public.demo (cost=0.00..2135.00 rows=100000 width=10) (actual time=0.268..16.329 rows=100000 loops=1)
Output: (n + 1), ctid
Buffers: shared hit=885
Planning time: 0.237 ms
Trigger PK_ConstraintTrigger_75314 for constraint demo_pk: time=174.976 calls=99999
Execution time: 663.799 ms

This reads more buffers, and we can see that an internal trigger (PK_ConstraintTrigger_75314) has been run to re-check the unique constraint at the end of the statement. But it is only about 17% more here, for this special case where all rows are updated.

However, a more realistic test case exchanging only two values is much cheaper:


explain (analyze,verbose,costs,buffers) update demo set n=case when n=2 then 2000 when n=2000 then 2 end where n in (2,2000);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Update on public.demo (cost=8.85..16.60 rows=2 width=10) (actual time=0.079..0.079 rows=0 loops=1)
Buffers: shared hit=23
-> Bitmap Heap Scan on public.demo (cost=8.85..16.60 rows=2 width=10) (actual time=0.016..0.055 rows=2 loops=1)
Output: CASE WHEN (n = 2) THEN 2000 WHEN (n = 2000) THEN 2 ELSE NULL::integer END, ctid
Recheck Cond: (demo.n = ANY ('{2,2000}'::integer[]))
Heap Blocks: exact=3
Buffers: shared hit=9
-> Bitmap Index Scan on demo_pk (cost=0.00..8.85 rows=2 width=0) (actual time=0.009..0.009 rows=4 loops=1)
Index Cond: (demo.n = ANY ('{2,2000}'::integer[]))
Buffers: shared hit=6
Planning time: 0.137 ms
Trigger PK_ConstraintTrigger_75322 for constraint demo_pk: time=0.005 calls=1
Execution time: 0.120 ms

In my opinion, the overhead here is totally acceptable, especially given the fact that this re-check displays exactly which value violates the constraint in case there is a duplicate.

But I’m going too fast here. I’ve not even started my blog series about access paths, where I’ll explain the cost of the execution plans, starting with the simplest: Seq Scan. Follow my blog or twitter to stay informed. There will be nothing about ‘which is better, Oracle or Postgres?’, but I’m convinced that knowing the differences helps to understand how each works, and to design an application that has the correct behavior if ported from one to the other.

 

The article Postgres unique constraint appeared first on Blog dbi services.

Postgres vs. Oracle access paths – intro

This is the start of a series on PostgreSQL execution plans, access paths, join methods, hints and execution statistics. The approach will compare Postgres and Oracle. It is not a comparison to see which one is better, but rather to see what is similar and where the approaches diverge. I have a long experience of reading Oracle execution plans and no experience at all on Postgres. This is my way to learn and share what I learn. You will probably be interested if you are in the same situation: an Oracle DBA wanting to learn about Postgres. But you may also be an experienced Postgres DBA who wants to see a different point of view from a different ‘culture’.

I’ll probably use the Oracle terms more often as I’m more familiar with them: blocks for pages, optimizer for query planner, rows for tuples, tables for relations…

Please don’t hesitate to comment on the blog posts or through twitter (@FranckPachot) if you find some mistakes in my Postgres interpretation. I tend to verify any assumption in the same way I do it with Oracle: the documented behavior and the test result should match. My tests should be fully reproducible (using Postgres 9.6.2 here with all defaults). But as I said above, I don’t have the same experience as I have with Oracle when interpreting execution statistics.

Postgres

I’m using the latest versions here: Postgres 9.6.2 (the one I installed here).
I’ve installed pg_hint_plan to be able to control the execution plan with hints. This is mandatory when doing some research. In order to understand an optimizer (query planner) choice, we need to see the estimated cost for different possibilities. Most of my tests will be done with: EXPLAIN (ANALYZE,VERBOSE,COSTS,BUFFERS)

fpa=# explain (analyze,verbose,costs,buffers) select 1;
 
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
Output: 1
Planning time: 0.060 ms
Execution time: 0.036 ms
(4 rows)
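As an example of what pg_hint_plan allows, here is a sketch of a hinted query (SeqScan is one of the hint names from the pg_hint_plan documentation, demo1 is the table used in the Seq Scan post of this series, and the hint comment placement follows the documented form):

load 'pg_hint_plan';
/*+ SeqScan(demo1) */
explain select sum(n) from demo1;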

I may go further with unix tools (like strace to see the system calls).

Oracle

I’m using Oracle 12.2 here and the tests are done by running the statement after setting ALTER SESSION SET STATISTICS_LEVEL=ALL and displaying the execution plan with DBMS_XPLAN:
select * from dbms_xplan.display_cursor(format=>'+cost allstats last -plan_hash +projection');
Note that if you are in lower Oracle versions, you need to call dbms_xplan through the table() function:
select * from table(dbms_xplan.display_cursor(format=>'+cost allstats last -plan_hash +projection'));
Example:

SQL> set arraysize 5000 linesize 150 trimspool on pagesize 1000 feedback off termout off
SQL> alter session set statistics_level=all;
SQL> select 1 from dual;
SQL> set termout on
SQL> select * from dbms_xplan.display_cursor(format=>'+cost allstats last -plan_hash +projection');
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 520mkxqpf15q8, child number 0
-------------------------------------
select 1 from dual
--------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (100)| 1 |00:00:00.01 |
| 1 | FAST DUAL | | 1 | 1 | 2 (0)| 1 |00:00:00.01 |
--------------------------------------------------------------------------------------

I’ll probably never compare the execution time, as this depends on the system and makes no sense on artificial small examples. But I’ll try to compare all other statistics: estimated cost, the actual number of pages/blocks read, etc.

Table of content

I’ll update (or rather insert /*+ append */) the links to the series posts as soon as they are published.

  1. Postgres vs. Oracle access paths I – Seq Scan
  2. Postgres vs. Oracle access paths II – Index Only Scan
  3. Postgres vs. Oracle access paths III – Partial Index
  4. Postgres vs. Oracle access paths IV – Order By and Index
  5. Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX
 

The article Postgres vs. Oracle access paths – intro appeared first on Blog dbi services.

Interval partitioning just got better

Interval partitioning was a great feature when it arrived in version 11, because we no longer had to worry so much about ensuring partitions were available for new data when it arrived.  Partitions would just be created on the fly as required.  I’m not going to talk about interval partitioning in detail because there’s plenty of good content already out there.  But one key element for interval partitioning is that the intervals have to start from somewhere, which is why you always have to define a table with at least one partition.

 


 

So what if I want to drop the partition that is the conceptual “starting point”? Well… I get problems.


SQL> create table sales
  2  ( tstamp       timestamp    not null,
  3    empno        number(10)   not null,
  4    ename        varchar2(10) not null,
  5    deptno       varchar2(10) not null
  6  )
  7  partition by range (tstamp)
  8  interval( numtoyminterval(1,'YEAR'))
  9  (
 10    partition p00 values less than (timestamp '2010-01-01 00:00:00')
 11  );

Table created.

SQL>
SQL> insert into sales values ( timestamp '2011-01-01 00:00:00' , 0,0,0);

1 row created.

SQL> insert into sales values ( timestamp '2012-01-01 00:00:00' , 0,0,0);

1 row created.

SQL> insert into sales values ( timestamp '2013-01-01 00:00:00' , 0,0,0);

1 row created.

SQL>
SQL> col high_value format a36
SQL> col pname format a10
SQL>
SQL> select
  2    partition_name pname,
  3    partition_position pos,
  4    high_value,
  5    interval
  6  from   user_tab_partitions
  7  where  table_name = 'SALES';

PNAME             POS HIGH_VALUE                           INT
---------- ---------- ------------------------------------ ---
P00                 1 TIMESTAMP' 2010-01-01 00:00:00'      NO
SYS_P121            2 TIMESTAMP' 2012-01-01 00:00:00'      YES
SYS_P122            3 TIMESTAMP' 2013-01-01 00:00:00'      YES
SYS_P123            4 TIMESTAMP' 2014-01-01 00:00:00'      YES

4 rows selected.

SQL>
SQL> alter table sales drop partition p00;
alter table sales drop partition p00
                                 *
ERROR at line 1:
ORA-14758: Last partition in the range section cannot be dropped

There is a simple workaround for that issue. If you re-issue the INTERVAL definition for the table, all of the existing partitions will be “upgraded” (or should it be downgraded… I dunno) to range partitions. Hence we will now have “moved” the starting point, and can then drop the problem partition.


SQL> alter table sales set interval( numtoyminterval(1,'YEAR'));

Table altered.

SQL> select
  2    partition_name pname,
  3    partition_position pos,
  4    high_value,
  5    interval
  6  from   user_tab_partitions
  7  where  table_name = 'SALES';

PNAME             POS HIGH_VALUE                           INT
---------- ---------- ------------------------------------ ---
P00                 1 TIMESTAMP' 2010-01-01 00:00:00'      NO
SYS_P121            2 TIMESTAMP' 2012-01-01 00:00:00'      NO
SYS_P122            3 TIMESTAMP' 2013-01-01 00:00:00'      NO
SYS_P123            4 TIMESTAMP' 2014-01-01 00:00:00'      NO

4 rows selected.

SQL> alter table sales drop partition p00;

Table altered.

So that’s all pretty easy, but of course, you must now run this “re-interval” command all the time to be sure that you will always be able to drop any partition you want.

Unless, of course… you’re on 12.2! Let’s repeat the demo on 12.2:


SQL> create table sales
  2  ( tstamp       timestamp    not null,
  3    empno        number(10)   not null,
  4    ename        varchar2(10) not null,
  5    deptno       varchar2(10) not null
  6  )
  7  partition by range (tstamp)
  8  interval( numtoyminterval(1,'YEAR'))
  9  (
 10    partition p00 values less than (timestamp '2010-01-01 00:00:00')
 11  );

Table created.

SQL>
SQL> insert into sales values ( timestamp '2011-01-01 00:00:00' , 0,0,0);

1 row created.

SQL> insert into sales values ( timestamp '2012-01-01 00:00:00' , 0,0,0);

1 row created.

SQL> insert into sales values ( timestamp '2013-01-01 00:00:00' , 0,0,0);

1 row created.

SQL>
SQL> col high_value format a36
SQL> col pname format a10
SQL>
SQL> select
  2    partition_name pname,
  3    partition_position pos,
  4    high_value,
  5    interval
  6  from   user_tab_partitions
  7  where  table_name = 'SALES';

PNAME             POS HIGH_VALUE                           INT
---------- ---------- ------------------------------------ ---
P00                 1 TIMESTAMP' 2010-01-01 00:00:00'      NO
SYS_P3415           2 TIMESTAMP' 2012-01-01 00:00:00'      YES
SYS_P3416           3 TIMESTAMP' 2013-01-01 00:00:00'      YES
SYS_P3417           4 TIMESTAMP' 2014-01-01 00:00:00'      YES

4 rows selected.

SQL>
SQL> alter table sales drop partition p00;

Table altered.

SQL>
SQL> select
  2    partition_name pname,
  3    partition_position pos,
  4    high_value,
  5    interval
  6  from   user_tab_partitions
  7  where  table_name = 'SALES';

PNAME             POS HIGH_VALUE                           INT
---------- ---------- ------------------------------------ ---
SYS_P3415           1 TIMESTAMP' 2012-01-01 00:00:00'      NO
SYS_P3416           2 TIMESTAMP' 2013-01-01 00:00:00'      YES
SYS_P3417           3 TIMESTAMP' 2014-01-01 00:00:00'      YES

3 rows selected.

How cool is that! Oracle now automatically converts one of the interval partitions to a range partition, so you won’t get the error.

The little features are often the coolest.

PostgreSQL on Cygwin

I run my laptop with Windows 10 for office programs, and VirtualBox machines with Linux for the big stuff (Oracle databases). I also have Cygwin installed on Windows for GNU programs. I wanted to quickly install PostgreSQL and, rather than installing it in a Linux VM or as a Windows program, I installed the Cygwin version of it. Here is how.

Cygwin

Cygwin is easy to install: just run the setup-x86_64.exe from https://www.cygwin.com/ and choose the packages you want to install. Here is what is related to PostgreSQL:
[Screenshot: Cygwin setup — PostgreSQL-related packages selected]

Note that if you want to install postgres extensions you may need pg_config, and for that you need to install libpq-devel in addition to postgresql-devel, as well as gcc and make. Those are not displayed in the screenshot above, but if you don’t have them you may get something like the following when installing an extension:
pg_config: Command not found

Of course, PostgreSQL is Open Source and you can also compile it yourself.

Cygserver

Cygwin can run daemons through a Windows service (Cygserver) and you need to set it up if not already done. For this step, you will need to run the Cygwin Terminal as Administrator.
fpa@dell-fpa ~
$ /usr/bin/cygserver-config
Overwrite existing /etc/cygserver.conf file? (yes/no) yes
Generating /etc/cygserver.conf file
 
Warning: The following function requires administrator privileges!
 
Do you want to install cygserver as service?
(Say "no" if it's already installed as service) (yes/no) yes
 
The service has been installed under LocalSystem account.
To start it, call `net start cygserver' or `cygrunsrv -S cygserver'.
 
Further configuration options are available by editing the configuration
file /etc/cygserver.conf. Please read the inline information in that
file carefully. The best option for the start is to just leave it alone.
 
Basic Cygserver configuration finished. Have fun!

You start this service as any Windows service:

fpa@dell-fpa ~
$ net start cygserver
The CYGWIN cygserver service is starting.
The CYGWIN cygserver service was started successfully.

You can check that the service is running:

fpa@dell-fpa ~
$ cygstart services.msc

[Screenshot: Windows services console showing the CYGWIN cygserver service running]

PostgreSQL database cluster

Here is the creation of the PostgreSQL database cluster.
fpa@dell-fpa ~
$ /usr/sbin/initdb -D /usr/share/postgresql/data
The files belonging to this database system will be owned by user "fpa".
This user must also own the server process.
 
The database cluster will be initialized with locale "C".
The default database encoding has accordingly been set to "SQL_ASCII".
The default text search configuration will be set to "english".
 
Data page checksums are disabled.
 
creating directory /usr/share/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 30
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
 
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
 
Success. You can now start the database server using:
 
/usr/sbin/pg_ctl -D /usr/share/postgresql/data -l log.txt start

Start PostgreSQL database server

I add my network to /usr/share/postgresql/data/pg_hba.conf:

# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 192.168.78.0/24 trust

I define the interface and port the server listens on in /usr/share/postgresql/data/postgresql.conf:

listen_addresses = 'localhost,192.168.78.1' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
port = 5432 # (change requires restart)
max_connections = 30 # (change requires restart)

Now ready to start the PostgreSQL server:
fpa@dell-fpa ~
$ /usr/sbin/pg_ctl -D /usr/share/postgresql/data -l log.txt start
server starting
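Once the server is up, the effective settings can be verified from a psql session. A quick sketch:

select name, setting from pg_settings
where name in ('listen_addresses','port','max_connections');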

Username

My Windows username is ‘FPA’, and so is the Cygwin user that started the database server. I check that I can connect to the maintenance database with this user:

fpa@dell-fpa ~
$ psql -U fpa postgres
psql (9.6.2)
Type "help" for help.
 
postgres=# \du
 
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+-----------
fpa | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 
postgres=# \q

PgAdmin

As I am on Windows, I install the graphical console PgAdmin and set up the connection to this database:
[Screenshot: pgAdmin connection setup]

SQL Developer

As an Oracle fan, I prefer to connect with SQL Developer. Just download the JDBC driver for PostgreSQL: https://jdbc.postgresql.org/download.html

In SQL Developer you can declare this .jar from Tools -> Preferences -> Third Party JDBC Drivers

[Screenshot: SQL Developer — Third Party JDBC Drivers preference with the PostgreSQL .jar declared]

And create the connection with the new ‘PostgreSQL’ tab:

[Screenshot: SQL Developer — new connection with the PostgreSQL tab]
Then, with ‘Choose Database’, you can fill the drop-down list and choose the database you want to connect to.

As I have no database with the same name as the username, I have to append the database name to the hostname, suffixed with ‘?’, to get the proper JDBC URL. What you put in the drop-down will then be ignored. I don’t really know the reason, but this is how I got the correct URL.

[Screenshot: SQL Developer connection with the database name appended to the hostname]

Extensions

You can install extensions. For example, I’ve installed pg_hint_plan to be able to hint the access path and join methods.

wget https://osdn.net/dl/pghintplan/pg_hint_plan96-1.2.1.tar.gz
tar -zxvf pg_hint_plan96-1.2.1.tar.gz
cd pg_hint_plan96-1.2.1
make
make install

And I’m now able to load it:

$ psql
psql (9.6.2)
Type "help" for help.
 
fpa=# load 'pg_hint_plan';
LOAD

But Why?

You may wonder why I don’t install it directly on Linux. My laptop is on Windows and, of course, I have a lot of VirtualBox VMs, but this way doesn’t require starting a VM.
You may wonder why I don’t install the Windows version. I want to investigate the Linux behaviour, and I may want to trace the postgres processes. For example, Cygwin has a strace.exe which shows output similar to strace on Linux. Here are the I/O calls from a full table scan (Seq Scan):
[Screenshot: strace output of a Seq Scan on Cygwin]
I can see that postgres sequential reads are done through one lseek() and sequential 8k read().

This was simple. Just get the pid of the session process:

fpa=# select pg_backend_pid();
pg_backend_pid
----------------
11960

and strace it:

$ strace -p 11960

I’ve done that in about one hour: download, install, set up and write this blog post. Without any virtual machine, you can have a Linux Postgres database server running on Windows.

 

The article PostgreSQL on Cygwin appeared first on Blog dbi services.

Uruguay User Group Session: Secure Your Database in 1 Day

Many thanks to those to whom I had the pleasure and honor of presenting, albeit virtually, on how to secure 60% of an Oracle Database infrastructure in just under a day. I hope you found it useful.

You can download the presentation here and the scripts here.

Upgrading an Amazon EC2 Delphix Source, Part III

This is Part III in a four-part series on how to:

  1.  Enable VNC Viewer access on Amazon EC2 hosts.
  2.  Install DB12c and upgrade a Dsource for Delphix from 11g to 12c (12.1)
  3.  Update the Delphix Configuration to point to the newly upgraded 12c database and the new Oracle 12c home.
  4.  Install DB12c and upgrade target VDBs for Delphix residing on AWS to 12.1 from the newly upgraded source.

In Part II, we finished upgrading the Dsource database, but now we need to get it configured on the Delphix side.

Log into the Delphix Admin console to make the changes required to recognize the Dsource is now DB12c and has a new Oracle home.

Log into the Delphix console as the Delphix_Admin user and go to the Manage –> Environments.

Click on the Refresh button and let the system recognize the new Oracle Home for DB12c:

[Screenshot: Delphix Environments screen — refresh detecting the new Oracle home]

Once complete, you should see the 12.1 installation we performed on the Linux Source now listed in the Environments list.

Click on Manage –> Datasets and find the Dsource 11g database and click on it.

Click on the Configuration tab and then on the Upgrade icon (a small up arrow in the upper right).

[Screenshot: the Dsource Configuration tab with the Upgrade icon]

Update to the new Oracle Home that will now be listed in the dropdown and scroll down to save.

Now click on the camera icon to take a snap sync to ensure everything is functioning properly.  This should only take a minute to complete.

The DSource is now updated in the Delphix Admin console and we can turn our attention to the Linux target and our VDBs that source from this host.  In Part IV we’ll dig into the other half of the source/target configuration and how I upgraded Delphix environments with a few surprises!

 



Copyright © DBA Kevlar [Upgrading an Amazon EC2 Delphix Source, Part III], All Right Reserved. 2017.

The post Upgrading an Amazon EC2 Delphix Source, Part III appeared first on DBA Kevlar.

Installation overview of node_exporter, prometheus and grafana

Prometheus is an open source systems monitoring and alerting toolkit originally built at SoundCloud. This blog post shows how to install the needed components to visualise Linux system statistics via Grafana.

The setup consists of 3 components:
node_exporter, an exporter of system and hardware metrics.
prometheus, a metric collection and persistence layer.
grafana, the visualisation layer.

1. Preparation
The needed components are installed in the home directory of the user ‘prometheus’. In order for that user to exist, it must obviously first be created:

# useradd prometheus
# su - prometheus
$

This installation guide uses Oracle Linux 7.3, but should work for RHEL or Centos too.

2. Node exporter
The next thing to do is to install the node exporter. Please mind that new versions do come out, so you might want to check for the latest release on the node_exporter releases page on GitHub:

$ curl -LO "https://github.com/prometheus/node_exporter/releases/download/v0.14.0/node_exporter-0.14.0.linux-amd64.tar.gz"
$ mkdir -p Prometheus/node_exporter
$ cd $_
$ tar xzf ../../node_exporter-0.14.0.linux-amd64.tar.gz

Now become root and create a unit file to automatically startup the node exporter using systemd:

# echo "[Unit]
Description=Node Exporter

[Service]
User=prometheus
ExecStart=/home/prometheus/Prometheus/node_exporter/node_exporter-0.14.0.linux-amd64/node_exporter

[Install]
WantedBy=default.target" > /etc/systemd/system/node_exporter.service

And make systemd start the node exporter:

# systemctl daemon-reload
# systemctl enable node_exporter.service
# systemctl start node_exporter.service

Next you can verify that the node exporter is running by using ‘systemctl status node_exporter.service’:

# systemctl status node_exporter.service
● node_exporter.service - Node Exporter
   Loaded: loaded (/etc/systemd/system/node_exporter.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:20:54 UTC; 7s ago
 Main PID: 3017 (node_exporter)
   CGroup: /system.slice/node_exporter.service
           └─3017 /home/prometheus/Prometheus/node_exporter/node_exporter-0.14.0.linux-amd64/node_exporter

Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - hwmon" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - infiniband" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - textfile" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - conntrack" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - diskstats" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - entropy" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - loadavg" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - sockstat" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg=" - wifi" source="node_exporter.go:162"
Jul 31 15:20:54 test.local node_exporter[3017]: time="2017-07-31T15:20:54Z" level=info msg="Listening on :9100" source="node_exporter.go:186"

Additionally, you can go to hostname:9100 and check that the page says ‘Node Exporter’ and has a link called ‘Metrics’, which lists all the metrics.
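If there is no browser available on the host, the same check can be done from the shell. A quick sketch, assuming curl is installed and the node exporter is listening on its default port 9100 (as the ‘Listening on :9100’ log line above shows):

$ curl -s http://localhost:9100/metrics | grep "^node_load"

This should print the load average metrics gathered by the loadavg collector listed in the startup log.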

3. Prometheus
After we installed node_exporter to provide measurements, we must install the software that fetches that information and stores it. That is what prometheus does. First, become the prometheus user again and install prometheus. Here too it is important to realise that newer versions will have come out after this article was written:

# su - prometheus
$ curl -LO "https://github.com/prometheus/prometheus/releases/download/v1.7.1/prometheus-1.7.1.linux-amd64.tar.gz"
$ cd Prometheus
$ tar xzf ../prometheus-1.7.1.linux-amd64.tar.gz
$ cd prometheus-1.7.1.linux-amd64
$ echo "scrape_configs:

  - job_name: 'prometheus'
    scrape_interval: 1s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter'
    scrape_interval: 1s
    static_configs:
      - targets: ['localhost:9100']"> prometheus.yml

This downloaded and unpacked prometheus, and created a prometheus scrape configuration that fetches data from prometheus itself and from the node exporter. Now become root, and install the systemd unit file for prometheus:

# echo "[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=prometheus
Restart=on-failure
ExecStart=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus -config.file=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus.yml -storage.local.path=/home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/data

[Install]
WantedBy=multi-user.target" > /etc/systemd/system/prometheus.service

And make systemd start prometheus:

# systemctl daemon-reload
# systemctl enable prometheus.service
# systemctl start prometheus.service

And verify prometheus is running:

# systemctl status prometheus.service
● prometheus.service - Prometheus Server
   Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:36:55 UTC; 9s ago
     Docs: https://prometheus.io/docs/introduction/overview/
 Main PID: 22656 (prometheus)
   CGroup: /system.slice/prometheus.service
           └─22656 /home/prometheus/Prometheus/prometheus-1.7.1.linux-amd64/prometheus -config.file=/home/prometheus/Prometheus/prometheus-1.7.1....

Jul 31 15:36:55 test.local systemd[1]: Started Prometheus Server.
Jul 31 15:36:55 test.local systemd[1]: Starting Prometheus Server...
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Starting prometheus (version=1.7.1, branch=mast...n.go:88"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Build context (go=go1.8.3, user=root@0aa1b7fc43...n.go:89"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Host details (Linux 3.10.0-514.26.2.el7.x86_64 ...n.go:90"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Loading configuration file /home/prometheus/Pro....go:252"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Loading series map and head chunks..." source="....go:428"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="0 series loaded." source="storage.go:439"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Starting target manager..." source="targetmanager.go:63"
Jul 31 15:36:55 test.local prometheus[22656]: time="2017-07-31T15:36:55Z" level=info msg="Listening on :9090" source="web.go:259"
Hint: Some lines were ellipsized, use -l to show in full.

Additionally, you can go to hostname:9090/targets and verify that both node_exporter and prometheus report State=UP.
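This check can also be scripted against the prometheus query API. A minimal sketch, assuming prometheus is listening on its default port 9090 and curl is available:

$ curl -s 'http://localhost:9090/api/v1/query?query=up'

Every target that is scraped successfully appears in the returned result vector with a value of "1", while a target that cannot be reached reports "0".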

At this point, system metrics are fetched and stored. All that is left to do is visualise them. An excellent tool for doing so is grafana.

4. Grafana
The grafana website shows installation instructions and a link to the latest version; at the time of writing of this blogpost, the latest version was 4.1.1. This is how grafana is installed (please mind that the installation and the systemd setup require root privileges):

# yum install https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.4....

Next, make systemd handle grafana and start it:

# systemctl daemon-reload
# systemctl enable grafana-server.service
# systemctl start grafana-server.service

And check if grafana is running:

# systemctl status grafana-server.service
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-07-31 15:43:11 UTC; 1min 58s ago
     Docs: http://docs.grafana.org
 Main PID: 22788 (grafana-server)
   CGroup: /system.slice/grafana-server.service
           └─22788 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile= cfg:default.paths.logs=/var/log/grafana cfg:default.path...

Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Starting plugin search" logger=plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=warn msg="Plugin dir does not exist" logger=plugins dir=/...plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Plugin dir created" logger=plugins dir=/var/lib...plugins
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing Alerting" logger=alerting.engine
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing CleanUpService" logger=cleanup
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing Stream Manager"
Jul 31 15:43:12 test.local grafana-server[22788]: t=2017-07-31T15:43:12+0000 lvl=info msg="Initializing HTTP Server" logger=http.server ad...socket=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Jul 31 15:44:34 test.local grafana-server[22788]: t=2017-07-31T15:44:34+0000 lvl=info msg="Request Completed" logger=context userId=0 orgI...eferer=
Hint: Some lines were ellipsized, use -l to show in full.

5. Grafana configuration
Next, we need to hook up grafana with prometheus. First, go to hostname:3000.
– Log in with admin/admin
– Click ‘add datasource’
– Name: prometheus, Type: Prometheus
– HTTP settings: http://localhost:9090, select Access: ‘proxy’.
– Click ‘save and test’. This should result in ‘success’ and ‘datasource updated’. (A scripted alternative is sketched right after this list.)
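If you would rather script this step than click through the web interface, the same datasource can be created with the grafana HTTP API. A minimal sketch, assuming the default admin/admin credentials and grafana listening on its default port 3000:

$ curl -s -X POST -H "Content-Type: application/json" \
  http://admin:admin@localhost:3000/api/datasources \
  -d '{"name":"prometheus","type":"prometheus","url":"http://localhost:9090","access":"proxy"}'

The response should confirm that the datasource was added, and it will show up in the web interface exactly like a datasource created by hand.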

Now click on the grafana symbol in the upper left corner, then Dashboards, Import. Enter ‘2747’ at ‘grafana.com dashboard’. This will show the ‘Linux memory’ dashboard; select the prometheus datasource you just defined and click import.

This should result in a dashboard that shows you the Linux memory areas:
[screenshot: grafana dashboard showing Linux memory metrics from prometheus]

Tagged: dashboard, grafana, installation, linux, memory, node exporter, prometheus

Topological sorting in PLSQL

For my testdata_ninja library, I recently created the field auto reference feature, where a value from one field can be used as input to a generator for another field. Part of this feature involves parsing all columns, building a dependency graph (a Directed Acyclic Graph) and sorting all columns correctly. Since everything in the library is built purely in plsql, I don't store anything in tables. And if my data is not in a table, I cannot use sql as one normally would. I couldn't find any code out there that did what I wanted, so like much of the other stuff I do, I had to build it myself.