Search

Top 60 Oracle Blogs

Recent comments

Postgres

Full page logging in Postgres and Oracle

In my opinion, the volume of logging (aka redo log, aka xlog, aka WAL) is the most important factor for OLTP performance, availability and scalability, for several reasons:

  • This is the only structure where disk latency is a mandatory component of response time
  • This is a big part of the total volume of backups
  • This is sequential by nature, and very difficult to scale by parallelizing

In this post, I look at the volume of logging generated by some DML in Postgres and Oracle. I know Oracle quite well and just start to look at Postgres. The comparison here is not a contest but a way to better understand. For example, the default behavior of Postgres, with full_page_writes=on, is very similar to Oracle ‘begin backup’ mode. The comparison makes no sense for most of Postgres DBAs, but probably helps Oracle DBAs to understand it.

Server process name in Postgres and Oracle

Every database analysis should start with system load analysis. If the host is in CPU starvation, then looking at other statistics can be pointless. With ‘top’ on Linux, or equivalent such as process explorer on Windows, you see the process (and threads). If the name of the process is meaningful, you already have a clue about the active sessions. Postgres goes further by showing the operation (which SQL command), the state (running or waiting), and the identification of the client.

Postgres

By default ‘top’ displays the program name (like ‘comm’ in /proc or in ‘ps’ format), which will be ‘postgres’ for all PostgreSQL processes. But you can also display the command line with ‘c’ in interactive mode, or directly starting with ‘top -c’, which is the same as the /proc/$pid/cmdline or ‘cmd’ or ‘args’ in ‘ps’ format.

Postgres vs. Oracle access paths XI – Sample Scan

I was going to end this series with the previous post because the last access path available in Postgres is a bit special: a Seq Scan that returns only a sample of the rows, at random. However, it is the occasion to come back to the difference between random and sequential reads.

I’m still working on the same table as in the previous posts, with 10000 rows in 1429 pages. 5% of rows is 500 rows and 5% of blocks is about 72 pages.

Postgres vs. Oracle access paths X – Update

In the previous post we have seen the cheapest way to get one row, reading only one block from its physical location. But that’s the optimal case where the row has not moved. I’ll (nearly) conclude this series about access path with an update.

Postgres vs. Oracle access paths IX – Tid Scan

In the previous post we have seen how Postgres and Oracle finds the table row from the index entry. It uses the TID / ROWID. I’ll focus on this access path and I will have covered all Postgres access paths to table data.

Oracle ACCESS BY ROWID

I start with Oracle because we already have seen the TABLE ACCESS BY ROWID. I’ll decompose an index acces to the table. The first step is getting the ROWID from the index entry:

SQL> select /*+ */ rowid from demo1 where n=1000;
 
ROWID
------------------
AAASPkAAMAAABIaAAF

Postgres vs. Oracle access paths VIII – Index Scan and Filter

In the previous post we have seen a nice optimization to lower the consequences of bad correlation between the index and the table physical order: a bitmap, which may include false positives and then requires a ‘recheck’ of the condition, but with the goal to read each page only once. Now we are back to the well-clustered table where we have seen two possible access paths: IndexOnlyScan when all columns we need are in the index, and IndexScan when we select additional columns. Here is a case in the middle: the index does not have all the columns required by the select, but can eliminate all rows.

The table created is:

create table demo1 as select generate_series n , 1 a , lpad('x',1000,'x') x from generate_series(1,10000);
SELECT 10000
create unique index demo1_n on demo1(n);
CREATE INDEX

Postgres vs. Oracle access paths VI – Index Scan

In the previous post my queries were still reading the indexed column only, from a table which had no modifications since the last vacuum, and then didn’t need to read table pages: it was Index Only Scan. However, we often need more columns than the ones that are in the index. Here is the Index Scan access path.

Postgres vs. Oracle access paths V – FIRST ROWS and MIN/MAX

We have seen how an index can help to avoid a sorting operation in the previous post. This avoids a blocking operation: the startup cost is minimal and the first rows can be immediately returned. This is often desired when displaying rows to the user screen. Here is more about Postgres startup cost, Oracle first_rows costing, and fetching first rows only.

Here is the execution plan we had in Oracle to get the values of N sorted. The cost for Oracle is the cost to read the index leaves: estimated to 46 random reads:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID dbck3rgnqbakg, child number 0
-------------------------------------

Postgres vs. Oracle access paths IV – Order By and Index

I realize that I’m talking about indexes in Oracle and Postgres, and didn’t mention yet the best website you can find about indexes, with concepts and examples for all RDBMS: http://use-the-index-luke.com. You will probably learn a lot about SQL design. Now let’s continue on execution plans with indexes.

Postgres vs. Oracle access paths III – Partial Index

In the previous post I said that an Index Only Access needs to find all rows in the index. Here is a case where, with similar data, Postgres can find all rows but Oracle needs additional considerations.

In the previous post I’ve executed:
select sum(n) from demo1
The execution plan was:

Aggregate (cost=295.29..295.30 rows=1 width=8) (actual time=2.192..2.193 rows=1 loops=1)
Output: sum(n)
Buffers: shared hit=30
-> Index Only Scan using demo1_n on public.demo1 (cost=0.29..270.29 rows=10000 width=4) (actual time=0.150..1.277 rows=10000 loops=1)
Output: n
Heap Fetches: 0
Buffers: shared hit=30