Parsing … no big deal eh?

Most of us have probably seen the standard demo when it comes to emphasizing the need for shareable SQL, that is, using bind variables where appropriate.  The demo traditionally compares two similar scripts: one generates a lot of SQL statements with literals, and the other recasts the same script with bind variables for a dramatic improvement.

Here’s a simple version I’ve whipped up:


SQL> create table t ( x int primary key) organization index;

Table created.

SQL> insert into t
  2  select rownum from dual
  3  connect by level <= 100000;

100000 rows created.

SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

SQL> set timing on
SQL> declare
  2    v int;
  3  begin
  4    for i in 1 .. 100000 loop
  5      select x into v from t where x = i;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.56

SQL> set timing on
SQL> declare
  2    v int;
  3  begin
  4    for i in 1 .. 100000 loop
  5      execute immediate 'select x from t where x = '||i into v;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:04.92

So typically what follows is a “Woo hoo! Look how awesome bind variables are, and look how crappy the literal SQL is” style of statement, with much hand-waving and pointing in the direction of the elapsed time.

But lately, there have been people happy to provide a rebuttal, namely: “So what?”.  The argument runs like this:

In the “terrible” example, we are performing approximately 1600 statements per second.  Is that anywhere near as good as the binding case?  No … but who cares?  There are plenty of systems out there for which 16 statements per second would be fine, let alone 1600.  So is all the fuss about binding really justified?

Whilst such a rebuttal falls to pieces on other grounds – such as latch contention once we introduce multi-user scenarios and, probably more importantly in today’s security climate, the risk of SQL injection – I’m not going to re-hash those.  I want to look at the claim of “Oh … the performance is good enough anyway”.
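(As a quick aside on that injection point: the risk is easy to demonstrate in any SQL engine, not just Oracle. Here’s a minimal sketch using Python’s sqlite3 module – the table and the hostile input string are made up purely for illustration – showing how a concatenated literal lets the input rewrite the query, while a bind variable treats the same input as plain data.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (name text, is_admin int)")
conn.execute("insert into users values ('alice', 0), ('bob', 1)")

# A hostile input string (hypothetical, for illustration only)
evil = "alice' or '1'='1"

# Literal concatenation: the attacker's quote escapes the string literal,
# turning the predicate into  name = 'alice' or '1'='1'  -- every row matches.
literal_rows = conn.execute(
    "select name from users where name = '" + evil + "'").fetchall()

# Bind variable: the whole input is bound as one value, so nothing matches.
bound_rows = conn.execute(
    "select name from users where name = ?", (evil,)).fetchall()

print(len(literal_rows))  # 2 -- the injection leaked every row
print(len(bound_rows))    # 0 -- the bind treated the input as data
```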

We seem to have forgotten that when the first such “bind versus literals” scripts were being produced and published, it was a different era.  Personally, in the early 90’s, my Oracle database (version 7) was running on a SPARCstation 10, with its tremendously powerful and incredibly expensive 50 MHz processor!  (For a trip down memory lane, see https://en.wikipedia.org/wiki/SPARCstation_10 )

 

So in those days, the differential between binding and literals was massive, because processor cycles were incredibly valuable.  But we tend to forget that just as processors have evolved, so too have our requirements for accessing data.  Yes, it’s true that a modern processor can probably handle those literal-based simple primary key lookups at a rate that meets our needs.  But that’s not always what a “modern” query looks like.  Queries have evolved just like the processors – they can be much more complicated.  What happens to the parsing costs then?  So let’s bring some complexity into the fold, and re-run our test with a more complicated query.

Obviously a complicated query might have a long execution time, so we’ll take that out of the equation by using DBMS_SQL to only parse (and not execute) the query.  We’ll query a couple of dictionary views, which are themselves built from several other dictionary views, so there’s plenty of parsing complexity even though our statement is just a simple join.


SQL> set timing on
SQL> declare
  2    c int;
  3  begin
  4   c := dbms_sql.open_cursor;
  5   for i in 1 .. 100000 loop
  6      dbms_sql.parse(c,
  7        q'{
  8        select o.object_name, o.last_ddl_time, sum(s.bytes)
  9        from   all_objects o,
 10               dba_segments s
 11        where  o.owner = s.owner
 12        and    o.object_name = s.segment_name
 13        and    o.object_type = 'TABLE'
 14        and    o.object_id = :1
 15        group by o.object_name, o.last_ddl_time
 16        }',
 17        dbms_sql.native );
 18    end loop;
 19    dbms_sql.close_cursor(c);
 20  end;
 21  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:01.14

SQL> declare
  2    c int;
  3  begin
  4   c := dbms_sql.open_cursor;
  5   for i in 1 .. 100000 loop
  6      dbms_sql.parse(c,
  7        q'{
  8        select o.object_name, o.last_ddl_time, sum(s.bytes)
  9        from   all_objects o,
 10               dba_segments s
 11        where  o.owner = s.owner
 12        and    o.object_name = s.segment_name
 13        and    o.object_type = 'TABLE'
 14        and    o.object_id = }'||i||
 15        q'{
 16        group by o.object_name, o.last_ddl_time
 17        }',
 18        dbms_sql.native );
 19    end loop;
 20    dbms_sql.close_cursor(c);
 21  end;
 22  /

PL/SQL procedure successfully completed.

Elapsed: 10:07:12.71


Yes … you are reading that right. Over 10 hours to get through all of that parsing load – that works out to fewer than three parses per second.

So just because the processors are bigger and smarter … don’t forget: for high-volume calls, those literals are literally going to hurt your systems!
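And the effect isn’t unique to Oracle. Here’s a hedged sketch of the same opening experiment in Python’s sqlite3 module: the module keeps a per-connection cache of prepared statements keyed by the SQL text, so one bound statement is prepared once and reused, while 100,000 distinct literal texts each pay a fresh prepare. The exact timings will vary by machine; the point is only that the literal path does strictly more work per call.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("create table t (x int primary key)")
conn.executemany("insert into t values (?)", [(i,) for i in range(100_000)])

n = 100_000

# Bind variable: one SQL text, prepared once and served from the statement cache.
start = time.perf_counter()
for i in range(n):
    conn.execute("select x from t where x = ?", (i,))
bound = time.perf_counter() - start

# Literals: every SQL text is unique, so each call pays its own prepare.
start = time.perf_counter()
for i in range(n):
    conn.execute("select x from t where x = %d" % i)
literal = time.perf_counter() - start

print(f"bound: {bound:.2f}s  literal: {literal:.2f}s")
```

SQLite’s parser is tiny compared to Oracle’s optimizer working over dictionary views, so the gap here is far smaller than ten hours – but the shape of the result is the same.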