In an earlier post I’ve described how a distributed query can operate at a remote site if it’s a simple select but has to operate at the local site if it’s a CTAS (create as select) or insert as select. There’s (at least) one special case where this turns out to be untrue … provided you write the query in the correct fashion. I discovered this only as a result of doing a few experiments in response to a question on the OTN database forum.
Here’s a little demonstration, cut-n-pasted with a little cosmetic editing from an 11gR1 SQL*Plus session:
SQL> create table t1 as select * from all_objects where rownum <= 20000; Table created. SQL> create table t2 as select * from all_objects where rownum <= 10000; Table created. SQL> create table t3 as select * from all_objects where rownum < 1; Table created. SQL> set autotrace traceonly explain SQL> insert into t3 2 select * from t1@d11g@loopback 3 minus 4 select * from t2@d11g@loopback 5 ; 10000 rows created. Execution Plan ---------------------------------------------------------- Plan hash value: 1945526954 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Inst |IN-OUT| --------------------------------------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 23524 | 5331K| | 1137 (33)| 00:00:06 | | | | 1 | LOAD TABLE CONVENTIONAL | T3 | | | | | | | | | 2 | MINUS | | | | | | | | | | 3 | SORT UNIQUE | | 23524 | 3629K| 4496K| 773 (2)| 00:00:04 | | | | 4 | REMOTE | T1 | 23524 | 3629K| | 40 (5)| 00:00:01 | D11G@~ | R->S | | 5 | SORT UNIQUE | | 11027 | 1701K| 2112K| 364 (2)| 00:00:02 | | | | 6 | REMOTE | T2 | 11027 | 1701K| | 19 (6)| 00:00:01 | D11G@~ | R->S | --------------------------------------------------------------------------------------------------------- Remote SQL Information (identified by operation id): ---------------------------------------------------- 4 - SELECT "OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYPE"," CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY","NAMESPACE","ED ITION_NAME" FROM "T1" "T1" (accessing 'D11G@LOOPBACK' ) 6 - SELECT "OWNER","OBJECT_NAME","SUBOBJECT_NAME","OBJECT_ID","DATA_OBJECT_ID","OBJECT_TYPE"," CREATED","LAST_DDL_TIME","TIMESTAMP","STATUS","TEMPORARY","GENERATED","SECONDARY","NAMESPACE","ED ITION_NAME" FROM "T2" "T2" (accessing 'D11G@LOOPBACK' )
Not good – to get the 10000 rows I wanted in my result set Oracle fetched everything from the remote t1 (I’ve used my usually “connection qualifier” trick to make the same database appear to be a second database at the end of a datababase link), sorted the result locally, fetched everything from the remote t2, sorted the result locally, then found the difference. I want the remote system to do both fetches and sorts and send only the final result of 10000 rows.
But here’s a silly little trick that makes a huge difference:
insert into t3 select * from ( select * from t1@d11g@loopback minus select * from t2@d11g@loopback ) ; Plan hash value: 1788691278 ----------------------------------------------------------------- | Id | Operation | Name | Cost | Inst |IN-OUT| ----------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | | | | 1 | LOAD TABLE CONVENTIONAL | | | | | | 2 | REMOTE | | | D11G@~ | R->S | ----------------------------------------------------------------- Remote SQL Information (identified by operation id): ---------------------------------------------------- 2 - SELECT "A1"."OWNER","A1"."OBJECT_NAME","A1"."SUBOBJECT_NAME","A1" ."OBJECT_ID","A1"."DATA_OBJECT_ID","A1"."OBJECT_TYPE","A1"."CREATED","A1 "."LAST_DDL_TIME","A1"."TIMESTAMP","A1"."STATUS","A1"."TEMPORARY","A1"." GENERATED","A1"."SECONDARY","A1"."NAMESPACE","A1"."EDITION_NAME" FROM ( (SELECT "A4"."OWNER" "OWNER","A4"."OBJECT_NAME" "OBJECT_NAME","A4"."SUBOBJECT_NAME" "SUBOBJECT_NAME","A4"."OBJECT_ID" "OBJECT_ID","A4"."DATA_OBJECT_ID" "DATA_OBJECT_ID","A4"."OBJECT_TYPE" "OBJECT_TYPE","A4"."CREATED" "CREATED","A4"."LAST_DDL_TIME" "LAST_DDL_TIME","A4"."TIMESTAMP" "TIMESTAMP","A4"."STATUS" "STATUS","A4"."TEMPORARY" "TEMPORARY","A4"."GENERATED" "GENERATED","A4"."SECONDARY" "SECONDARY","A4"."NAMESPACE" "NAMESPACE","A4"."EDITION_NAME" "EDITION_NAME" FROM "T1" "A4")MINUS (SELECT "A3"."OWNER" "OWNER","A3"."OBJECT_NAME" "OBJECT_NAME","A3"."SUBOBJECT_NAME" "SUBOBJECT_NAME","A3"."OBJECT_ID" "OBJECT_ID","A3"."DATA_OBJECT_ID" "DATA_OBJECT_ID","A3"."OBJECT_TYPE" "OBJECT_TYPE","A3"."CREATED" "CREATED","A3"."LAST_DDL_TIME" "LAST_DDL_TIME","A3"."TIMESTAMP" "TIMESTAMP","A3"."STATUS" "STATUS","A3"."TEMPORARY" "TEMPORARY","A3"."GENERATED" "GENERATED","A3"."SECONDARY" "SECONDARY","A3"."NAMESPACE" "NAMESPACE","A3"."EDITION_NAME" "EDITION_NAME" FROM "T2" "A3")) "A1" (accessing 'D11G@LOOPBACK' )
By wrapping the required query into an inline view – which, presumably, the optimizer considers to be a non-mergeable view – we find that the entire query is sent to remote database for operation and only the minimum result set returned.
Of course this isn’t a distributed query (it’s a “fully remote” query) so I was a little surprised when Oracle didn’t execute the original form remotely, then I was pleasantly surprised when I found that a simple little trick solved the problem. Unfortunately you’ll find that the problem on OTN was a little harder – Oracle didn’t like this type of query at all if one of the subqueries involved an “AS OF TIMESTAMP” clause.
Recent comments
1 year 45 weeks ago
2 years 5 weeks ago
2 years 9 weeks ago
2 years 10 weeks ago
2 years 15 weeks ago
2 years 36 weeks ago
3 years 4 weeks ago
3 years 34 weeks ago
4 years 18 weeks ago
4 years 18 weeks ago