Most developer tools out there have some mechanism to unload a table into a flat file, either as CSV or Excel, and some even allow you to unload the data as INSERT statements. The latter is pretty cool because it gives you a self-contained file that does not need Excel, Data Pump, or any tool beyond the one you are probably already using to unload the data.

SQLcl and SQL Developer are perhaps the easiest to use for such an extract. You simply add the /*insert*/ pseudo-hint to get the output as insert statements. For example:


SQLcl: Release 18.4 Production on Wed Apr 17 17:05:49 2019

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Wed Apr 17 2019 17:05:49 +08:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.6.0.0.0

SQL> select /*insert*/ * from t where rownum <= 10;
REM INSERTING into T
SET DEFINE OFF;
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND ...
 

That’s pretty cool, but no matter what tool you use, you ultimately end up with a set of insert statements that load one row at a time into the database. For a small number of rows, that is no problem, but for larger numbers of rows you might be sitting around for a long time waiting for that script to finish. There are two things that are going to slow us down here.

1) Parsing

Every insert is a brand new statement since it contains literal values. Even for a simple INSERT, that’s a lot of parsing work for the database. This is easily solved with a couple of ALTER SESSION wrappers at the head and tail of the script.


alter session set cursor_sharing = force;
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND

That will replace the literals with bind variable values and hence all of the INSERTs will be sharable.
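The other half of the wrapper, at the tail of the script, simply sets the parameter back to its default:


alter session set cursor_sharing = exact;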

2) Row by Row

A set of INSERT statements is inherently a row-by-row operation: one INSERT = one row. Of course, we developers have all had it drummed into our programming minds over the years that database operations should be done in sets, not row by row. But how can we get to that utopia with a set of insert statements? We could totally rewrite the script with bind arrays and the like, but that’s a big job. So here is my quick hack to convert single-row inserts into multiple-row inserts without too much effort.

A set of individual INSERTS such as


Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND...

can also be written as a multi-table INSERT, where all insert targets are the same table and all WHEN conditions are omitted and hence are implicitly true. The above set of INSERTs can be phrased as a single statement like this:


insert all
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_FILE#_BLOCK#}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_OBJ3}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_TS1}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_CON1}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{IND$}',q'{TABLE}')
select * from dual;

You could write some awk/sed etc. to take an existing INSERT script and turn it into a multi-table one. I took a slightly different approach and whipped up some PL/SQL that pipelines the appropriate multi-table inserts out to a spool file. Here is the function I wrote to do it.


create or replace function as_insert(p_query varchar2, p_batch int default 10) return sys.odcivarchar2list pipelined as
    l_theCursor     integer default dbms_sql.open_cursor;
    l_columnValue   varchar2(4000);
    l_status        integer;
    l_descTbl       dbms_sql.desc_tab;
    l_colCnt        number;
    n number := 0;
    
    -- assume the last word of the query is the target table name
    l_tname varchar2(200) := substr(p_query,instr(p_query,' ',-1,1)+1);
    l_collist varchar2(32000);
    l_colval varchar2(32000);
    l_dml varchar2(32000);
    
    l_nls sys.odcivarchar2list := sys.odcivarchar2list();
    
begin
   if l_tname is null then l_tname := '@@TABLE@@'; end if;

   -- capture the current NLS settings so they can be restored at the end
   select value
   bulk collect into l_nls
   from v$nls_parameters
   where parameter in (   
      'NLS_DATE_FORMAT',
      'NLS_TIMESTAMP_FORMAT',
      'NLS_TIMESTAMP_TZ_FORMAT')
   order by parameter;

    -- use fixed formats so date/timestamp values are unloaded unambiguously
    execute immediate 'alter session set nls_date_format=''yyyy-mm-dd hh24:mi:ss'' ';
    execute immediate 'alter session set nls_timestamp_format=''yyyy-mm-dd hh24:mi:ssff'' ';
    execute immediate 'alter session set nls_timestamp_tz_format=''yyyy-mm-dd hh24:mi:ssff tzr'' ';

    -- parse the query and describe its columns
    dbms_sql.parse(  l_theCursor,  p_query, dbms_sql.native );
    dbms_sql.describe_columns( l_theCursor, l_colCnt, l_descTbl );

    -- define every column as a string and build the column list for the INTO clause
    for i in 1 .. l_colCnt loop
        dbms_sql.define_column(l_theCursor, i, l_columnValue, 4000);
        l_collist := l_collist || l_descTbl(i).col_name||',';
    end loop;
    l_collist := 'into '||l_tname||'('||rtrim(l_collist,',')||')';

    l_status := dbms_sql.execute(l_theCursor);

    pipe row('alter session set cursor_sharing = force;');
    while ( dbms_sql.fetch_rows(l_theCursor) > 0 ) loop
       n := n + 1;
  
       -- start a new multi-table INSERT every p_batch rows
       if mod(n,p_batch) = 1 then
          pipe row('insert all ');
       end if;
       
        -- render each column value as a literal appropriate to its datatype
        for i in 1 .. l_colCnt loop
            dbms_sql.column_value( l_theCursor, i, l_columnValue );
            if l_columnValue is null then
              l_colval := l_colval || 'null,';
            elsif l_descTbl(i).col_type in (1,8,9,96,112) then
              l_colval := l_colval || 'q''{'||l_columnValue ||'}''' || ',';
            elsif l_descTbl(i).col_type in (2,100,101) then
              l_colval := l_colval || l_columnValue || ',';
            elsif l_descTbl(i).col_type in (12) then
              l_colval := l_colval || 'to_date('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ss'')' || ',';
            elsif l_descTbl(i).col_type in (180) then
              l_colval := l_colval || 'to_timestamp('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff'')' || ',';
            elsif l_descTbl(i).col_type in (181) then
              l_colval := l_colval ||'to_timestamp_tz('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff tzr'')' || ',';
            elsif l_descTbl(i).col_type in (231) then
              l_colval := l_colval || 'to_timestamp('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff'')' || ',';
            elsif l_descTbl(i).col_type in (182) then
              l_colval := l_colval || 'to_yminterval('''||l_columnValue||''')' || ',';
            elsif l_descTbl(i).col_type in (183) then
              l_colval := l_colval ||'to_dsinterval('''||l_columnValue||''')'  || ',';
            end if;
        end loop;
        l_colval := rtrim(l_colval,',')||')';
        pipe row( l_collist  );
        pipe row( '  values ('||l_colval );
        -- every p_batch rows, close off the batch with the mandatory subquery
        if mod(n,p_batch) = 0 then
          pipe row('select * from dual;');
        end if;
        l_colval := null;
    end loop;
    -- handle the no-data case, and close off any final partial batch
    if n = 0 then
      pipe row( 'No data found ');
    elsif mod(n,p_batch) != 0 then
      pipe row('select * from dual;');
    end if;
    pipe row('alter session set cursor_sharing = exact;');

    -- restore the caller's NLS settings
    execute immediate 'alter session set nls_date_format='''||l_nls(1)||''' ';
    execute immediate 'alter session set nls_timestamp_format='''||l_nls(2)||''' ';
    execute immediate 'alter session set nls_timestamp_tz_format='''||l_nls(3)||''' ';
    return;
end;
/
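
To generate a script, you just spool the output of the function from SQLcl or SQL*Plus. The query, batch size and spool file below are only placeholders, but something along these lines produces output like that shown next:


set pagesize 0 feedback off heading off
spool c:\tmp\emp_inserts.sql
select * from table(as_insert('select empno, ename, sal from scott.emp',10));
spool off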

And here’s an example of the output you’d expect to see from it:


alter session set cursor_sharing = force;
insert all
into scott.emp(EMPNO,ENAME,SAL)
  values (7369,q'{SMITH}',800)
into scott.emp(EMPNO,ENAME,SAL)
  values (7499,q'{ALLEN}',1600)
into scott.emp(EMPNO,ENAME,SAL)
  values (7521,q'{WARD}',1250)
into scott.emp(EMPNO,ENAME,SAL)
  values (7566,q'{JONES}',2975)
into scott.emp(EMPNO,ENAME,SAL)
  values (7654,q'{MARTIN}',1250)
into scott.emp(EMPNO,ENAME,SAL)
  values (7698,q'{BLAKE}',2850)
into scott.emp(EMPNO,ENAME,SAL)
  values (7782,q'{CLARK}',2450)
into scott.emp(EMPNO,ENAME,SAL)
  values (7788,q'{SCOTT}',3000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7839,q'{KING}',5000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7844,q'{TURNER}',1500)
select * from dual;
insert all
into scott.emp(EMPNO,ENAME,SAL)
  values (7876,q'{ADAMS}',1100)
into scott.emp(EMPNO,ENAME,SAL)
  values (7900,q'{JAMES}',950)
into scott.emp(EMPNO,ENAME,SAL)
  values (7902,q'{FORD}',3000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7934,q'{MILLER}',1300)
select * from dual;
alter session set cursor_sharing = exact;

Let’s do a performance test on a larger sample. In this test, I’m loading a little over one million rows into an empty table, where the performance harness for each INSERT mechanism simply captures the start and end times for the script:


select systimestamp from dual;
alter session set cursor_sharing = force;
set feedback off


[ set of single-row INSERTs, or set of MULTI-TABLE inserts with 50 rows per INSERT ]
set feedback on
alter session set cursor_sharing = exact;
select systimestamp from dual;

Running each gave the following results.


SQL> @c:\tmp\normal_insert.sql

SYSTIMESTAMP
---------------------------------------
17-APR-19 03.39.53.189000 PM +08:00


Session altered.


Session altered.


SYSTIMESTAMP
---------------------------------------
17-APR-19 03.43.47.920000 PM +08:00


SQL> @c:\tmp\multi_insert.sql

SYSTIMESTAMP
---------------------------------------
17-APR-19 03.43.40.402000 PM +08:00


Session altered.


Session altered.


SYSTIMESTAMP
---------------------------------------
17-APR-19 03.44.27.319000 PM +08:00

So that’s a speed boost of roughly five-fold, down from 3 minutes 54 seconds to less than 50 seconds.

Just to reiterate: I’m not suggesting you need to go to this level for all INSERT scripts. Most of the time I just add the cursor_sharing options to my scripts, and that makes them plenty fast enough for the job. But if you have a large number of inserts to do, then converting them to a multi-table equivalent might give you a nice boost.

Here’s the video version

19 responses to "Generating INSERT scripts that run fast!"

  1. You should set the numeric characters, too. Otherwise you get invalid insert statements for countries like Germany that use a comma as the decimal separator 🙂
    EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_NUMERIC_CHARACTERS = ''.,''';

  2. The /*insert*/ hint does nothing in SQL Developer 18.4.0.376.

  3. Run it as a script (F5), not as a standard execution (F9).

    1. Thank you

  4. Hi Connor,

    I was trying this on 11gR2 and was getting the “too many rows” error for a ~500K-row insert. In the Oracle documentation I see that there is a restriction of 999 columns (“In a multitable insert, all of the insert_into_clauses cannot combine to specify more than 999 target columns.”)

    From your example, you’ve inserted a million rows. Am I missing something?

    1. The key thing here is the “p_batch int default 10” parameter. That is how many rows will be merged into a single insert statement. I’m not saying turn 500K rows into ONE insert; turn it into (say) 5000 inserts of 100 rows at a time (or similar).
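       For example (big_table is just a placeholder name here), the batch size is simply the second parameter:

       select * from table(as_insert('select * from big_table', 100));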

  5. […] cursor_sharing can also be used to improve DML (insert/update/delete) performance. Here’s an example https://connor-mcdonald.com/2019/05/17/hacking-together-faster-inserts/ […]

  6. Rommel Gutierrez Roa:

    Hello Connor, this is a nice hack.
    Can I use this function inside a WITH clause?
    The thing is that I don’t have access to create functions, and I need to send an insert script to the DBA for a table that I was working on in a TEST environment.
    (We are using Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 – 64bit Production)

    PS: I tried it but I got an error: ORA-06553: PLS-653: aggregate/table functions are not allowed

    1. For something like that, I’d probably just use SQLcl and hack it, eg:

      SQL> select /*insert*/ empno, ename from scott.emp;

      Insert into SCOTT.EMP (EMPNO,ENAME) values (7369,'SMITH');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7499,'ALLEN');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7521,'WARD');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7566,'JONES');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7654,'MARTIN');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7698,'BLAKE');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7782,'CLARK');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7788,'SCOTT');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7839,'KING');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7844,'TURNER');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7876,'ADAMS');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7900,'JAMES');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7902,'FORD');
      Insert into SCOTT.EMP (EMPNO,ENAME) values (7934,'MILLER');

      I’d just save that into a text file, then

      a) change every “Insert” to nothing
      b) change every “;” to nothing

      and then every 100 lines or so, just paste in (or do it with awk etc.) an “insert all”.
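      For illustration only, one batch of the edited file would then end up looking something like this (every multi-table insert also needs the closing subquery, hence the select from dual):

      insert all
       into SCOTT.EMP (EMPNO,ENAME) values (7369,'SMITH')
       into SCOTT.EMP (EMPNO,ENAME) values (7499,'ALLEN')
       into SCOTT.EMP (EMPNO,ENAME) values (7521,'WARD')
       into SCOTT.EMP (EMPNO,ENAME) values (7566,'JONES')
      select * from dual;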

  7. I haven’t tried, but an alternative perhaps would be to use an anonymous block and replace the pipelining with dbms_output.
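     A minimal sketch of that idea, for illustration only: it is simplified to handle just character and numeric columns, and the query, table name and batch size are placeholders you would change.

set serveroutput on size unlimited format wrapped

declare
    -- placeholders: change the query, table name and batch size to suit
    l_query   varchar2(4000) := 'select empno, ename, sal from scott.emp';
    l_tname   varchar2(200)  := 'scott.emp';
    l_batch   pls_integer    := 10;

    l_theCursor   integer := dbms_sql.open_cursor;
    l_columnValue varchar2(4000);
    l_status      integer;
    l_descTbl     dbms_sql.desc_tab;
    l_colCnt      number;
    l_collist     varchar2(32000);
    l_colval      varchar2(32000);
    n             number := 0;
begin
    dbms_sql.parse( l_theCursor, l_query, dbms_sql.native );
    dbms_sql.describe_columns( l_theCursor, l_colCnt, l_descTbl );

    -- build the "into <table>(<columns>)" clause once
    for i in 1 .. l_colCnt loop
        dbms_sql.define_column(l_theCursor, i, l_columnValue, 4000);
        l_collist := l_collist || l_descTbl(i).col_name || ',';
    end loop;
    l_collist := 'into '||l_tname||'('||rtrim(l_collist,',')||')';

    l_status := dbms_sql.execute(l_theCursor);

    dbms_output.put_line('alter session set cursor_sharing = force;');
    while dbms_sql.fetch_rows(l_theCursor) > 0 loop
        n := n + 1;
        -- start a new multi-table insert every l_batch rows
        if mod(n,l_batch) = 1 then
            dbms_output.put_line('insert all ');
        end if;
        for i in 1 .. l_colCnt loop
            dbms_sql.column_value( l_theCursor, i, l_columnValue );
            if l_columnValue is null then
                l_colval := l_colval || 'null,';
            elsif l_descTbl(i).col_type in (1,96) then   -- VARCHAR2 / CHAR
                l_colval := l_colval || 'q''{'||l_columnValue||'}''' || ',';
            else                                         -- numeric columns only in this sketch
                l_colval := l_colval || l_columnValue || ',';
            end if;
        end loop;
        dbms_output.put_line( l_collist );
        dbms_output.put_line( '  values ('||rtrim(l_colval,',')||')' );
        if mod(n,l_batch) = 0 then
            dbms_output.put_line('select * from dual;');
        end if;
        l_colval := null;
    end loop;
    if n > 0 and mod(n,l_batch) != 0 then
        dbms_output.put_line('select * from dual;');
    end if;
    dbms_output.put_line('alter session set cursor_sharing = exact;');
    dbms_sql.close_cursor(l_theCursor);
end;
/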

  8. Hi Connor,

    I am experimenting with your hack and found that simply wrapping the insert statements in an anonymous block also brings a big boost, similar to your solution:

    1000 inserts row by row: around 18 seconds
    1000 inserts in an anonymous block: 2.77 seconds
    1000 inserts with your hack (batch size 5): 3.74 seconds
    1000 inserts with your hack (batch size 10): 2.25 seconds
    1000 inserts with your hack (batch size 20): 1.8 seconds
    1000 inserts with your hack (batch size 40): 2.77 seconds
    1000 inserts with your hack (batch size 80): 3.97 seconds

    The sweet spot seems to be around batch size 20 (in my case)

    What is Oracle doing in the background inside an anonymous block to speed up the inserts?
    Something like your hack?

    Thank you for sharing your knowledge and best regards
    Ottmar

    1. Ottmar,
      you mean just add a BEGIN before the first INSERT and an END; after the last INSERT?
      (That is also my personal approach to speeding up insert scripts, together with cursor_sharing=force.)
      It will save you tons of SQL*Net round trips, as easy as that.
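      For illustration, that wrapped script is just the original statements between a BEGIN and an END, e.g. using the rows from earlier in the post:

begin
  Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
  Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
  Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
  Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
end;
/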

      1. Ottmar Gobrecht:

        Matthias,

        yes, that is what I meant. I found that combining all these hacks works best for larger row counts, let’s say some thousands. For a few hundred rows, a simple begin/end wrap is normally fast enough.

  9. […] already posted previously about options for improving the speed of such scripts to reduce parsing and latency, but no matter […]

  10. Hi Connor,
    I love this post! It answered many of my questions…
    Unfortunately I’m not able to extract more than about 14,500 rows; I get this error:
    ORA-22814 attribute or element value is larger than specified in type

    My table has only 4 columns:
    NUMBER(9,0)
    NUMBER(9,0)
    NUMBER(9,0) nullable
    VARCHAR2(3000 CHAR) nullable => all values are null

    Do you know how to fix it?

    1. Need to see a complete test case

  11. Hi Connor,

    Could you point me to information on the syntax

    q'{ … }'

    you used in the insert statements for string values?
