Partition loading in direct mode

Posted by

Direct mode insert using the APPEND hint is a cool piece of technology that lets you load bulk data into a table very quickly and efficiently. Obviously there are a number of implications of doing so which you can read about here, but the one that catches most people out is the that you are locking the table during the load and once the load is completed, the table is “disabled” until the transaction ends. Here’s a quick example of that in action:


SQL> create table t
  2  as select * from dba_objects d
  3  where 1=0;

Table created.

SQL> insert /*+ APPEND */ into t
  2  select * from dba_objects d;

82417 rows created.

--
-- No further INSERTs are possible
--
SQL> insert /*+ APPEND */ into t
  2  select * from dba_objects d;
insert /*+ APPEND */ into t
                          *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- No further DML at all is possible
--
SQL> delete from t;
delete from t
            *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- Not even a SELECT statement is allowed
--
SQL> select count(*) from t;
select count(*) from t
                     *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- I must end the transaction first with commit or rollback before normal service is resumed
--
SQL> commit;

Commit complete.

SQL> select count(*) from t;

  COUNT(*)
----------
     82417

This makes sense given that a direct mode insert is manipulating the high water mark (HWM) for a table, so it pretty much has to be an all or nothing process for the session issuing the load, because the HWM is in a state of flux until we commit or rollback.

However, what about a partitioned table? Can I do a direct mode insert into a single partition, whilst leaving the other partitions available in the same session? Let’s try it out.


SQL> create table t
  2  partition by list ( x )
  3  ( partition p1 values (1),
  4    partition p2 values (2)
  5  )
  6  as select 1 x, d.* from dba_objects d
  7  where 1=0;

Table created.

--
-- Only load into partition P1
--
SQL> insert /*+ APPEND */ into t partition (p1)
  2  select 1 x, d.* from dba_objects d;

82419 rows created.

--
-- And now see if partition P2 can be loaded as part of the same transaction
--
SQL> insert /*+ APPEND */ into t partition (p2)
  2  select 2 x, d.* from dba_objects d;
insert /*+ APPEND */ into t partition (p2)
                          *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

Unfortunately not. Even though we never touched partition P2 with the first INSERT, and partition P2 is a different physical segment, we still cannot do additional direct loads on it. One easy workaround to this is to place that second load in a separate transaction, for example:


SQL> declare
  2    pragma autonomous_transaction;
  3  begin
  4    insert /*+ APPEND */ into t partition (p2)
  5    select 2 x, d.* from dba_objects d;
  6    commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

You’re probably thinking “That’s a bit silly. If we’re going to commit for each partition, then why wouldn’t we just commit after the load of the first partition P1 anyway?”. That’s a valid point, but what the above example shows is that you can do direct path loads into separate partitions concurrently. This opens up opportunities for (dramatically) increasing the throughput of direct path loads if you can segment the source data into its designated target partitions at load time. I’ll extend the table above to have 10 partitions, and use a little DBMS_JOB code to now load 10 partitions all concurrently.


SQL> create table t
  2  partition by list ( x )
  3  ( partition p1 values (1),
  4    partition p2 values (2),
  5    partition p3 values (3),
  6    partition p4 values (4),
  7    partition p5 values (5),
  8    partition p6 values (6),
  9    partition p7 values (7),
 10    partition p8 values (8),
 11    partition p9 values (9),
 12    partition p10 values (10)
 13  )
 14  as select 1 x, d.* from dba_objects d
 15  where 1=0;

Table created.

SQL>
SQL> declare
  2    j int;
  3    l_sql varchar2(200) :=
  4      'begin
  5         insert into t partition (p@)
  6         select @ x, d.* from dba_objects d;
  7         commit;
  8       end;';
  9  begin
 10    for i in 1 .. 10 loop
 11      dbms_job.submit(j,replace(l_sql,'@',i));
 12    end loop;
 13    
 14  end;
 15  /

PL/SQL procedure successfully completed.

SQL>
SQL> select job, what from user_jobs
  2  where what like '%dba_objects%';

       JOB WHAT
---------- --------------------------------------------------
       161 begin
           insert into t partition (p1)
           select 1 x, d.* from dba_objects d;
           commit;
           end;

       162 begin
           insert into t partition (p2)
           select 2 x, d.* from dba_objects d;
           commit;
           end;

       163 begin
           insert into t partition (p3)
           select 3 x, d.* from dba_objects d;
           commit;
           end;

       164 begin
           insert into t partition (p4)
           select 4 x, d.* from dba_objects d;
           commit;
           end;

       165 begin
           insert into t partition (p5)
           select 5 x, d.* from dba_objects d;
           commit;
           end;

       166 begin
           insert into t partition (p6)
           select 6 x, d.* from dba_objects d;
           commit;
           end;

       167 begin
           insert into t partition (p7)
           select 7 x, d.* from dba_objects d;
           commit;
           end;

       168 begin
           insert into t partition (p8)
           select 8 x, d.* from dba_objects d;
           commit;
           end;

       169 begin
           insert into t partition (p9)
           select 9 x, d.* from dba_objects d;
           commit;
           end;

       170 begin
           insert into t partition (p10)
           select 10 x, d.* from dba_objects d;
           commit;
           end;


10 rows selected.

SQL>
SQL> commit;

Commit complete.

SQL> select subobject_name, cnt
  2  from (
  3    select dbms_rowid.rowid_object(rowid) obj, count(*) cnt
  4    from   t
  5    group by dbms_rowid.rowid_object(rowid)
  6    ), user_objects
  7  where obj = data_object_id
  8  order by to_number(substr(subobject_name,2));

SUBOBJECT_        CNT
---------- ----------
P1              82427
P2              82427
P3              82427
P4              82427
P5              82427
P6              82427
P7              82427
P8              82427
P9              82427
P10             82427

And voila! 10 partitions all loaded in direct mode concurrently! There is also a standard means of loading data concurrently using parallel DML, but if I know the segmentation of the data in advance, the “manual” method above can sometimes be a quicker and easier to debug option.

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.