Generic data models … generic applications … ugh

There was some hesitation in publishing this example, because it may be interpreted as an endorsement of this approach, and it certainly isn’t. Over the years there have been plenty of articles describing the long-term pain that typically comes from generic data models. Here are a few to whet your appetite.

https://rodgersnotes.wordpress.com/2010/09/21/muck-massively-unified-code-key-generic-three-table-data-model/

https://www.red-gate.com/simple-talk/opinion/opinion-pieces/bad-carma/

https://asktom.oracle.com/pls/asktom/asktom.search?tag=query-on-design

But I’m posting this example because it serves as a nice tutorial for DBMS_SQL, and also because there is perhaps the suggestion that the requestor is moving away from a generic data model to a better-structured one. We’ll give them the benefit of the doubt here.

The incoming data for this example was the “classic” generic data model, where the table name, column name and column values were not defined in the data dictionary but as values within a table:


SQL> CREATE TABLE data_table
  2    (
  3      row_seq      int,
  4      table_name   VARCHAR2(30),
  5      column_name  VARCHAR2(30),
  6      column_value VARCHAR2(30)
  7    );

Table created.

SQL>
SQL>
SQL>
SQL> insert into data_table values ( 1,'TEST_TAB','ORDER_NUMBER','ORD1001');

1 row created.

SQL> insert into data_table values ( 1,'TEST_TAB','CUST_NAME','CUST1001');

1 row created.

SQL> insert into data_table values ( 1,'TEST_TAB','INVOICE_NUMBER','INV001');

1 row created.

SQL> insert into data_table values ( 1,'TEST_TAB','ORDER_DATE','04/11/2018 15:14:00');

1 row created.

SQL> insert into data_table values ( 1,'TEST_TAB','AMOUNT','1001');

1 row created.

SQL>
SQL> insert into data_table values ( 2,'TEST_TAB','ORDER_NUMBER','ORD1002');

1 row created.

SQL> insert into data_table values ( 2,'TEST_TAB','CUST_NAME','CUST1002');

1 row created.

SQL> insert into data_table values ( 2,'TEST_TAB','INVOICE_NUMBER','INV002');

1 row created.

SQL> insert into data_table values ( 2,'TEST_TAB','ORDER_DATE','02/11/2018 15:14:00');

1 row created.

SQL> insert into data_table values ( 2,'TEST_TAB','AMOUNT','1002');

1 row created.

SQL>
SQL> commit;

Commit complete.

SQL> select * from data_table;

   ROW_SEQ TABLE_NAME                     COLUMN_NAME                    COLUMN_VALUE
---------- ------------------------------ ------------------------------ ------------------------------
         1 TEST_TAB                       ORDER_NUMBER                   ORD1001
         1 TEST_TAB                       CUST_NAME                      CUST1001
         1 TEST_TAB                       INVOICE_NUMBER                 INV001
         1 TEST_TAB                       ORDER_DATE                     04/11/2018 15:14:00
         1 TEST_TAB                       AMOUNT                         1001
         2 TEST_TAB                       ORDER_NUMBER                   ORD1002
         2 TEST_TAB                       CUST_NAME                      CUST1002
         2 TEST_TAB                       INVOICE_NUMBER                 INV002
         2 TEST_TAB                       ORDER_DATE                     02/11/2018 15:14:00
         2 TEST_TAB                       AMOUNT                         1002

10 rows selected.

The task here was to take those values and convert them into INSERT statements, so with the sample data above, the aim is to insert those rows into a table called TEST_TAB. Given that the DML must be generated entirely from metadata, we can use DBMS_SQL to handle it. Constructing the DML is easier than you might think, thanks to some handy analytic SQL functions plus the ever-useful LISTAGG.


SQL> select
  2    row_number() over
  3      ( partition by table_name, row_seq order by column_name ) as seq,
  4    count(*) over
  5      ( partition by table_name, row_seq ) as col_cnt,
  6    listagg(column_name,',') within group
  7      ( order by column_name ) over ( partition by table_name, row_seq ) as cols,
  8    listagg(':'||column_name,',') within group
  9      ( order by column_name ) over ( partition by table_name, row_seq ) as bindcols,
 10    column_value
 11  from data_table
 12  order by table_name, row_seq, column_name
 13  @pr
==============================
SEQ                           : 1
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : 1001
==============================
SEQ                           : 2
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : CUST1001
==============================
SEQ                           : 3
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : INV001
==============================
SEQ                           : 4
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : 04/11/2018 15:14:00
==============================
SEQ                           : 5
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : ORD1001
==============================
SEQ                           : 1
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : 1002
==============================
SEQ                           : 2
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : CUST1002
==============================
SEQ                           : 3
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : INV002
==============================
SEQ                           : 4
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : 02/11/2018 15:14:00
==============================
SEQ                           : 5
COL_CNT                       : 5
COLS                          : AMOUNT,CUST_NAME,INVOICE_NUMBER,ORDER_DATE,ORDER_NUMBER
BINDCOLS                      : :AMOUNT,:CUST_NAME,:INVOICE_NUMBER,:ORDER_DATE,:ORDER_NUMBER
COLUMN_VALUE                  : ORD1002

Now that we have each data value, plus all the components for an INSERT statement (binding the values rather than embedding them as literals), we’re good to go:


SQL> create table test_tab (
  2    order_number varchar2(10),
  3    cust_name varchar2(10),
  4    invoice_number varchar2(10),
  5    order_date varchar2(30),
  6    amount varchar2(10)
  7  );

Table created.


SQL> declare
  2    l_sql varchar2(32000);
  3    l_cur     pls_integer := dbms_sql.open_cursor;
  4    l_execute pls_integer;
  5  begin
  6    for i in (
  7      select   table_name,
  8               column_name,
  9               row_number() over ( partition by table_name, row_seq order by column_name ) as seq,
 10               count(*) over ( partition by table_name, row_seq ) as col_cnt,
 11               listagg(column_name,',') within group ( order by column_name ) over ( partition by table_name, row_seq ) as cols,
 12               listagg(':'||column_name,',') within group ( order by column_name ) over ( partition by table_name, row_seq ) as bindcols,
 13               column_value
 14      from data_table
 15      order by table_name, row_seq, column_name
 16   ) loop
 17       if i.seq = 1 then
 18         l_sql := 'insert into '||i.table_name||'('||i.cols||') values ('||i.bindcols||')';
 19         dbms_sql.parse(l_cur,l_sql,dbms_sql.native);
 20       end if;
 21       dbms_sql.bind_variable(l_cur,i.column_name,i.column_value);
 22       if i.seq = i.col_cnt then
 23         l_execute := dbms_sql.execute(l_cur);
 24       end if;
 25   end loop;
 26   dbms_sql.close_cursor(l_cur);
 27  end;
 28  /

PL/SQL procedure successfully completed.

SQL>
SQL> select * from test_tab;

ORDER_NUMB CUST_NAME  INVOICE_NU ORDER_DATE                     AMOUNT
---------- ---------- ---------- ------------------------------ ----------
ORD1001    CUST1001   INV001     04/11/2018 15:14:00            1001
ORD1002    CUST1002   INV002     02/11/2018 15:14:00            1002

2 rows selected.

SQL>
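
One caveat worth adding (my note, not part of the original requirement): the table and column names are concatenated straight into the INSERT text, so a rogue value in DATA_TABLE becomes a potential SQL injection vector. A minimal hardening sketch is to pass each identifier through DBMS_ASSERT before parsing:

declare
  l_tab varchar2(128);
begin
  for i in ( select distinct table_name from data_table ) loop
    -- raises ORA-44003 if the value is not a legal simple SQL name
    l_tab := dbms_assert.simple_sql_name(i.table_name);
  end loop;
end;
/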

DBMS_JOB – the joy of transactions

This is a follow-up to yesterday’s post on DBMS_JOB and is critical if you’re upgrading to 19c soon. Mike Dietrich wrote a nice piece last week on the “under the covers” migration of the now deprecated DBMS_JOB package to the new Scheduler architecture. You should check it out before reading on here.

Mike’s post mainly concerned what would happen during the upgrade (spoiler: the DBMS_JOB jobs become scheduler jobs, but you can maintain them using the old API without drama), but Twitter was immediately abuzz with a couple of concerns that I wanted to address:

1) What about new jobs submitted via the old API after the upgrade?

For comparison, here’s what you see currently in 18c – DBMS_JOB is quite separate from the scheduler.


SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'begin dbms_session.sleep(60); end;',sysdate,'sysdate+1');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select job, what from user_jobs;

       JOB WHAT
---------- --------------------------------------------------
       181 begin dbms_session.sleep(60); end;

SQL> select job_name, job_action from user_scheduler_jobs;

no rows selected


Now here’s the same in 19c


SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'begin dbms_session.sleep(60); end;',sysdate,'sysdate+1');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select job, what from user_jobs;

       JOB WHAT
---------- --------------------------------------------------
        22 begin dbms_session.sleep(60); end;

SQL> select job_name, job_action from user_scheduler_jobs;

JOB_NAME             JOB_ACTION
-------------------- ------------------------------------------------------------
DBMS_JOB$_22         begin dbms_session.sleep(60); end;

SQL> select * from user_scheduler_jobs
  2  @pr
==============================
JOB_NAME                      : DBMS_JOB$_22
JOB_SUBNAME                   :
JOB_STYLE                     : REGULAR
JOB_CREATOR                   : MCDONAC
CLIENT_ID                     :
GLOBAL_UID                    :
PROGRAM_OWNER                 :
PROGRAM_NAME                  :
JOB_TYPE                      : PLSQL_BLOCK
JOB_ACTION                    : begin dbms_session.sleep(60); end;
NUMBER_OF_ARGUMENTS           : 0
SCHEDULE_OWNER                :
SCHEDULE_NAME                 :
SCHEDULE_TYPE                 : PLSQL
START_DATE                    : 26-MAY-19 07.12.47.000000 PM +08:00
REPEAT_INTERVAL               : sysdate+1
EVENT_QUEUE_OWNER             :
EVENT_QUEUE_NAME              :
EVENT_QUEUE_AGENT             :
EVENT_CONDITION               :
EVENT_RULE                    :
FILE_WATCHER_OWNER            :
FILE_WATCHER_NAME             :
END_DATE                      :
JOB_CLASS                     : DEFAULT_JOB_CLASS
ENABLED                       : TRUE
AUTO_DROP                     : TRUE
RESTART_ON_RECOVERY           : FALSE
RESTART_ON_FAILURE            : FALSE
STATE                         : SCHEDULED
JOB_PRIORITY                  : 3
RUN_COUNT                     : 0
UPTIME_RUN_COUNT              :
MAX_RUNS                      :
FAILURE_COUNT                 : 0
UPTIME_FAILURE_COUNT          :
MAX_FAILURES                  :
RETRY_COUNT                   : 0
LAST_START_DATE               :
LAST_RUN_DURATION             :
NEXT_RUN_DATE                 : 27-MAY-19 07.12.47.000000 PM -07:00
SCHEDULE_LIMIT                :
MAX_RUN_DURATION              :
LOGGING_LEVEL                 : OFF
STORE_OUTPUT                  : TRUE
STOP_ON_WINDOW_CLOSE          : FALSE
INSTANCE_STICKINESS           : TRUE
RAISE_EVENTS                  :
SYSTEM                        : FALSE
JOB_WEIGHT                    : 1
NLS_ENV                       : NLS_LANGUAGE='AMERICAN' NLS_TERRITORY='AMERICA'...
SOURCE                        :
NUMBER_OF_DESTINATIONS        : 1
DESTINATION_OWNER             :
DESTINATION                   :
CREDENTIAL_OWNER              :
CREDENTIAL_NAME               :
INSTANCE_ID                   :
DEFERRED_DROP                 : FALSE
ALLOW_RUNS_IN_RESTRICTED_MODE : FALSE
COMMENTS                      :
FLAGS                         : 9007199254872624
RESTARTABLE                   : FALSE
HAS_CONSTRAINTS               : FALSE
CONNECT_CREDENTIAL_OWNER      :
CONNECT_CREDENTIAL_NAME       :
FAIL_ON_SCRIPT_ERROR          : FALSE

PL/SQL procedure successfully completed.

You can see that it will be enabled by default and is classed as a regular job. Even if you submit a one-off job, it will still be classed as regular, not lightweight.


SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'begin dbms_session.sleep(60); end;',sysdate);
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select job, what from user_jobs;

       JOB WHAT
---------- --------------------------------------------------
        25 begin dbms_session.sleep(60); end;

SQL> select * from user_scheduler_jobs
  2  @pr
==============================
JOB_NAME                      : DBMS_JOB$_25
JOB_SUBNAME                   :
JOB_STYLE                     : REGULAR
JOB_CREATOR                   : MCDONAC
CLIENT_ID                     :
GLOBAL_UID                    :
PROGRAM_OWNER                 :
PROGRAM_NAME                  :
JOB_TYPE                      : PLSQL_BLOCK
JOB_ACTION                    : begin dbms_session.sleep(60); end;
NUMBER_OF_ARGUMENTS           : 0
SCHEDULE_OWNER                :
SCHEDULE_NAME                 :
SCHEDULE_TYPE                 : ONCE
START_DATE                    : 26-MAY-19 08.37.09.000000 PM +08:00
REPEAT_INTERVAL               :
EVENT_QUEUE_OWNER             :
EVENT_QUEUE_NAME              :
EVENT_QUEUE_AGENT             :
EVENT_CONDITION               :
EVENT_RULE                    :
FILE_WATCHER_OWNER            :
FILE_WATCHER_NAME             :
END_DATE                      :
JOB_CLASS                     : DEFAULT_JOB_CLASS
ENABLED                       : TRUE
AUTO_DROP                     : TRUE
RESTART_ON_RECOVERY           : FALSE
RESTART_ON_FAILURE            : FALSE
STATE                         : SCHEDULED
JOB_PRIORITY                  : 3
RUN_COUNT                     : 0
UPTIME_RUN_COUNT              :
MAX_RUNS                      :
FAILURE_COUNT                 : 0
UPTIME_FAILURE_COUNT          :
MAX_FAILURES                  :
RETRY_COUNT                   : 0
LAST_START_DATE               :
LAST_RUN_DURATION             :
NEXT_RUN_DATE                 : 27-MAY-19 11.37.09.268652 AM +08:00
SCHEDULE_LIMIT                :
MAX_RUN_DURATION              :
LOGGING_LEVEL                 : OFF
STORE_OUTPUT                  : TRUE
STOP_ON_WINDOW_CLOSE          : FALSE
INSTANCE_STICKINESS           : TRUE
RAISE_EVENTS                  :
SYSTEM                        : FALSE
JOB_WEIGHT                    : 1
NLS_ENV                       : NLS_LANGUAGE='AMERICAN' NLS_TERRITORY='AMERICA' NLS_CURRENC
NLS_NUMERIC_CHARACTERS='.,' NLS_CALENDAR='GREGORIAN' NLS_DATE_FORMAT='DD-MON-RR' NLS_DATE_L
NLS_TIME_FORMAT='HH.MI.SSXFF AM' NLS_TIMESTAMP_FORMAT='DD-MON-RR HH.MI.SSXFF AM' NLS_TIME_T
NLS_TIMESTAMP_TZ_FORMAT='DD-MON-RR HH.MI.SSXFF AM TZR' NLS_DUAL_CURRENCY='$' NLS_COMP='BINA
NLS_NCHAR_CONV_EXCP='FALSE'
SOURCE                        :
NUMBER_OF_DESTINATIONS        : 1
DESTINATION_OWNER             :
DESTINATION                   :
CREDENTIAL_OWNER              :
CREDENTIAL_NAME               :
INSTANCE_ID                   :
DEFERRED_DROP                 : FALSE
ALLOW_RUNS_IN_RESTRICTED_MODE : FALSE
COMMENTS                      :
FLAGS                         : 9007199254880304
RESTARTABLE                   : FALSE
HAS_CONSTRAINTS               : FALSE
CONNECT_CREDENTIAL_OWNER      :
CONNECT_CREDENTIAL_NAME       :
FAIL_ON_SCRIPT_ERROR          : FALSE

PL/SQL procedure successfully completed.
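
As an aside, if you do want a true lightweight job, that is only available via the scheduler API itself, and it must be based on an enabled program. Here’s a rough sketch (object names are mine):

begin
  dbms_scheduler.create_program(
    program_name   => 'QUICK_PROG',
    program_type   => 'PLSQL_BLOCK',
    program_action => 'begin dbms_session.sleep(60); end;',
    enabled        => true);

  dbms_scheduler.create_job(
    job_name     => 'QUICK_JOB',
    program_name => 'QUICK_PROG',
    job_style    => 'LIGHTWEIGHT',
    enabled      => true);
end;
/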

Important Note: There is one critical thing you need to be aware of with this change. DBMS_JOB is an “old-school” public API, hence pretty much anyone and everyone had access to it. Anyone familiar with DBMS_SCHEDULER will know that the components within it are true database objects, which can be protected with privileges. So when you upgrade to 19c, to ensure that you do not get nasty surprises, users of DBMS_JOB will need the CREATE JOB privilege, otherwise their previous ability to submit jobs will disappear. For example:


SQL> conn scott/tiger@db192_pdb1
Connected.

SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'begin dbms_session.sleep(60); end;',sysdate);
  5  end;
  6  /
declare
*
ERROR at line 1:
ORA-27486: insufficient privileges
ORA-06512: at "SYS.DBMS_ISCHED", line 9387
ORA-06512: at "SYS.DBMS_ISCHED", line 9376
ORA-06512: at "SYS.DBMS_ISCHED", line 175
ORA-06512: at "SYS.DBMS_ISCHED", line 9302
ORA-06512: at "SYS.DBMS_IJOB", line 196
ORA-06512: at "SYS.DBMS_JOB", line 168
ORA-06512: at line 4
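
The remedy is straightforward. From a suitably privileged account, it is just:

grant create job to scott;

after which the DBMS_JOB.SUBMIT call above works as it did before the upgrade.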

2) What about the cherished transactional nature of DBMS_JOB?

If the old-style jobs are now scheduler jobs, do we lose the transactional element of DBMS_JOB? Nope. Even though we will create a paired scheduler entry, DBMS_JOB is still transactional (which I love!).


SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'begin dbms_session.sleep(60); end;',sysdate,'sysdate+1');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> select job, what from user_jobs;

       JOB WHAT
---------- --------------------------------------------------
        21 begin dbms_session.sleep(60); end;

SQL> select job_name, job_action from user_scheduler_jobs;

JOB_NAME             JOB_ACTION
-------------------- ------------------------------------------------------------
DBMS_JOB$_21         begin dbms_session.sleep(60); end;

SQL> roll;
Rollback complete.
SQL> select job, what from user_jobs;

no rows selected

SQL> select job_name, job_action from user_scheduler_jobs;

no rows selected

SQL>

If you love that transactional capability too, then make sure to visit the database ideas page to vote up this idea. I have no issue with DBMS_SCHEDULER doing commits by default, but it would be cool if (say) for lightweight jobs we had an option to choose whether we commit or not.
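
For contrast, here is a sketch (job name is mine) of why the native scheduler API doesn’t give you this: DBMS_SCHEDULER.CREATE_JOB behaves like DDL, so a rollback does not undo it.

begin
  dbms_scheduler.create_job(
    job_name        => 'DEMO_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'begin dbms_session.sleep(60); end;',
    repeat_interval => 'freq=daily',
    enabled         => true);
end;
/

rollback;

-- DEMO_JOB will still be listed; the rollback does not remove it
select job_name from user_scheduler_jobs;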

DBMS_JOB – watching for failures

I had a friend point this one out to me recently. They use DBMS_JOB to allow some “fire and forget” style functionality for users, and in their case, the jobs are “best effort” in that if they fail, it is not a big deal.

So whilst this may sound counter-intuitive, if you rely on jobs submitted via DBMS_JOB being allowed to fail, then please read on.

By default, if a job fails 16 times in a row, it will be marked as broken by the database. Here’s a simple example of that, where the anonymous block will obviously fail because we are dividing by zero each time. I’ll set the job to run every 5 seconds, so that within a couple of minutes we’ll have 16 failures. First, I’ll run this on 11.2.0.4.


SQL> select banner from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4 - 64bit Production

SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'declare x int; begin x := 1/0; end;',sysdate,'sysdate+5/86400');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

[wait for a bit]

SQL> select job, what, failures from user_jobs;

       JOB WHAT                                                 FAILURES
---------- -------------------------------------------------- ----------
        23 declare x int; begin x := 1/0; end;                         8

[wait for a bit]

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                         9 N

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                        16 Y

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                        16 Y

You can see the “breaking” of the job in action here. We got to 16 failures, the database decided “enough is enough”, and the job will no longer run until some sort of intervention is performed by an administrator.

Now I’ll run that demo again in 12.2


SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit(j,'declare x int; begin x := 1/0; end;',sysdate,'sysdate+5/86400');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

[wait for a bit]

SQL> select job, what, failures from user_jobs;

       JOB WHAT                                                 FAILURES
---------- -------------------------------------------------- ----------
        23 declare x int; begin x := 1/0; end;                         8

[wait for a bit]

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                        13 N

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                        19 N

SQL> /

       JOB WHAT                                                 FAILURES B
---------- -------------------------------------------------- ---------- -
         3 declare x int; begin x := 1/0; end;                        25 N

You can see that in 12.2 (and I’ve tested 18 and 19) failures can continue past 16 unabated. If you’re being hit by this, patches are available, so please get in touch with Support. A quick workaround until you can apply the patches is to use another job to monitor the others. Here’s a small anonymous block, to be run as SYSDBA in each container, that you could schedule (say) every few minutes:


SQL> begin
  2    for i in ( select job
  3               from dba_jobs
  4               where  failures > 16
  5               and    job not in ( sys_context('userenv','fg_job_id'),sys_context('userenv','bg_job_id'))
  6               and    job not in ( select job from dba_jobs_running )
  7               )
  8    loop
  9      dbms_ijob.broken(i.job,true);
 10    end loop;
 11    commit;
 12  end;
 13  /

PL/SQL procedure successfully completed.

Ideally, you probably want to be moving to DBMS_SCHEDULER where possible, which has greater flexibility and control over error handling, amongst many other things.
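
For example, the scheduler lets you cap failures declaratively. A sketch (job name is mine) that mirrors the old 16-failure behaviour using the MAX_FAILURES attribute:

begin
  dbms_scheduler.create_job(
    job_name        => 'FRAGILE_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'declare x int; begin x := 1/0; end;',
    repeat_interval => 'freq=secondly; interval=5',
    enabled         => true);

  -- after 16 consecutive failures the scheduler stops running the job
  dbms_scheduler.set_attribute('FRAGILE_JOB', 'max_failures', 16);
end;
/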

Being generous to the optimizer

In a perfect world, the optimizer would reach out from the server room and say to us: “Hey, let’s grab a coffee and have a chat about that query of yours”. Because ultimately, that is the task we are bestowing on the optimizer – to know what our intent was in running a query, and to execute it in a way that meets the performance needs of our applications. It generally does a pretty good job even without the coffee, but if we can keep that caffeine hit in mind, we can do our bit as SQL developers to give the optimizer as much assistance as we can.

Here’s such an example. Let’s assume users of your application can perform searches for street addresses. They nominate what kind of search they want to do (street name or suburb), and then provide a value to search on.


Here’s some base tables to support the application.


SQL> create table address ( street int, suburb int, data char(100));

Table created.

SQL> insert into address
  2  select mod(rownum,1000), mod(rownum,10), rownum
  3  from dual connect by level <= 100000;

100000 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats('','ADDRESS');

PL/SQL procedure successfully completed.

SQL> create index ix1 on address ( street );

Index created.

SQL> create index ix2 on address ( suburb );

Index created.

To run that search from the application, probably the most straightforward SQL that handles the requirement is:


SQL> select data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  or    ( :choice = 2 and suburb = :val );

This is on the assumption that the application sends “1” for a street search, and “2” for a suburb search.

Let’s give that a run for a nominated street and see what the optimizer makes of this SQL.


SQL> variable choice number = 1
SQL> variable val number = 6
SQL> set feedback only
SQL> select data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  or    ( :choice = 2 and suburb = :val );

100 rows selected.

SQL> set feedback on
SQL> select * from dbms_xplan.display_cursor();

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------
SQL_ID  cwwcr79bfx5zz, child number 0
-------------------------------------
select data from   address where ( :choice = 1 and street = :val ) or
 ( :choice = 2 and suburb = :val )

Plan hash value: 3645838471

-----------------------------------------------------------------------------
| Id  | Operation         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |         |       |       |   445 (100)|          |
|*  1 |  TABLE ACCESS FULL| ADDRESS |   101 | 10908 |   445   (1)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(((:CHOICE=2 AND "SUBURB"=:VAL) OR ("STREET"=:VAL AND
              :CHOICE=1)))

Notice that even though we have indexes on both STREET and SUBURB, the optimizer opted for a full table scan. The driving metric behind this decision is that the selectivity of SUBURB is very poor (only 10 distinct values). Of course, this means that users who perform searches on STREET are also going to be punished with this table scan.

In an ideal world, the optimizer would be clever enough to dive into the SQL, notice that the same bind variable is being used in each predicate, and that the conditions on that bind variable are mutually exclusive, and thus know that only one of the OR conditions can ever be true and optimize the query with that knowledge. But as I mentioned, we never get that chance to “grab a coffee” with the optimizer.

However, as developers, we can assist by restructuring our query to give that little bit more information to the optimizer. I’ve rewritten the query as a UNION ALL to separate the two predicates into their own SQL statements.


SQL> set feedback only
SQL> select data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  union all
  5  select data
  6  from   address
  7  where    ( :choice = 2 and suburb = :val );

100 rows selected.

SQL> set feedback on
SQL>
SQL> select * from dbms_xplan.display_cursor();

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
SQL_ID  3x1nc068ntchg, child number 0
-------------------------------------
select data from   address where ( :choice = 1 and street = :val )
union all select data from   address where    ( :choice = 2 and suburb
= :val )

Plan hash value: 1995695946

-------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |       |       |   545 (100)|          |
|   1 |  UNION-ALL                            |         |       |       |            |          |
|*  2 |   FILTER                              |         |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESS |   100 | 10500 |   101   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                  | IX1     |   100 |       |     1   (0)| 00:00:01 |
|*  5 |   FILTER                              |         |       |       |            |          |
|*  6 |    TABLE ACCESS FULL                  | ADDRESS | 10000 |  1015K|   444   (1)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(:CHOICE=1)
   4 - access("STREET"=:VAL)
   5 - filter(:CHOICE=2)
   6 - filter("SUBURB"=:VAL)

After running that SQL, the plan might look worse, but the key elements here are lines 2 and 5. Notice that these are FILTER steps, which means that if they evaluate to false, the underlying steps will not be run. In effect, the optimizer now knows that it will only need to run one “half” of the UNION ALL, based on the outcome of the two FILTER steps.

That all might sound good in theory but we need to prove this hypothesis. I’ll run the first query and check the statistics:


SQL> set autotrace traceonly stat
SQL> select data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  or    ( :choice = 2 and suburb = :val );

100 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1578  consistent gets
          0  physical reads
          0  redo size
      11184  bytes sent via SQL*Net to client
        624  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

and now compare that to the second query.


SQL> select data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  union all
  5  select data
  6  from   address
  7  where    ( :choice = 2 and suburb = :val );

100 rows selected.


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        103  consistent gets
          0  physical reads
          0  redo size
      11184  bytes sent via SQL*Net to client
        624  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

Nice stuff. Note that this does not stop the full table scan when searching on SUBURB, but what has been achieved is much better search performance for STREET lookups.
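
To see the other branch in action, you can flip the binds and re-run the query. I haven’t reproduced the full output here, but you would expect the FILTER at Id 5 to let the full scan proceed, and the FILTER at Id 2 to prune the index half entirely.

variable choice number = 2
variable val number = 6

select data
from   address
where ( :choice = 1 and street = :val )
union all
select data
from   address
where ( :choice = 2 and suburb = :val );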

Using the GATHER_PLAN_STATISTICS hint, it is also possible to see the benefit of the FILTER steps by looking at the Starts/A-Rows columns.


SQL> select /*+ gather_plan_statistics */ data
  2  from   address
  3  where ( :choice = 1 and street = :val )
  4  union all
  5  select data
  6  from   address
  7  where    ( :choice = 2 and suburb = :val );

100 rows selected.

SQL> set lines 200
SQL> set feedback on
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
SQL_ID  ch0h5t76r7d2m, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ data from   address where (
:choice = 1 and street = :val ) union all select data from   address
where    ( :choice = 2 and suburb = :val )

Plan hash value: 1995695946

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |         |      1 |        |    100 |00:00:00.01 |     103 |
|   1 |  UNION-ALL                            |         |      1 |        |    100 |00:00:00.01 |     103 |
|*  2 |   FILTER                              |         |      1 |        |    100 |00:00:00.01 |     103 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| ADDRESS |      1 |    100 |    100 |00:00:00.01 |     103 |
|*  4 |     INDEX RANGE SCAN                  | IX1     |      1 |    100 |    100 |00:00:00.01 |       3 |
|*  5 |   FILTER                              |         |      1 |        |      0 |00:00:00.01 |       0 |
|*  6 |    TABLE ACCESS FULL                  | ADDRESS |      0 |  10000 |      0 |00:00:00.01 |       0 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(:CHOICE=1)
   4 - access("STREET"=:VAL)
   5 - filter(:CHOICE=2)
   6 - filter("SUBURB"=:VAL)

Notice that we never commenced (started) the full scan for the SUBURB lookup, because the overarching FILTER step returned nothing.

So when you discover that you know something about a query that the optimizer does not, look for ways to pass that information on to the optimizer, so that it (and your users) benefit from the knowledge you have as a developer of your applications.

Hacking together faster INSERTs

Most developer tools out there have some mechanism to unload a table into a flat file, either as CSV or Excel, and some even allow you to unload the data as INSERT statements. The latter is pretty cool because it’s a nice way of having a self-contained file that does not need Excel or Data Pump or any tool other than the one you’re probably using to unload the data.

SQLcl and SQL Developer are perhaps the easiest to utilize for such an extract. You simply add the pseudo-hint /*insert*/ to your query to get the output as insert statements. For example:


SQLcl: Release 18.4 Production on Wed Apr 17 17:05:49 2019

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Wed Apr 17 2019 17:05:49 +08:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.6.0.0.0

SQL> select /*insert*/ * from t where rownum <= 10;
REM INSERTING into T
SET DEFINE OFF;
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND$','TABLE');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','CDEF$','TABLE');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','C_TS#','CLUSTER');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CCOL2','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_PROXY_DATA$','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CDEF4','INDEX');
...
...

That’s pretty cool, but no matter what tool you use, ultimately you end up with a set of insert statements that will load one row at a time into the database. For a small number of rows, that is no problem. But for larger numbers of rows, you might be sitting around for a long time waiting for that script to finish. There are two things that are going to slow us down here.

1) Parsing

Every insert is a brand new statement since it contains literal values. Even for a simple INSERT, that’s a lot of parsing work for the database. This is easily solved with a couple of ALTER SESSION wrappers at the head and tail of the script.


alter session set cursor_sharing = force;
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND$','TABLE');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','CDEF$','TABLE');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','C_TS#','CLUSTER');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CCOL2','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_PROXY_DATA$','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CDEF4','INDEX');
...
...
alter session set cursor_sharing = exact;

That will replace the literals with bind variables, and hence all of the INSERTs will be shareable.
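
If you want to verify the sharing, a quick sketch: with CURSOR_SHARING=FORCE the literals are rewritten as system-generated binds, so the whole script should collapse to a single parent cursor with a high execution count.

select sql_text, executions
from   v$sql
where  sql_text like 'Insert into T (%';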

2) Row by Row

A set of INSERT statements is inherently a row-by-row operation. One INSERT = one row. Of course, we developers have all had it drummed into our programming minds over the years that database operations should be done in sets, not row by row. But how can we get to that utopia with a set of insert statements? We could totally rewrite the script with bind arrays and the like, but that’s a big job. So here is my quick hack to convert single-row inserts into multi-row inserts without too much effort.

A set of individual INSERTS such as


Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_FILE#_BLOCK#','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_OBJ3','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_TS1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','I_CON1','INDEX');
Insert into T (OWNER,OBJECT_NAME,OBJECT_TYPE) values ('SYS','IND$','TABLE');

can also be written as a multi-table INSERT, where all insert targets are the same table and all WHEN conditions are omitted and hence are implicitly true. The above set of INSERTs can be phrased as a single statement like this:


insert all
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_FILE#_BLOCK#}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_OBJ3}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_TS1}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{I_CON1}',q'{INDEX}')
into t(OWNER,OBJECT_NAME,OBJECT_TYPE)  values (q'{SYS}',q'{IND$}',q'{TABLE}')
select * from dual;

You could write some awk/sed etc to take an existing INSERT script and make it into a multi-table one. I took a slightly different approach and whipped up some PL/SQL to pipeline the appropriate multi-table inserts out to a spool file. Here is the function I wrote to do it.


create or replace function as_insert(p_query varchar2, p_batch int default 10) return sys.odcivarchar2list pipelined as
    l_theCursor     integer default dbms_sql.open_cursor;
    l_columnValue   varchar2(4000);
    l_status        integer;
    l_descTbl       dbms_sql.desc_tab;
    l_colCnt        number;
    n number := 0;
    
    -- assume the target table name is the last word of the query text
    l_tname varchar2(200) := substr(p_query,instr(p_query,' ',-1,1)+1);
    l_collist varchar2(32000);
    l_colval varchar2(32000);
    l_dml varchar2(32000);
    
    l_nls sys.odcivarchar2list := sys.odcivarchar2list();
    
begin
   -- if no name could be derived, emit a placeholder for manual editing
   if l_tname is null then l_tname := '@@TABLE@@'; end if;

   -- preserve the session NLS settings so they can be restored on exit
   select value
   bulk collect into l_nls
   from v$nls_parameters
   where parameter in (
      'NLS_DATE_FORMAT',
      'NLS_TIMESTAMP_FORMAT',
      'NLS_TIMESTAMP_TZ_FORMAT')
   order by parameter;

    execute immediate 'alter session set nls_date_format=''yyyy-mm-dd hh24:mi:ss'' ';
    execute immediate 'alter session set nls_timestamp_format=''yyyy-mm-dd hh24:mi:ssff'' ';
    execute immediate 'alter session set nls_timestamp_tz_format=''yyyy-mm-dd hh24:mi:ssff tzr'' ';

    dbms_sql.parse(  l_theCursor,  p_query, dbms_sql.native );
    dbms_sql.describe_columns( l_theCursor, l_colCnt, l_descTbl );

    for i in 1 .. l_colCnt loop
        dbms_sql.define_column(l_theCursor, i, l_columnValue, 4000);
        l_collist := l_collist || l_descTbl(i).col_name||',';
    end loop;
    l_collist := 'into '||l_tname||'('||rtrim(l_collist,',')||')';

    l_status := dbms_sql.execute(l_theCursor);

    pipe row('alter session set cursor_sharing = force;');
    while ( dbms_sql.fetch_rows(l_theCursor) > 0 ) loop
       n := n + 1;
  
       if mod(n,p_batch) = 1 then
          pipe row('insert all ');
       end if;
       
        for i in 1 .. l_colCnt loop
            dbms_sql.column_value( l_theCursor, i, l_columnValue );
            if l_columnValue is null then
              l_colval := l_colval || 'null,';
            elsif l_descTbl(i).col_type in (1,8,9,96,112) then
              -- character data (VARCHAR2, LONG, VARCHAR, CHAR, CLOB): quote with q'{}'
              l_colval := l_colval || 'q''{'||l_columnValue ||'}''' || ',';
            elsif l_descTbl(i).col_type in (2,100,101) then
              -- numeric data (NUMBER, BINARY_FLOAT, BINARY_DOUBLE): emit verbatim
              l_colval := l_colval || l_columnValue || ',';
            elsif l_descTbl(i).col_type in (12) then
              -- DATE
              l_colval := l_colval || 'to_date('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ss'')' || ',';
            elsif l_descTbl(i).col_type in (180) then
              -- TIMESTAMP
              l_colval := l_colval || 'to_timestamp('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff'')' || ',';
            elsif l_descTbl(i).col_type in (181) then
              -- TIMESTAMP WITH TIME ZONE
              l_colval := l_colval ||'to_timestamp_tz('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff tzr'')' || ',';
            elsif l_descTbl(i).col_type in (231) then
              -- TIMESTAMP WITH LOCAL TIME ZONE
              l_colval := l_colval || 'to_timestamp('''||l_columnValue||''',''yyyy-mm-dd hh24:mi:ssff'')' || ',';
            elsif l_descTbl(i).col_type in (182) then
              -- INTERVAL YEAR TO MONTH
              l_colval := l_colval || 'to_yminterval('''||l_columnValue||''')' || ',';
            elsif l_descTbl(i).col_type in (183) then
              -- INTERVAL DAY TO SECOND
              l_colval := l_colval ||'to_dsinterval('''||l_columnValue||''')'  || ',';
            end if;
        end loop;
        end loop;
        l_colval := rtrim(l_colval,',')||')';
        pipe row( l_collist  );
        pipe row( '  values ('||l_colval );
        if mod(n,p_batch) = 0 then
          pipe row('select * from dual;');
        end if;
        l_colval := null;
    end loop;
    if n = 0 then
      pipe row( 'No data found ');
    elsif mod(n,p_batch) != 0 then
      pipe row('select * from dual;');
    end if;
    pipe row('alter session set cursor_sharing = exact;');

    execute immediate 'alter session set nls_date_format='''||l_nls(1)||''' ';
    execute immediate 'alter session set nls_timestamp_format='''||l_nls(2)||''' ';
    execute immediate 'alter session set nls_timestamp_tz_format='''||l_nls(3)||''' ';
    return;
end;
/
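
To use it, you pass a query whose last word is the target table name, and spool the pipelined rows to a file. A sketch of a typical invocation:

set pages 0 lines 400 feedback off
spool c:\tmp\multi_insert.sql
select * from table(as_insert('select empno, ename, sal from scott.emp'));
spool off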

and here’s an example of the output you’d expect to see from it


alter session set cursor_sharing = force;
insert all
into scott.emp(EMPNO,ENAME,SAL)
  values (7369,q'{SMITH}',800)
into scott.emp(EMPNO,ENAME,SAL)
  values (7499,q'{ALLEN}',1600)
into scott.emp(EMPNO,ENAME,SAL)
  values (7521,q'{WARD}',1250)
into scott.emp(EMPNO,ENAME,SAL)
  values (7566,q'{JONES}',2975)
into scott.emp(EMPNO,ENAME,SAL)
  values (7654,q'{MARTIN}',1250)
into scott.emp(EMPNO,ENAME,SAL)
  values (7698,q'{BLAKE}',2850)
into scott.emp(EMPNO,ENAME,SAL)
  values (7782,q'{CLARK}',2450)
into scott.emp(EMPNO,ENAME,SAL)
  values (7788,q'{SCOTT}',3000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7839,q'{KING}',5000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7844,q'{TURNER}',1500)
select * from dual;
insert all
into scott.emp(EMPNO,ENAME,SAL)
  values (7876,q'{ADAMS}',1100)
into scott.emp(EMPNO,ENAME,SAL)
  values (7900,q'{JAMES}',950)
into scott.emp(EMPNO,ENAME,SAL)
  values (7902,q'{FORD}',3000)
into scott.emp(EMPNO,ENAME,SAL)
  values (7934,q'{MILLER}',1300)
select * from dual;
alter session set cursor_sharing = exact;

Let’s do a performance test on a large sample set. In this test, I’m loading a little over one million rows into an empty table, where the performance harness for each INSERT mechanism looks like the following, to output the start and end times for the script:


select systimestamp from dual;
alter session set cursor_sharing = force;
set feedback off


[ Set of single-row INSERTs, or set of multi-table INSERTs with 50 rows per INSERT ]
set feedback on
alter session set cursor_sharing = exact;
select systimestamp from dual;

Running each gave the following results.


SQL> @c:\tmp\normal_insert.sql

SYSTIMESTAMP
---------------------------------------
17-APR-19 03.39.53.189000 PM +08:00


Session altered.


Session altered.


SYSTIMESTAMP
---------------------------------------
17-APR-19 03.43.47.920000 PM +08:00


SQL> @c:\tmp\multi_insert.sql

SYSTIMESTAMP
---------------------------------------
17-APR-19 03.43.40.402000 PM +08:00


Session altered.


Session altered.


SYSTIMESTAMP
---------------------------------------
17-APR-19 03.44.27.319000 PM +08:00

So that’s nearly a five-fold speed boost, down from 3 minutes 54 seconds to less than 50 seconds.

Just to reiterate – I’m not suggesting you need to go to this level for all INSERT scripts. Most of the time I just add the cursor_sharing options to my scripts, and that makes them plenty fast enough for the job. But if you have a large number of inserts to do, then converting them to a multi-table equivalent might give you a nice boost.

Here’s the video version

Nostalgia and choosing your in-flight movie

First thing to note on this post. No tech content in this one. Just some nostalgia.

A couple of days ago, I was flying from Perth to Dubai on my way to APEX Connect in Bonn. Because this is an 11-hour hell-in-a-death-tube flight, I settled in to my standard sleepless task of watching movies to pass the time. I steer clear of going exclusively with new releases because I know I’ll be travelling again soon, and I try not to exhaust all my options too quickly.

I was browsing through the “Movie Classics” section and found a movie from my childhood: The Right Stuff.

If you haven’t seen it, it’s a somewhat romanticized and slightly cheesy recounting of the space race, from Chuck Yeager breaking the sound barrier through to the first American in space and subsequent launches, but stopping short of the moon program. In terms of run time, this is not a movie for the faint-hearted. If you think Avengers: Endgame is ridiculous at 3 hours, then The Right Stuff trumps it with an additional 15 minutes on top of that.

My father loved this movie, and the first time I watched it was with him on a Sony Betamax video recorder in my early teens. We sat and watched it together, and he constantly hit pause to explain to me how significant certain milestones were. Clichéd as it might sound, it was one of those great father-son bonding moments.

To bring at least a little bit of technology into this post, it blows me away that the central character of the film, Chuck Yeager, is now on Twitter and has even been gracious enough to answer a couple of my questions over the years. It amazes me – the transition from a figure portrayed on screen to someone I’ve managed to communicate with directly on social media.

This year marks 20 years since my father passed away from early-onset Alzheimer’s, and I’ve realised that re-watching The Right Stuff on the plane was perhaps a mistake, because I’ve discovered that it tends to freak out the passenger next to you when a grown man sits there blubbering away at a movie that doesn’t really have any sad content. So nostalgia is a good thing, but perhaps best saved for somewhere less public.

Love you Dad.

Partition loading in direct mode

Direct mode insert using the APPEND hint is a cool piece of technology that lets you load bulk data into a table very quickly and efficiently. Obviously there are a number of implications of doing so, which you can read about here, but the one that catches most people out is that you are locking the table during the load, and once the load is completed, the table is “disabled” until the transaction ends. Here’s a quick example of that in action:


SQL> create table t
  2  as select * from dba_objects d
  3  where 1=0;

Table created.

SQL> insert /*+ APPEND */ into t
  2  select * from dba_objects d;

82417 rows created.

--
-- No further INSERTs are possible
--
SQL> insert /*+ APPEND */ into t
  2  select * from dba_objects d;
insert /*+ APPEND */ into t
                          *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- No further DML at all is possible
--
SQL> delete from t;
delete from t
            *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- Not even a SELECT statement is allowed
--
SQL> select count(*) from t;
select count(*) from t
                     *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel


--
-- I must end the transaction first with commit or rollback before normal service is resumed
--
SQL> commit;

Commit complete.

SQL> select count(*) from t;

  COUNT(*)
----------
     82417

This makes sense given that a direct mode insert manipulates the high water mark (HWM) for a table, so it pretty much has to be an all-or-nothing process for the session issuing the load, because the HWM is in a state of flux until we commit or rollback.

However, what about a partitioned table? Can I do a direct mode insert into a single partition, whilst leaving the other partitions available in the same session? Let’s try it out.


SQL> create table t
  2  partition by list ( x )
  3  ( partition p1 values (1),
  4    partition p2 values (2)
  5  )
  6  as select 1 x, d.* from dba_objects d
  7  where 1=0;

Table created.

--
-- Only load into partition P1
--
SQL> insert /*+ APPEND */ into t partition (p1)
  2  select 1 x, d.* from dba_objects d;

82419 rows created.

--
-- And now see if partition P2 can be loaded as part of the same transaction
--
SQL> insert /*+ APPEND */ into t partition (p2)
  2  select 2 x, d.* from dba_objects d;
insert /*+ APPEND */ into t partition (p2)
                          *
ERROR at line 1:
ORA-12838: cannot read/modify an object after modifying it in parallel

Unfortunately not. Even though we never touched partition P2 with the first INSERT, and partition P2 is a different physical segment, we still cannot do additional direct loads on it within the same transaction. One easy workaround is to place that second load in a separate (autonomous) transaction, for example:


SQL> declare
  2    pragma autonomous_transaction;
  3  begin
  4    insert /*+ APPEND */ into t partition (p2)
  5    select 2 x, d.* from dba_objects d;
  6    commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

You’re probably thinking: “That’s a bit silly. If we’re going to commit for each partition, then why wouldn’t we just commit after the load of the first partition P1 anyway?”. That’s a valid point, but what the above example shows is that you can do direct path loads into separate partitions concurrently. This opens up opportunities for (dramatically) increasing the throughput of direct path loads if you can segment the source data into its designated target partitions at load time. I’ll extend the table above to have 10 partitions, and use a little DBMS_JOB code to load all 10 partitions concurrently.


SQL> create table t
  2  partition by list ( x )
  3  ( partition p1 values (1),
  4    partition p2 values (2),
  5    partition p3 values (3),
  6    partition p4 values (4),
  7    partition p5 values (5),
  8    partition p6 values (6),
  9    partition p7 values (7),
 10    partition p8 values (8),
 11    partition p9 values (9),
 12    partition p10 values (10)
 13  )
 14  as select 1 x, d.* from dba_objects d
 15  where 1=0;

Table created.

SQL>
SQL> declare
  2    j int;
  3    l_sql varchar2(200) :=
  4      'begin
  5         insert /*+ APPEND */ into t partition (p@)
  6         select @ x, d.* from dba_objects d;
  7         commit;
  8       end;';
  9  begin
 10    for i in 1 .. 10 loop
 11      dbms_job.submit(j,replace(l_sql,'@',i));
 12    end loop;
 13    
 14  end;
 15  /

PL/SQL procedure successfully completed.

SQL>
SQL> select job, what from user_jobs
  2  where what like '%dba_objects%';

       JOB WHAT
---------- --------------------------------------------------
       161 begin
           insert /*+ APPEND */ into t partition (p1)
           select 1 x, d.* from dba_objects d;
           commit;
           end;

       162 begin
           insert /*+ APPEND */ into t partition (p2)
           select 2 x, d.* from dba_objects d;
           commit;
           end;

       163 begin
           insert /*+ APPEND */ into t partition (p3)
           select 3 x, d.* from dba_objects d;
           commit;
           end;

       164 begin
           insert /*+ APPEND */ into t partition (p4)
           select 4 x, d.* from dba_objects d;
           commit;
           end;

       165 begin
           insert /*+ APPEND */ into t partition (p5)
           select 5 x, d.* from dba_objects d;
           commit;
           end;

       166 begin
           insert /*+ APPEND */ into t partition (p6)
           select 6 x, d.* from dba_objects d;
           commit;
           end;

       167 begin
           insert /*+ APPEND */ into t partition (p7)
           select 7 x, d.* from dba_objects d;
           commit;
           end;

       168 begin
           insert /*+ APPEND */ into t partition (p8)
           select 8 x, d.* from dba_objects d;
           commit;
           end;

       169 begin
           insert /*+ APPEND */ into t partition (p9)
           select 9 x, d.* from dba_objects d;
           commit;
           end;

       170 begin
           insert /*+ APPEND */ into t partition (p10)
           select 10 x, d.* from dba_objects d;
           commit;
           end;


10 rows selected.

SQL>
SQL> commit;

Commit complete.

SQL> select subobject_name, cnt
  2  from (
  3    select dbms_rowid.rowid_object(rowid) obj, count(*) cnt
  4    from   t
  5    group by dbms_rowid.rowid_object(rowid)
  6    ), user_objects
  7  where obj = data_object_id
  8  order by to_number(substr(subobject_name,2));

SUBOBJECT_        CNT
---------- ----------
P1              82427
P2              82427
P3              82427
P4              82427
P5              82427
P6              82427
P7              82427
P8              82427
P9              82427
P10             82427

And voila! 10 partitions all loaded in direct mode concurrently! There is also a standard means of loading data concurrently using parallel DML, but if I know the segmentation of the data in advance, the “manual” method above can sometimes be a quicker and easier-to-debug option.
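
For completeness, here is a sketch of that parallel DML alternative, letting the database distribute the work itself rather than hand-carving it into jobs:

alter session enable parallel dml;

insert /*+ APPEND PARALLEL(t, 10) */ into t
select mod(rownum, 10) + 1 x, d.*
from   dba_objects d;

commit;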