Just a quick heads up on the importance of testing thoroughly, especially when exploring niche combinations of operating conditions in the database (or any software, for that matter). Here’s one that came up recently when a direct path SQL*Loader load was combined with In-Database Archiving.

First I’ll create an empty DEPT table in my schema.


SQL> create table dept as select * from scott.dept where 1=0;
 
Table created.

Then I’ll load some data into it with SQL*Loader using the following control file. Note that we are using a direct path load here (DIRECT=TRUE).


OPTIONS (DIRECT=TRUE)
LOAD DATA
INFILE *
INTO TABLE dept
APPEND
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(deptno, dname, loc)
BEGINDATA
12,RESEARCH,"SARATOGA"
10,"ACCOUNTING",CLEVELAND
11,"ART",SALEM
13,FINANCE,"BOSTON"
21,"SALES",PHILA.
22,"SALES",ROCHESTER
42,"INT'L","SAN FRAN"
 
D:\ >sqlldr demo/demo control=dept.ctl
 
SQL*Loader: Release 19.0.0.0.0 - Production on Thu May 23 12:02:22 2024
Version 19.22.0.0.0
 
Copyright (c) 1982, 2024, Oracle and/or its affiliates.  All rights reserved.
 
Path used:      Direct
 
Load completed - logical record count 7.
 
Table DEPT:
  7 Rows successfully loaded.
 
Check the log file:
  dept.log
for more information about the load.

Unsurprisingly, 7 rows were loaded and I can easily query those rows from the database.


SQL> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        12 RESEARCH       SARATOGA
        10 ACCOUNTING     CLEVELAND
        11 ART            SALEM
        13 FINANCE        BOSTON
        21 SALES          PHILA.
        22 SALES          ROCHESTER
        42 INT'L          SAN FRAN

7 rows selected.

I’ll now truncate the table to bring it back to its initial state, and also enable In-Database Archiving for this table.


SQL> truncate table dept;

Table truncated.

SQL> alter table dept row archival;
 
Table altered.

I’ll re-run my SQL*Loader load and everything still looks fine.


D:\ >sqlldr demo/demo control=dept.ctl
 
SQL*Loader: Release 19.0.0.0.0 - Production on Thu May 23 12:02:22 2024
Version 19.22.0.0.0
 
Copyright (c) 1982, 2024, Oracle and/or its affiliates.  All rights reserved.
 
Path used:      Direct
 
Load completed - logical record count 7.
 
Table DEPT:
  7 Rows successfully loaded.
 
Check the log file:
  dept.log
for more information about the load.

But I then get a nasty shock when I go to query that table in the database.


SQL> select * From dept;
 
no rows selected

My rows are gone!

In fact, the first clue that my rows might not actually be gone comes from the optimizer statistics.


SQL> exec dbms_stats.gather_table_stats('','DEPT')

PL/SQL procedure successfully completed.

SQL> select num_rows from user_tables
  2  where table_name = 'DEPT';

  NUM_ROWS
----------
         7

Even though I cannot see the rows, the database seems to think they are there. If I alter my session to flip the row archival visibility setting to see all rows, then lo and behold, my rows magically re-appear.


SQL> alter session set row archival visibility = all;

Session altered.

SQL> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        12 RESEARCH       SARATOGA
        10 ACCOUNTING     CLEVELAND
        11 ART            SALEM
        13 FINANCE        BOSTON
        21 SALES          PHILA.
        22 SALES          ROCHESTER
        42 INT'L          SAN FRAN

7 rows selected.

The cause here lies in how In-Database Archiving is implemented. Taking a look at USER_TAB_COLS, you can see that some hidden columns are added to support the feature, in particular the ORA_ARCHIVE_STATE column.


SQL> select column_name
  2  from   user_tab_cols
  3  where  table_name = 'DEPT';

COLUMN_NAME
------------------------------
DEPTNO
DNAME
LOC
SYS_NC00004$
ORA_ARCHIVE_STATE

As per the documentation, when a table has In-Database Archiving enabled, an “active” row has an ORA_ARCHIVE_STATE value of 0 (in fact, the varchar2 value ‘0’), and any other value indicates that the row has been archived and hence should not be visible by default. Most typically, people set the column value to ‘1’ to mark a row as archived (or logically deleted).
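Under normal circumstances (i.e. with rows that were inserted conventionally), archiving a row is simply a matter of flipping that column yourself. A minimal sketch against a DEPT table like the one above (department 42 is an arbitrary pick for illustration):


-- logically delete (archive) department 42 by setting the hidden column to '1'
update dept
set    ora_archive_state = '1'
where  deptno = 42;

-- with the default session setting (row archival visibility = active),
-- this query no longer returns department 42; with visibility = all it still does
select * from dept;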

The issue here is that a SQL*Loader direct path load does not populate this hidden column, and thus its value is left null during the load.


SQL> select deptno, ora_archive_state
  2  from   dept;

    DEPTNO ORA_ARCHIVE_STATE
---------- --------------------
        12
        10
        11
        13
        21
        22
        42

7 rows selected.

Because the value is not ‘0’, the rows are deemed archived and hence are not visible unless we explicitly set the session row archival visibility to ALL, as we did above. To correct the data, I can update the loaded rows to have the correct value for ORA_ARCHIVE_STATE


SQL> update dept
  2  set ora_archive_state = '0';

7 rows updated.

after which they can now be seen as expected.


SQL> alter session set row archival visibility = active;

Session altered.

SQL> select deptno, ora_archive_state
  2  from   dept;

    DEPTNO ORA_ARCHIVE_STATE
---------- --------------------
        12 0
        10 0
        11 0
        13 0
        21 0
        22 0
        42 0

7 rows selected.

Of course, having to update the rows defeats the purpose of efficiently loading them in direct path mode, so you’ll be pleased to know that this bug in SQL*Loader is being addressed. It is fixed in the latest version of Oracle Database, and the fix will probably make its way to other versions via the standard Release Update (RU) mechanism. In the interim, if you are using SQL*Loader with In-Database Archiving, I recommend you stick with conventional path loading until you have the fix.
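In practical terms, that’s a one-line change to the control file: set DIRECT=FALSE, or simply drop the OPTIONS clause, since conventional path is the default. A sketch of the same control file as above (the BEGINDATA section is unchanged, so I’ve left it out):


OPTIONS (DIRECT=FALSE)
LOAD DATA
INFILE *
INTO TABLE dept
APPEND
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(deptno, dname, loc)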

But there is an even better solution, and that is to switch to external tables: more power, more flexibility, and no issues with In-Database Archiving.


SQL> create table dept as select * from scott.dept where 1=0;

Table created.

SQL>
SQL> alter table dept row archival;

Table altered.

SQL>
SQL> insert /*+ APPEND */ into dept
  2  select *
  3  from   external (
  4        (deptno int, dname varchar2(30), loc varchar2(30))
  5        type oracle_loader
  6        default directory ctmp
  7        access parameters
  8        ( records delimited by newline
  9          fields terminated by ',' optionally enclosed by '"'
 10          (deptno , dname , loc )
 11         )
 12         location ( 'dept.dat' )
 13  reject limit unlimited ) ext;

7 rows created.

SQL> commit;

Commit complete.

SQL> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        12 RESEARCH       SARATOGA
        10 ACCOUNTING     CLEVELAND
        11 ART            SALEM
        13 FINANCE        BOSTON
        21 SALES          PHILA.
        22 SALES          ROCHESTER
        42 INT'L          SAN FRAN

7 rows selected.

4 responses to “SQL*Loader rows vanish!”

  1. Shouldn’t this hidden column become a Boolean in 23ai? 😉

    Also, Flashback Time Travel (née Flashback Data Archive) is, to me, a better solution for this kind of ‘we want to leave old records in a table but not have them visible by default’ functionality.

    If you are using Flashback Time Travel, you’d just delete the ‘data you are archiving’ and it would no longer be in the table, but you’d be able to see it in the Flashback for as long as it was retained.

    Is there any advantage to row archival over Flashback Time Travel?

    1. > Shouldn’t this hidden column become a Boolean in 23ai

      No. *Common* values are 1 and 0, but you can use anything you like.

      > Is there any advantage to row archival over Flashback Time Travel?

      You can partition by this column, which gives you a little more flexibility, but they are not mutually exclusive options.

      1. From a business perspective it is a Boolean (either the row is archived or it isn’t), so I’d argue that the data type should be Boolean.

        But I do realize there is much in the Oracle data dictionary that is there for historical reasons and, for some reason, the dictionary team seems to often use the ‘data type that just happens to work with the code that they wrote’ instead of the ‘data type that matches the business requirements’.

        1. My point was that by using different values you could come up with a more flexible partitioning mechanism, e.g. 0, 2021, 2023, 2024 and so on (sketched below).
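          For what it’s worth, a rough sketch of that idea (the table and partition names are made up, and I’m assuming a release that allows list partitioning on ORA_ARCHIVE_STATE):

          -- each "archive year" gets its own partition; '0' holds the active rows
          create table dept_part
          ( deptno number(2),
            dname  varchar2(14),
            loc    varchar2(13)
          )
          row archival
          partition by list (ora_archive_state)
          ( partition p_active    values ('0'),
            partition p_arch_2021 values ('2021'),
            partition p_arch_2023 values ('2023'),
            partition p_arch_rest values (default)
          );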
