Why you might add columns to external tables

Let me start with the idiotic part first. That part would be… me! 🙂

I’ll create an external table which reads some customer sales data from a flat file.
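For reference, the file is just plain comma-separated data. The exact contents of s2017.dat are an assumption on my part (reconstructed from the query output further down), but each line would look something like this:

```text
CUST_A,1001,100,20-JAN-2017
CUST_B,1001,110,20-JAN-2017
CUST_C,1001,200,21-JAN-2017
CUST_D,1000,108,22-JAN-2017
CUST_E,1002,10,24-JAN-2017
```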


SQL> create table sales_ext
  2        (
  3        cust_id varchar2(10) not null,
  4        product_id int not null,
  5        amt     number,
  6        dte date
  7        )
  8  organization external (
  9    type oracle_loader
 10    default directory temp
 11    access parameters (
 12      records delimited by newline
 13      fields terminated by ','
 14      (
 15                      cust_id,
 16                      product_id,
 17                      amt,
 18                      dte DATE MASK "DD-MON-YYYY"
 19              )
 20    )
 21    location ('s2017.dat')
 22  );

Table created.

SQL>
SQL> select * from sales_ext where rownum <= 10;

CUST_ID    PRODUCT_ID        AMT DTE
---------- ---------- ---------- ---------
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17

10 rows selected.

So far so good. Now what happens if I try to add a column to that table? Interestingly enough, the database has no concerns with me doing so. It will happily update the dictionary definition:


SQL> alter table sales_ext add newcol int;

Table altered.

That presents somewhat of a contradiction, though. An external table is based on a flat file structure, so it seems to make no sense to add columns to such a table. After all, I can’t do any DML on the table anyway. Adding virtual columns would certainly be a sensible option, but a normal column? There doesn’t seem to be a use case here. And unsurprisingly, having added the column, the database gets most confused when I try to query my external table:


SQL> select * from sales_ext where rownum <= 10;
select * from sales_ext where rownum <= 10
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04043: table column not found in external source: NEWCOL

Clearly I need to bring the dictionary column definitions into alignment with the mapping to the external data source (or the “SQL*Loader part” of the table definition, as I like to think of it). The ACCESS PARAMETERS need to indicate how to access this new column.


SQL> alter table sales_ext
  2  access parameters (
  3      records delimited by newline
  4      fields terminated by ','
  5      (
  6                      cust_id,
  7                      product_id,
  8                      amt,
  9                      dte DATE MASK "DD-MON-YYYY",
 10                      newcol
 11              )
 12    );

Table altered.

But even that is not enough to move forward. Whilst I now have a new dictionary column and a definition of how to access that column from the external file, the data is obviously not present in the existing file, so I am going to hit problems again when the database discovers it is missing information in the file.


SQL>
SQL> select * from sales_ext where rownum <= 10;
select * from sales_ext where rownum <= 10
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached

To work around this, I need to amend my external table definition to let the database know that if it cannot find information for a column in the external file, it can treat that column as null.


SQL> alter table sales_ext
  2  access parameters (
  3      records delimited by newline
  4      fields terminated by ','
  5      MISSING FIELD VALUES ARE NULL
  6      (
  7                      cust_id,
  8                      product_id,
  9                      amt,
 10                      dte DATE MASK "DD-MON-YYYY",
 11                      newcol
 12              )
 13    );

Table altered.

And finally I can query my external table, resplendent with its new column, which of course will always be null.


SQL> select * from sales_ext where rownum <= 10;

CUST_ID    PRODUCT_ID        AMT DTE           NEWCOL
---------- ---------- ---------- --------- ----------
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17

10 rows selected.

A waste of time?

This entire exercise seems ridiculous: if I have a new file format, and I need to change both the dictionary definition and the access parameters, then why not just recreate the whole table? That seems an easier prospect, and the concept of adding columns to an external table just seems idiotic.

But bear with me… clarity is within our grasp 🙂

What if your table is not just an external table? What if your table is not just a database table? It might be both.

In 19c, we have the hybrid partitioned table. This is a table with both external and “internal” (aka dictionary) partitions. Now the ability to add columns makes a good deal of sense, because you may need to alter the definition of the dictionary-based components of the table, yet still allow it to seamlessly integrate with the external partitions. Here’s the same example from above, but with a hybrid partitioned table.


SQL> create table hybrid_sales
  2        (
  3        cust_id varchar2(10) not null,
  4        product_id int not null,
  5        amt     number,
  6        dte date
  7        )
  8          external partition attributes (
  9            type oracle_loader
 10            default directory temp
 11            ACCESS PARAMETERS (
 12                FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
 13                MISSING FIELD VALUES ARE NULL (
 14                  cust_id,
 15                  product_id,
 16                  amt,
 17                  dte DATE MASK "DD-MON-YYYY"
 18              )
 19           )
 20        )
 21         partition by range (dte)
 22         (partition sales_2017 values less than (date '2018-01-01') external location ('s2017.dat'),
 23          partition sales_2018 values less than (date '2019-01-01')
 24      );

Table created.

SQL>
SQL> select * from hybrid_sales where rownum <= 10;

CUST_ID    PRODUCT_ID        AMT DTE
---------- ---------- ---------- ---------
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17

10 rows selected.

SQL>
SQL> alter table hybrid_sales add newcol int;

Table altered.

SQL>
SQL> alter table hybrid_sales
  2     ACCESS PARAMETERS (
  3                FIELDS CSV WITH EMBEDDED TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  4                MISSING FIELD VALUES ARE NULL (
  5                  cust_id,
  6                  product_id,
  7                  amt,
  8                  dte DATE MASK "DD-MON-YYYY",
  9                  newcol
 10              ));

Table altered.

SQL>
SQL> select * from hybrid_sales where rownum <= 10;

CUST_ID    PRODUCT_ID        AMT DTE           NEWCOL
---------- ---------- ---------- --------- ----------
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17
CUST_A           1001        100 20-JAN-17
CUST_B           1001        110 20-JAN-17
CUST_C           1001        200 21-JAN-17
CUST_D           1000        108 22-JAN-17
CUST_E           1002         10 24-JAN-17

10 rows selected.

SQL>

It seems no different from an external table, but this new column can be referenced, populated and changed like any other standard database column, without compromising our ability to integrate the entire table with existing flat file data. That’s pretty cool.


SQL> insert into HYBRID_SALES
  2  values (10,10,10,date '2018-06-01',999);

1 row created.

SQL> select * from HYBRID_SALES
  2  where NEWCOL is not null;

CUST_ID    PRODUCT_ID        AMT DTE           NEWCOL
---------- ---------- ---------- --------- ----------
10                 10         10 01-JUN-18        999

SQL> update HYBRID_SALES
  2  set NEWCOL = 900
  3  where NEWCOL = 999;

1 row updated.
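One caveat worth noting (a sketch of expected behaviour, not a transcript from my session): the insert and update above only succeeded because the row maps to the internal SALES_2018 partition. The external partitions of a hybrid partitioned table remain read-only, so a row destined for SALES_2017 would be rejected:

```sql
SQL> -- this row maps to the external partition SALES_2017,
SQL> -- so the insert is expected to fail: DML is only permitted
SQL> -- on the internal partitions of a hybrid partitioned table
SQL> insert into HYBRID_SALES
  2  values ('CUST_F',1003,50,date '2017-03-01',null);
```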
