The death of UTL_FILE – part 2 – Learning is not a spectator sport

I wrote a post a while back call “The Death of UTL_FILE”, and probably because of it’s click-bait title I got lots of feedback, so I’m back to flog that horse Smile . Seriously though, I stand behind my assertion in that post, that the majority of usages of UTL_FILE I’ve seen my career are mimicking the spooling behaviour of a SQL*Plus script. And as that post pointed out, you can now achieve that functionality directly with the scheduler.

That is well and good for writing files from the database, and I added:

“I’ll come back to UTL_FILE to read files in a future post”.

That time has come! Many developers will be aware of the feature known as external tables but a criticism of those has been that whereas UTL_FILE only needs permissions on the appropriate directory object to read a file, an external table requires a DDL definition, which typically means a lot of red tape and processes to follow when trying to implement changes.

Some of that pain was alleviated in 12c Release 2 when it became possible to modify portions of the external table definition on the fly at query execution time. For example, if you needed to change the name of the file name each day, then rather than having to re-craft an external table DDL, or even an ‘alter’ command, it could be done as required by the query:


SQL> select * from ext_emp
  2    external modify ( location ('emp20161002.dat') );

     EMPNO ENAME      JOB              MGR HIREDATE         SAL
---------- ---------- --------- ---------- --------- ---------- ---
      7902 FORD       ANALYST         7566 03-DEC-81       3000
      7934 MILLER     CLERK           7782 23-JAN-82       1300        
      7566 JONES      MANAGER         7839 02-APR-81       2975

That still of course does not alleviate the need for the initial table DDL. But in 18c, you can query an external file and the only thing you need to do so is appropriate privileges on the directory object. Sounds a lot like UTL_FILE, no? In 18c, we can query an external source directly with query. Here’s a simple example using the following data file (emp.dat) as a source:


7369,SMITH,CLERK,7902,800,,20
7499,ALLEN,SALESMAN,7698,1600,300,30
7521,WARD,SALESMAN,7698,1250,500,30
7566,JONES,MANAGER,7839,2975,,20
7654,MARTIN,SALESMAN,7698,1250,1400,30
7698,BLAKE,MANAGER,7839,2850,,30
7782,CLARK,MANAGER,7839,2450,,10
7788,SCOTT,ANALYST,7566,3000,,20
7839,KING,PRESIDENT,,5000,,10
7844,TURNER,SALESMAN,7698,1500,,30
7876,ADAMS,CLERK,7788,1100,,20
7900,JAMES,CLERK,7698,950,,30
7902,FORD,ANALYST,7566,3000,,20
7934,MILLER,CLERK,7782,1300,,10

The query (in this case) just requires the EXTERNAL modifier and a definition of the columns and we’re done!


SQL> select * from
  2  external (
  3     (
  4       empno     number(4),
  5       ename     varchar2(12),
  6       job       varchar2(12),
  7       mgr       number(4),
  8       sal       number(7,2),
  9       comm      number(7,2),
 10       deptno    number(2)
 11     )
 12  type oracle_loader
 13  default directory TMP
 14  location ('emp.dat')
 15  );

     EMPNO ENAME        JOB                 MGR        SAL       COMM     DEPTNO
---------- ------------ ------------ ---------- ---------- ---------- ----------
      7369 SMITH        CLERK              7902        800                    20
      7499 ALLEN        SALESMAN           7698       1600        300         30
      7521 WARD         SALESMAN           7698       1250        500         30
      7566 JONES        MANAGER            7839       2975                    20
      7654 MARTIN       SALESMAN           7698       1250       1400         30
      7698 BLAKE        MANAGER            7839       2850                    30
      7782 CLARK        MANAGER            7839       2450                    10
      7788 SCOTT        ANALYST            7566       3000                    20
      7839 KING         PRESIDENT                     5000                    10
      7844 TURNER       SALESMAN           7698       1500                    30
      7876 ADAMS        CLERK              7788       1100                    20
      7900 JAMES        CLERK              7698        950                    30
      7902 FORD         ANALYST            7566       3000                    20
      7934 MILLER       CLERK              7782       1300                    10

14 rows selected.

Of course, UTL_FILE is more about totally flexibility in the content your read. After all, you (typically) read a line from the file and then parse it however you please. But the same is possible with the 18c syntax. Here I’ll take some free format text from the Oracle documentation in a file plain.dat.


9.5 System Triggers
A system trigger is created on either a schema or the database.
Its triggering event is composed of either DDL statements (listed in "ddl_event") or database operation statements (listed in "database_event").
A system trigger fires at exactly one of these timing points:
Before the triggering statement runs
(The trigger is called a BEFORE statement trigger or statement-level BEFORE trigger.)
After the triggering statement runs
(The trigger is called a AFTER statement trigger or statement-level AFTER trigger.)
Instead of the triggering CREATE statement
(The trigger is called an INSTEAD OF CREATE trigger.)
Topics
SCHEMA Triggers
DATABASE Triggers
INSTEAD OF CREATE Triggers
9.5.1 SCHEMA Triggers
A SCHEMA trigger is created on a schema and fires whenever the user who owns it is the current user and initiates the triggering event.
Suppose that both user1 and user2 own schema triggers, and user1 invokes a DR unit owned by user2. Inside the DR unit, user2 is the current user. Therefore, if the DR unit initiates the triggering event of a schema trigger that user2 owns, then that trigger fires. However, if the DR unit initiates the triggering event of a schema trigger that user1 owns, then that trigger does not fire.
Example 9-19 creates a BEFORE statement trigger on the sample schema HR. When a user connected as HR tries to drop a database object, the database fires the trigger before dropping the object.

To mimic the line-by-line read style of UTL_FILE, I can simple define a single column holding a line from the data file.


SQL> select * from
  2  external (
  3     (
  4       text varchar2(4000)
  5     )
  6  type oracle_loader
  7  default directory TMP
  8  location ('plain.dat')
  9  );

TEXT
------------------------------------------------------------
9.5 System Triggers
A system trigger is created on either a schema or the databa
se.

Its triggering event is composed of either DDL statements (l
isted in "ddl_event") or database operation statements (list
ed in "database_event").

A system trigger fires at exactly one of these timing points
:

Before the triggering statement runs
(The trigger is called a BEFORE statement trigger or stateme
nt-level BEFORE trigger.)

After the triggering statement runs
(The trigger is called a AFTER statement trigger or statemen
t-level AFTER trigger.)

Instead of the triggering CREATE statement
(The trigger is called an INSTEAD OF CREATE trigger.)
Topics
SCHEMA Triggers
DATABASE Triggers
INSTEAD OF CREATE Triggers
9.5.1 SCHEMA Triggers
A SCHEMA trigger is created on a schema and fires whenever t
he user who owns it is the current user and initiates the tr
iggering event.

Suppose that both user1 and user2 own schema triggers
Example 9-19 creates a BEFORE statement trigger on the sampl
e schema HR. When a user connected as HR tries to drop a dat
abase object


18 rows selected.

So there is plenty of potential there to reduce the amount of source code to maintain using this nice syntax. The line between database data and external data gets blurrier, which is a great convenience for developers and ad-hoc users in being able to bring all the power of SQL to more data sources.

Footnote: When you query an external data source in 18c in the way mentioned above, behind the scenes, the database may create global temporary tables to support the operation. In the first example above, the database performed the following DDL recursively:


CREATE GLOBAL TEMPORARY TABLE "SCOTT"."SYS_TEMP_0FD9D678B_BDA818A8" SHARING=NONE  
("EMPNO" NUMBER(4),
 "ENAME" VARCHAR2(12) COLLATE "USING_NLS_COMP" ,
 "JOB" VARCHAR2(12) COLLATE "USING_NLS_COMP" ,
 "MGR" NUMBER(4),"SAL" NUMBER(7,2),
 "COMM" NUMBER(7,2),
 "DEPTNO" NUMBER(2) 
) 
IN_MEMORY_METADATA CURSOR_SPECIFIC_SEGMENT INLINE_XT  
STORAGE (OBJNO 4254951307 ) NOPARALLEL

That should not be cause for concern, but if you have hundreds of differently structured external data sources, and/or triggers that fire on DDL operations on the database, then you might see some impact. As with all things, thorough testing should always precede production implementation.

2 responses to “The death of UTL_FILE – part 2”

Ricardo Oberdan

March 13, 2019 at 7:08 pm

It’s wise use external tables to read a flat file. I use UTL_FILE to read gigabytes flat text files, make some business transformations and load into tables. It will be easier to code and maintain if I can use only SELECT * FROM “flat_file”. My question is about performance. Use external tables and selecting millions of records will be faster than use UTL_FILE?

Jonathan Taylor

April 13, 2019 at 12:42 am

BFILENAME() is another choice, returns the file a BFILE lob that can be converted to BLOB or CLOB.
Especially useful if you aren’t reading the file one line at a time (e.g. binary files, XML etc).