In a previous post I covered a technique to improve the performance of UTL_FILE, but concluded the post with a teaser: “you probably don’t need to use UTL_FILE ever again”.
Time for me to back that statement up with some concrete evidence.
UTL_FILE can read and write files. This blog post will cover the writing functionality of UTL_FILE and why I think you probably don’t need UTL_FILE for this. I’ll come back to UTL_FILE to read files in a future post.
There are two possibilities when it comes to writing a file:
- The file is being requested by a client program and the results should be stored on a client machine. In that case, you do not use UTL_FILE anyway – you use the tools available on the client, for example, SQL Developer, SQLcl, SQL*Plus or a myriad of other tools, OR
- the file is to be written by the database server to a location on the database server itself, or to a location that is accessible to the database server. This is where UTL_FILE has been used, the main motivations being that it can be part of existing database-resident PL/SQL code, and does not require clients to have direct SQL or OS level access to the database server.
So here is my assertion – most of the time, UTL_FILE is used to write out a file because we can't make use of an easier tool like those mentioned above directly on the database server. After all, who wants to write code like:
SQL> declare
2 f utl_file.file_type;
3 line varchar2(255);
4 s timestamp;
5 begin
6 f := utl_file.fopen('TMP','demo.dat','w');
7 s := systimestamp;
8 for i in ( select * from t )
9 loop
10 line := i.object_id||','||i.object_name||','||i.object_type||','||i.data_object_id;
11 utl_file.put_line(f,line);
12 end loop;
13 dbms_output.put_line(systimestamp-s);
14 utl_file.fclose_all;
15 end;
16 /
PL/SQL procedure successfully completed.
when most of the time, if I were doing this in a client tool, it would simply be:
SQL> set markup csv on
SQL> spool c:\tmp\demo.dat
SQL> select * from t;
SQL> spool off
But in recent versions of the database, you can do exactly this! The database scheduler has been enhanced to be able to run SQL*Plus style scripts directly out of the database, without needing to give OS access to the database server or to the SQL*Plus executable. In the previous post I unloaded 40 million rows to a file in CSV format, and the responsibility for formatting the data into CSV fell to me – I had to do all of the "heavy lifting". Using the scheduler and the SQL_SCRIPT job type, it is as simple as writing a SQL*Plus script and submitting it as a job.
SQL> declare
2 l_script VARCHAR2(32767) :=
3 'conn /@db
4 set markup csv on
5 set arraysize 500
6 set pages 0
7 set lines 200
8 set trimspool on
9 spool c:\tmp\demo_sched.dat
10 select * from t;
11 spool off';
12 begin
13 dbms_scheduler.create_job(
14 job_name => 'UNLOAD_DATA',
15 job_type => 'SQL_SCRIPT',
16 job_action => l_script,
17 credential_name => 'MY_ACCOUNT',
18 enabled => true
19 );
20 end;
21 /
PL/SQL procedure successfully completed.
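Because the script runs asynchronously as a scheduler job, you will not see the SQL*Plus output at your own prompt. As a sketch (assuming the UNLOAD_DATA job name from above, and a release recent enough that the run-details view captures script output), you could check the outcome with something like:

```sql
-- Check the outcome of the SQL_SCRIPT job. In recent releases the
-- run-details view also captures the SQL*Plus output of the script.
select status, error#, output
from   user_scheduler_job_run_details
where  job_name = 'UNLOAD_DATA'
order  by log_date desc;
```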
You will want to ensure that you have some controls over the usage of this feature, and over what credentials the scripts run under. Also, in my example I have a connection wallet set up so that I do not have to code any passwords into the connection string for my scheduler job. But suddenly it has become easy to take the scripting tools we are used to on our own client machines and utilize them on the database server.
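For reference, a credential like the MY_ACCOUNT used above can be created with the DBMS_CREDENTIAL package. The username and password here are placeholders – pick an OS account with only the privileges the job actually needs:

```sql
-- Create an OS credential for scheduler jobs to run under.
-- 'batch_user' and its password are placeholders for illustration.
begin
  dbms_credential.create_credential(
    credential_name => 'MY_ACCOUNT',
    username        => 'batch_user',
    password        => '********');
end;
/
```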
“Slow down there cowboy…” I hear you exclaim. “…That might be fine for simple SELECT * scripts, but my UTL_FILE procedures have a lot of complex logic to construct the file data”.
Even if you are using UTL_FILE because you are performing some complicated algorithms to generate the data that will be written to file, you can still utilize this scheduler facility. Once your complicated logic has run, you will typically end up with a line of data that needs to be written to a file. And if you have a line of data, then it is trivial to port that procedure to a pipelined function. And once you have a pipelined function, you have a simple query mechanism that can be used to spool the output. For example, if my original procedure is:
SQL> create or replace
2 procedure my_procedure is
3 begin
4 for each_row in ( [some set] )
5 loop
6 ...
7 ... my complex logic
8 ...
9
10 ...
11 ... writing each line to file with UTL_FILE
12 ...
13 end loop;
14 end;
15 /
then we change none of the logic – we need only replace all of that messy UTL_FILE code with a simple PIPE ROW call, which lets us query the function as if it were a source of rows, and then spool it to a file in the usual way.
SQL> create or replace
2 function my_function return sys.odcivarchar2list pipelined is
3 begin
4 for each_row in ( [some set] )
5 loop
6 ...
7 ... my complex logic
8 ...
9
10 ...
11 ... pipe row ( the_line );
12 ...
13 end loop;
14 return;
15 end;
16 /
SQL> spool c:\tmp\demo.dat
SQL> select * from my_function();
SQL> spool off
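To make the skeleton above concrete, here is a minimal, self-contained sketch of such a pipelined function. It simply formats rows from the same table T used earlier into CSV lines – the loop body is where your complex logic would go, and the function name is my own invention:

```sql
create or replace
function t_as_csv return sys.odcivarchar2list pipelined is
begin
  for each_row in ( select * from t )
  loop
    -- Your complex per-row logic goes here. The only change from
    -- the UTL_FILE version is that the finished line is piped back
    -- to the caller instead of being written to a file.
    pipe row ( each_row.object_id  ||','||
               each_row.object_name||','||
               each_row.object_type||','||
               each_row.data_object_id );
  end loop;
  return;
end;
/
```

A plain `spool` of `select * from t_as_csv()` then produces the same file the UTL_FILE version did, with the formatting and I/O handled by the client tool.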
So next time you’re cranking out some cumbersome UTL_FILE code to write a file, take a moment to see if the scheduler can look after some of the work for you. I’ll talk about (not) using UTL_FILE to read files in a future post.
Got some thoughts? Leave a comment