In a previous post I covered a technique to improve the performance of UTL_FILE, but concluded the post with a teaser: “you probably don’t need to use UTL_FILE ever again”.
Time for me to back that statement up with some concrete evidence.
UTL_FILE can read and write files. This blog post will cover the writing functionality of UTL_FILE and why I think you probably don’t need UTL_FILE for this. I’ll come back to UTL_FILE to read files in a future post.
There are two possibilities when it comes to writing a file:
- The file is being requested by a client program and the results should be stored on a client machine. In that case, you do not use UTL_FILE anyway – you use the tools available on the client, for example, SQL Developer, SQLcl, SQL*Plus or a myriad of other tools, OR
- the file is to be written by the database server to a location on the database server itself, or to a location that is accessible to the database server. This is where UTL_FILE has been used, the main motivations being that it can be part of existing database-resident PL/SQL code, and does not require clients to have direct SQL or OS level access to the database server.
So here is my assertion – most of the time, UTL_FILE is used to write out a file because we can’t make use an easier tool like those mentioned above directly on the database server. After all, who wants to write code like:
SQL> declare 2 f utl_file.file_type; 3 line varchar2(255); 4 s timestamp; 5 begin 6 f := utl_file.fopen('TMP','demo.dat','w'); 7 s := systimestamp; 8 for i in ( select * from t ) 9 loop 10 line := i.object_id||','||i.object_name||','||i.object_type||','||i.data_object_id; 11 utl_file.put_line(f,line); 12 end loop; 13 dbms_output.put_line(systimestamp-s); 14 utl_file.fclose_all; 15 end; 16 / PL/SQL procedure successfully completed.
when most of the time if I was doing this in a tool it would be simply:
SQL> set markup csv on SQL> spool c:\tmp\demo.dat SQL> select * from t; SQL> spool off
But in recent versions of the database, you can do exactly this! The database scheduler has been enhanced to be able to to run SQL*Plus style scripts directly out of the database without needing to give OS access to database server, or the SQL*Plus executable. In the previous post I unloaded 40million rows to a file in CSV format and the responsibility for formatting the data into CSV format fell to me – I had to do all of the “heavy lifting”. Using the scheduler and the SQL_SCRIPT job type, it is as simple as writing a SQL*Plus script, and submitting it as a job.
SQL> declare 2 l_script VARCHAR2(32767) := 3 'conn /@db 4 set markup csv on 5 set arraysize 500 6 set pages 0 7 set lines 200 8 set trimspool on 9 spool c:\tmp\demo_sched.dat 10 select * from t; 11 spool off'; 12 begin 13 dbms_scheduler.create_job( 14 job_name => 'UNLOAD_DATA', 15 job_type => 'SQL_SCRIPT', 16 job_action => l_script, 17 credential_name => 'MY_ACCOUNT', 18 enabled => true 19 ); 20 end; 21 / PL/SQL procedure successfully completed.
You will want to ensure that you have some controls over the usage of this feature, and what credentials the scripts will run under. Also in my example, I’ve got a connection wallet setup so that I do not have to code any passwords into the connection string for my scheduler job. But suddenly it has become easy to get access to the scripting tools we are used to on our own client machines, and utilize them on the database server.
“Slow down there cowboy…” I hear you exclaim. “…That might be fine for simple SELECT * scripts, but my UTL_FILE procedures have a lot of complex logic to construct the file data”.
Even if you are using UTL_FILE because you are performing some complicated algorithms to generate the data that will be written to file, you can still utilize this scheduler facility. After your complicated logic is completed, ultimately you typically will have a line of data you need to write to a file. And if you have a line of data, then it is trivial to port that procedure to become a pipelined function. And once you have a pipelined function, then we have a simple query mechanism that can be used to spool the output. For example, if my original procedure is:
SQL> create or replace 2 procedure my_procedure is 3 begin 4 for each_row in ( [some set] ) 5 loop 6 ... 7 ... my complex logic 8 ... 9 10 ... 11 ... writing each line to file with UTL_FILE 12 ... 13 end loop; 14 end; 15 /
then we change none of the logic – we only need replace all of that messy UTL_FILE code with a simple pipe command to allow querying that function as if it was source of rows, and then spool it to a file in the usual way.
SQL> create or replace 2 function my_function return sys.odcivarchar2list pipelined is 3 begin 4 for each_row in ( [some set] ) 5 loop 6 ... 7 ... my complex logic 8 ... 9 10 ... 11 ... pipe row ( the_line ); 12 ... 13 end loop; 14 return; 15 end; 16 / SQL> spool c:\tmp\demo.dat SQL> select * from my_function(); SQL> spool off
So next time you’re cranking out some cumbersome UTL_FILE code to write a file, take a moment to see if the scheduler can look after some of the work for you. I’ll talk about (not) using UTL_FILE to read files in a future post.
I agree, but there is another consideration. Why are people using flat files at all? In many cases these should probably be replaced with web services, which are really easy to do using ORDS. 🙂
I understand some people will be doing large transfers for ETL type processes, and web services aren’t the best for that, but if I think back to most of the companies I’ve worked for and most of the file-based interactions were probably better served by something other than a file load.
Agreed. Even in the generic sense – if you went to all that effort to store and manage and exploit your data by putting in a database…why would you want to dump it back into flat files
Well, some sites need to interface data between non-RDBMS storage and databases.
Flat files are still the best, easiest and fastest method of doing that, particularly when complicated firewall setups are needed…
It’s a lot easier to just dump/read a data file on a utl_file directory that is then shared/sent/received by other specialist tools under closed security protocols that are not necessarily Oracle-specific.
Hello, Connor 🙂
Maybe due to this scenario or circumstance here how to unload table data to csv file fastest way for millions of records and
automate the generation of sql query output to csv.
I suppose that your next post could be about spool with parallel or pl/sql block with dbms_parallel_execute. 🙂
What is your suggested way to write a CLOB/BLOB to disk (on the server) without UTL_FILE?
DBMS_LOB has a LOADFROMFILE procedure, but no corresponding SAVETOFILE…
Currently using UTL_FILE as described here: https://oracle-base.com/articles/9i/export-blob-9i
DBMS_LOB.CLOB2FILE is there in recent releases 🙂
And what about BLOBs then?
And BTW, what’s up with the naming conventions here? There’s already LOADFROMFILE, LOADCLOBFROMFILE and LOADBLOBFROMFILE. Then someone decides to add a procedure to save a file, and it’s not called “SAVECLOBTOFILE” or something along those lines, no, it’s called “CLOB2FILE” (with a “2” instead of “TO” and no mention of “SAVE” which would be the natural counterpart to the existing procedures)… Sorry to be nitpicking, but you know, naming stuff properly is kinda important when it comes to programming…
Totally agree with Morten. UTL_FILE is alive because it has use cases which Connor didn’t mention. And storing binary files like images, videos and so on outside of the database is one of the most popular. I mean, you could store them as BLOBs in your database, but why? It increases the size of the databases dramatically, hence increases time of creation and checking the backups, size of the backups themselves and so on. And if we remember about XE (which is very limited by size), then UTL_FILE gets even more usability.
Actually, to my mind Connor just mention ‘dead approaches’, or, at least, outdated, to load and unload flat data in the files on the server. In that, I agree, but do not agree UTL_FILE is dead.
I also totally agree with Morten about naming conventions in DBMS_LOB. Seriously, LOADCLOBFROMFILE and LOADBLOBFROMFILE versus CLOB2FILE? Looks like the guy who implemented CLOB2FILE didn’t even look what subprograms are there already in the package. And yes, where’s the BLOB writing function after all?
With respect to: “but do not agree UTL_FILE is dead.” , well, of course it isn’t, but a blog post titled “UTL_FILE has other options” doesn’t really have the same theatrics 🙂
Just a quick heads-up. LOADFROMFILE is deprecated. It’s replaced by LOADCLOBFROMFILE and LOADBLOBFROMFILE. 🙂
Also on the topic of UTL_FILE, it has always been a big annoyance that this package lacks the ability to return a list of files for a given directory, as explained here: https://community.oracle.com/ideas/13420
UTL_FILE can read and write files, rename files, move files, check if files exist… but not give a directory listing! And it has been like that for 15-20 years… 😦
Like Tim, you could say “web service all the things”, but I think not having a complete set of file manipulation methods “natively” in PL/SQL is a weakness compared to most other programming languages (and something that should be really easy to fix by Oracle).
Agreed – but directory listing and the like are pretty much “solved” problems with external tables and scheduler scripts. But I concede – a consistent API would be nicer
Respectfully disagree that this is a “solved problem”… at best, there are kludges or workarounds, but the only proper solution would be to add it to the existing API to make it feature-complete. It’s not like this (directory listing) is some very cutting-edge technological problem that needs a few releases to “get right”, either… (as in, “here, use these workarounds with external tables or whatever until we have this stuff figured out…”). I know, there are always competing priorities, but this is quite basic stuff, and Oracle is a big company, and the rest of the API has been around for a very, very long time…
Sorry for the rant, I’m just complaining about this because I want to improve PL/SQL… ! 🙂
Bravo, Morten! Could not say it better 🙂
We are here, critisizing PL/SQL, because we care. Because we already convinced it is a great tool. But we are not the only ones who face this annoying stuff from time to time.
There is NO chance to stay at UTL_FILE as it is DESUPPORTED starting with database Version 18c!!!
That is NOT the case at all! The use of hard-coded path names (and hence utl_file_dir) are what is desupported. Here is my video on it https://www.youtube.com/watch?v=spjye4_SgOw
There exist some examples of obtaining directory listings using package DBMS_BACKUP_RESTORE,
which is also “not exactly meant for the common developer …”, this is why it is probably still undocumented,
even in the last versions.
An example can be found here: https://www.morganslibrary.org/hci/hci002.html
Once there was another example here:
but for some reason, this web page seems not to be available by now.
I definitely agree with all of you that an “open” API should be added to UTL_FILE for this purpose.
Thanks a lot & Best Regards,
You said you would write a post on “I’ll talk about (not) using UTL_FILE to read files in a future post.”
But I don’t see one. Can you please let me know how can I replace the UTL_FILE operation with schedulers?
Is your method running script on c\ drive from your local machine or on the dB server?
The file is written on the database server, so it is the same behaviour as UTL_FILE.
I agree with Denis, that the way UTL_FILE is being used looks more and more outdated. Why every time new interface with the vendor in the form of a flat file is requested – new development is needed. We have developed an API which requires simple configuration file with record layout
,DB View and fields formatting – just few lines. No coding is needed. This is much more flexible and easy. Looks more like a human task and not some trivial rudimentary effort.
I was using UTL_FILE to capture a log from a PL/SQL procedure which does quite a bit of DML. Therefore, I presumably cannot SELECT * FROM TABLE(MyProc()) – since DML would be happening during a select.
You could always add “pragma autonomous_transaction” to work around that, but this is perhaps pushing things a little far 🙂
While learning to use utl_file recently, I ran into a buffer limitation of 32k… That’s a “k”!!! In a 64-bit system!!! This amazed me. Maybe I’m off, but it is hard for me to believe that with 21st century, state-of-the-art Oracle products, we should expect to use code written with a 1990s mindset — at least, that’s how it feels in this case. 😉
32k is a line length limitation for ascii text files. Use fopen with ‘wb’ if you want to avoid that limitation
I agree UTL_FILE is dead but not because of the scheduler now allowing “sql_script” job types. I agree mostly because of Java. It is easy to find a good file i/o package written in Java, load it into the database using loadjava, and then create a package of Java Stored Procedures to do all file operations on the db server host machine such as reading and writing of ascii and binary files but also directory listings, directory renaming, file renames, file copies, file delete, etc (currently done with a scheduler job_type of “executable” and a shell script / batch file), especially now that Oracle DB JVM development has added some security controls such as “Secure Use of Runtime.exec Functionality in Oracle Database”
There are still reasons to use utl_file. There are many more use cases for practically everything, than one individual can think of.
I have one now: reading an alert log from an RDS database on Amazone AWS.
We know that utl_file is available, and can tailor it to do what we want.
Sure, we could do that with Java, but our usage would not be simplified by doing so.
external table not sufficient?
We need to be able to seek a position in the file, rather than read it all every time.
There is a view already defined, but it is sooo slowww. There is a long standing bug filed for that.
set pages 0
set lines 200
set trimspool on
think those 3 are redundant and have no effect when markup CSV is enabled (as per doc) ?
Thanks for the info!