Faster operations with CLOBs

Posted by

Somewhere around version 9 (maybe before, maybe later…all so long ago 😀) the distinction between CLOB and VARCHAR2 became less defined. This was a good thing because the typical operations you would do with a VARCHAR2, for example, SUBSTR, REPLACE, INSTR and the like, could be used directly against a CLOB datatype without needing to resort to DBMS_LOB utilities. We all like simpler code when building our apps.

However, convenience sometimes comes with a price. Let’s say I have a large existing CLOB and I want to append some data to it as that data arrives from an external source. I can mimic that with the following code snippet

  • load a large CLOB from the filesystem
  • query that CLOB into a variable
  • loop 100 times, each time appending 100 characters to the CLOB

In simplest form, we would expect the code to look as simple as:


clob_var := clob_var || 100char_string;

But let’s explore various permutations of coding options and do some timing tests on each. The results below may surprise you.


SQL> create table t ( c clob );

Table created.

SQL> insert into t values ( empty_clob() );

1 row created.

SQL>
SQL> /* first we load our initial 100m lob */
SQL> 
SQL> declare
  2    l_bf   bfile;
  3    l_cl   clob;
  4
  5    l_dst  int := 1;
  6    l_src  int := 1;
  7    l_cs   int  := 0;
  8    l_lg   int := 0;
  9    l_wrn  int := 0;
 10  begin
 11    select c into l_cl from t for update;
 12
 13    l_bf := bfilename('TEMP', 'bigfile.txt');
 14    dbms_lob.fileopen(l_bf, dbms_lob.file_readonly);
 15    dbms_lob.loadclobfromfile (
 16      dest_lob      => l_cl,
 17      src_bfile     => l_bf,
 18      amount        => dbms_lob.lobmaxsize,
 19      dest_offset   => l_dst,
 20      src_offset    => l_src,
 21      bfile_csid    => l_cs ,
 22      lang_context  => l_lg,
 23      warning       => l_wrn);
 24    dbms_lob.fileclose(l_bf);
 25
 26    commit;
 27  end;
 28  /

PL/SQL procedure successfully completed.

SQL> select dbms_lob.getlength(c) from t;

DBMS_LOB.GETLENGTH(C)
---------------------
            115773630

SQL> set timing on
SQL>

SQL> --  Commented out because it takes a LOONNNNGGG time

SQL> --declare
SQL> --  l_txt varchar2(100) := rpad('x',100,'x');
SQL> --begin
SQL> --  for i in 1 .. 100 loop
SQL> --    update t
SQL> --    set c = c || l_txt;
SQL> --  end loop;
SQL> --  commit;
SQL> --end;
SQL> --/
SQL>
SQL>
SQL> -- now less update calls by deferring append via batching SQL> declare 2 l_txt varchar2(100) := rpad('x',100,'x'); 3 l_big varchar2(32000); 4 begin 5 for i in 1 .. 100 loop 6 l_big := l_big || l_txt; -- batching up strings 7 end loop; 8 9 update t 10 set c = c || l_big; -- but still using || in DML 11 commit; 12 end; 13 / PL/SQL procedure successfully completed. Elapsed: 00:00:11.35 SQL> declare 2 l_txt varchar2(100) := rpad('x',100,'x'); 3 l_big varchar2(32000); 4 l_cl clob; 5 begin 6 dbms_lob.createtemporary(l_cl,true); -- using a temp lob better BUT (see update) 7 select c into l_cl from t; 8 9 for i in 1 .. 100 loop 10 l_cl := l_cl || l_txt; -- || in code better than DML but still slow 11 end loop; 12 13 update t -- you pay a price to put that temp lob back in 14 set c = l_cl; 15 commit; 16 end; 17 / PL/SQL procedure successfully completed. Elapsed: 00:00:07.20 SQL> declare 2 l_txt varchar2(100) := rpad('x',100,'x'); 3 l_cl clob; 4 begin 5 select c into l_cl from t for update; 6 7 for i in 1 .. 100 loop 8 l_cl := l_cl || l_txt; -- getting better now 9 end loop; 10 commit; 11 end; 12 / PL/SQL procedure successfully completed. Elapsed: 00:00:01.18 SQL> declare 2 l_txt varchar2(100) := rpad('x',100,'x'); 3 l_big varchar2(32000); 4 l_cl clob; 5 begin 6 select c into l_cl from t for update; 7 8 for i in 1 .. 100 loop 9 dbms_lob.append(l_cl,l_txt); -- direct append to the clob 10 end loop; 11 commit; 12 end; 13 / PL/SQL procedure successfully completed. Elapsed: 00:00:00.04 SQL> declare 2 l_txt varchar2(100) := rpad('x',100,'x'); 3 l_big varchar2(32000); 4 l_cl clob; 5 begin 6 for i in 1 .. 100 loop -- and finally direct append with batching 7 l_big := l_big || l_txt; 8 end loop; 9 10 select c into l_cl from t for update; 11 12 dbms_lob.append(l_cl,l_big); 13 commit; 14 end; 15 / PL/SQL procedure successfully completed. Elapsed: 00:00:00.01 SQL>

So if you’re working with LOBs and basic operations seem to be slower than you expect, take a look at the code you’re using – there may well be a faster option.

2 comments

  1. It’s great that you’ve shown us so many different methods to try.
    The question I have is why? What makes method X faster than Y ? Is it CPU utilisation? is it writing to temp? Is it caching? Is it the length of characters you’re appending? Is it Oracle bug? Etc.
    If I had to explain why I choose a method, then I would have to say I have absolutely no idea why, i just tried all all methods suggested by Connor and this one came out top!!

    1. Surely “suggested by Connor ” is sufficient? 🙂

      When chatting internally with some people about this, ie, people who have access to the code, most of the time it comes down to assumptions that the code makes in terms of “typical use”. Some operations are optimised for smaller clobs, some for larger ones, some for limited number of clob modifcations etc. The aim is – make it fast for the most typical use case, but that can come at the expense of outliers.

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.