More chances to bulk process

I’m sure most of us have read about, or heard at a conference, the benefits of array fetching and array binding when it comes to passing data back and forth to the database. And we’ve all seen the numerous PL/SQL demo scripts along the lines of:


FORALL i in 1 .. n 
   INSERT ...
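
Spelled out a little more fully, the typical demo pairs an array fetch with an array bind. Here is a minimal sketch of that pattern, using hypothetical tables SRC and TGT purely for illustration:

declare
  -- hypothetical tables: SRC(id number) and TGT(id number)
  type id_list is table of number index by pls_integer;
  l_ids id_list;
begin
  -- array fetch: one round trip instead of a row-at-a-time cursor loop
  select id bulk collect into l_ids from SRC;

  -- array bind: one FORALL execution instead of l_ids.count single-row inserts
  forall i in 1 .. l_ids.count
     insert into TGT ( id ) values ( l_ids(i) );
end;
/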

As such, there is a misconception out there that bulk binding can only be used for basic DML operations. So I thought I’d share this example that came in via AskTOM recently. We were asked whether there was any way of improving the performance of a row-by-row operation where the DML was a complex Text index search, with the additional complication that, on a row-by-row basis, the DML might fail, and that this was an anticipated outcome that needed to be handled and moved past. The scenario presented was as follows:

  • A table SOURCE_DATA containing rows of free-format text,
  • A table SEARCH_TERMS, populated by users and applications, containing a list of words/phrases to be searched for within SOURCE_DATA,
  • A table SEARCH_RESULTS that would contain one row per search term to indicate the phrase was found in SOURCE_DATA.

To enable text searching, a Text index is created on SOURCE_DATA; otherwise the setup below is straightforward.


SQL> create table SOURCE_DATA
  2  ( id number,
  3    name varchar2(50) );

Table created.

SQL>
SQL> create sequence SOURCE_SEQ start with 1 increment by 1;

Sequence created.

SQL>
SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'abc'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'pqr'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'xyz'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'name'||rownum from dual
  3  connect by rownum<=50000;

50000 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> exec ctx_ddl.create_stoplist('keep_all_stopwords', 'BASIC_STOPLIST');

PL/SQL procedure successfully completed.

SQL>
SQL> create index SOURCE_IDX_TEXT on SOURCE_DATA(name)
  2  indextype is ctxsys.context
  3  parameters ('stoplist keep_all_stopwords sync (on commit)');

Index created.

SQL>
SQL> create table SEARCH_TERMS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> create table SEARCH_RESULTS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> insert into SEARCH_TERMS values ('xyz1');

1 row created.

SQL> insert into SEARCH_TERMS values ('xyz10000');

1 row created.

SQL> insert into SEARCH_TERMS values ('n');

1 row created.

SQL> commit;

Commit complete.

With the data above, the intent is to do a wildcard text search in SOURCE_DATA for the value “xyz1”, then a wildcard text search for “xyz10000”, and so forth for each row in SEARCH_TERMS. Here is the first cut of the code provided by the poster on AskTOM. We loop around for each entry in SEARCH_TERMS and perform an INSERT gated by a subquery that counts the Text matches. However, because this is a wildcard search, it is possible for errors to be returned from the Text query, which necessitates the exception handler in the code. We can see how this manifests itself with a sample run.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode NUMBER;
  4    v_errm VARCHAR2(200);
  5  begin
  6    for a in ( select * from SEARCH_TERMS )
  7    loop
  8      dbms_output.put_line('Loading-' || a.search_name);
  9      begin
 10       insert into SEARCH_RESULTS
 11       select a.search_name
 12       from dual
 13       where  (
 14         select count(*)
 15         from SOURCE_DATA b
 16         where contains(b.name, '%' || a.search_name || '%') > 0
 17         and rownum <= 2 ) = 1;
 18
 19     exception
 20       when others then
 21          v_errcode := sqlcode;
 22          v_errm := substr(sqlerrm, 1, 200);
 23          dbms_output.put_line('Error code ' || v_errcode || ': ' || v_errm);
 24     end;
 25    end loop;
 26
 27  end;
 28  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
Loading-xyz1
Loading-xyz10000
Loading-n
Error code -29902: ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms

PL/SQL procedure successfully completed.

For the first two search terms, the check works fine, but the search term “n” is deemed too vague by the Text engine and returns the error “DRG-51030: wildcard query expansion resulted in too many terms”. Since this is an expected error (the search phrases come from an arbitrary source), we catch it and move on to the next phrase. The poster on AskTOM was looking for a mechanism to speed this up, because once there was a large number of search phrases, the row-by-row approach became the familiar “slow-by-slow” cliché.
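
The error is not specific to the PL/SQL loop. Presumably a standalone query with the same vague term fails in the same way, since “%n%” expands to tens of thousands of distinct tokens (name1 through name50000, and more):

-- a sketch: expected to raise DRG-51030, just as it did inside the procedure
select count(*)
from   SOURCE_DATA b
where  contains(b.name, '%n%') > 0;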

But even in a scenario like this, array processing via bulk binding can be utilised. I’ve recoded the example to use bulk binding: even though we have a complex SQL statement with a Text query, along with the need for an error handler, we can still take advantage of array processing. Using the SAVE EXCEPTIONS clause and the SQL%BULK_EXCEPTIONS structure, we still get access to the rows that failed.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode number;
  4    v_errm varchar2(200);
  5
  6    type row_list is table of varchar2(100) index by pls_integer;
  7    r row_list;
  8
  9    bulk_failed exception;
 10    pragma exception_init(bulk_failed, -24381);
 11
 12    l_cnt int;
 13  begin
 14    select search_name bulk collect into r from SEARCH_TERMS;
 15
 16    forall i in 1 .. r.count save exceptions
 17       insert into SEARCH_RESULTS
 18       select r(i)
 19       from dual
 20       where  (
 21         select count(*)
 22         from SOURCE_DATA b
 23         where contains(b.name, '%' || r(i) || '%') > 0
 24         and rownum <= 2 ) = 1;
 25
 26  exception
 27   when bulk_failed then
 28      l_cnt := sql%bulk_exceptions.count;
 29      for i in 1 .. l_cnt loop
 30        dbms_output.put_line('failed: ' || i || ' value: ' ||
 31            r(sql%bulk_exceptions(i).error_index) ||' : ' ||
 32            sqlerrm(-sql%bulk_exceptions(i).error_code));
 33      end loop;
 34  end;
 35  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
failed: 1 value: n : ORA-29902: error in executing ODCIIndexStart() routine

PL/SQL procedure successfully completed.

SQL>

So don’t be too quick to dismiss the opportunities for bulk binding in your applications. If you can code the SQL in PL/SQL, you can probably (re)code it to use array processing.
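
If the list of search terms could grow very large, one further variation (just a sketch, not from the original AskTOM thread) is to fetch the terms in batches with a LIMIT clause, keeping the same SAVE EXCEPTIONS handling per batch:

declare
  cursor c is select search_name from SEARCH_TERMS;
  type name_list is table of varchar2(50) index by pls_integer;
  r name_list;
  bulk_failed exception;
  pragma exception_init(bulk_failed, -24381);
begin
  open c;
  loop
    fetch c bulk collect into r limit 1000;     -- array fetch in batches of 1000
    exit when r.count = 0;
    begin
      forall i in 1 .. r.count save exceptions  -- array bind, one batch at a time
         insert into SEARCH_RESULTS
         select r(i)
         from dual
         where ( select count(*)
                 from   SOURCE_DATA b
                 where  contains(b.name, '%' || r(i) || '%') > 0
                 and    rownum <= 2 ) = 1;
    exception
      when bulk_failed then                     -- report the failed terms, keep going
        for i in 1 .. sql%bulk_exceptions.count loop
          dbms_output.put_line('failed: ' ||
             r(sql%bulk_exceptions(i).error_index) || ' : ' ||
             sqlerrm(-sql%bulk_exceptions(i).error_code));
        end loop;
    end;
  end loop;
  close c;
end;
/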

Got some thoughts? Leave a comment
