More chances to bulk process

I’m sure most of us have read about, or heard at a conference, the benefits of array fetching and array binding when it comes to passing data back and forth to the database. And we’ve all seen the numerous demo scripts in PL/SQL along the lines of:


FORALL i in 1 .. n 
   INSERT ...

Perhaps because of demos like this, there is a misconception out there that bulk binding is only useful for basic DML operations. So I thought I’d share this example that came in via AskTOM recently. We were asked whether there was any means of improving the performance of a row-by-row operation where the DML involved a complex Text index search, with the additional complication that any individual row might fail, but that this was an anticipated outcome that needed to be handled so processing could continue. The scenario presented was as follows:

  • A table SOURCE_DATA containing rows of free-format text,
  • A table SEARCH_TERMS, populated by users and applications, listing the words/phrases to be searched for within SOURCE_DATA,
  • A table SEARCH_RESULTS that would contain one row per search term to indicate the phrase was found in SOURCE_DATA.

To enable text searching, a text index is created on SOURCE_DATA; otherwise the setup below is straightforward.


SQL> create table SOURCE_DATA
  2  ( id number,
  3    name varchar2(50) );

Table created.

SQL>
SQL> create sequence SOURCE_SEQ start with 1 increment by 1;

Sequence created.

SQL>
SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'abc'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'pqr'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'xyz'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'name'||rownum from dual
  3  connect by rownum<=50000;

50000 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> exec ctx_ddl.create_stoplist('keep_all_stopwords', 'BASIC_STOPLIST');

PL/SQL procedure successfully completed.

SQL>
SQL> create index SOURCE_IDX_TEXT on SOURCE_DATA(name)
  2  indextype is ctxsys.context
  3  parameters ('stoplist keep_all_stopwords sync (on commit)');

Index created.

SQL>
SQL> create table SEARCH_TERMS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> create table SEARCH_RESULTS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> insert into SEARCH_TERMS values ('xyz1');

1 row created.

SQL> insert into SEARCH_TERMS values ('xyz10000');

1 row created.

SQL> insert into SEARCH_TERMS values ('n');

1 row created.

SQL> commit;

Commit complete.

With the data above, the intent is to do a wildcard text search in SOURCE_DATA for the value “xyz1”, then a wildcard text search for “xyz10000”, and so forth for each row in SEARCH_TERMS. Here is the first cut of the code provided by the poster on AskTOM. We loop around for each entry in SEARCH_TERMS and perform an INSERT gated by a subquery check. However, because this is a wildcard search, it is possible for errors to be returned from a Text query, which necessitates the exception handler in the code. We can see how this manifests itself with a sample run.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode NUMBER;
  4    v_errm VARCHAR2(200);
  5  begin
  6    for a in ( select * from SEARCH_TERMS )
  7    loop
  8      dbms_output.put_line('Loading-' || a.search_name);
  9      begin
 10       insert into SEARCH_RESULTS
 11       select a.search_name
 12       from dual
 13       where  (
 14         select count(*)
 15         from SOURCE_DATA b
 16         where contains(b.name, '%' || a.search_name || '%') > 0
 17         and rownum <= 2 ) = 1;
 18
 19     exception
 20       when others then
 21          v_errcode := sqlcode;
 22          v_errm := substr(sqlerrm, 1, 200);
 23          dbms_output.put_line('Error code ' || v_errcode || ': ' || v_errm);
 24     end;
 25    end loop;
 26
 27  end;
 28  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
Loading-xyz1
Loading-xyz10000
Loading-n
Error code -29902: ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms

PL/SQL procedure successfully completed.

For the first two search terms, the check works fine, but the search term “n” is deemed too vague by the Text engine, which returns the error “DRG-51030: wildcard query expansion resulted in too many terms”. But since this is an expected error (the search phrases come from an arbitrary source), we catch it and move on to the next phrase. The poster on AskTOM was looking for a mechanism to speed this up, because once there were a large number of search phrases, the row-by-row approach became the familiar cliché “slow-by-slow”.

But even in a scenario like this, array processing via bulk binding can be utilised. I’ve recoded the example to use bulk binding. Even though we have a complex SQL statement with a Text query, along with the need for an error handler, we can still take advantage of array processing. Using the SQL%BULK_EXCEPTIONS structure, we still get access to the rows in error.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode number;
  4    v_errm varchar2(200);
  5
  6    type row_list is table of varchar2(100) index by pls_integer;
  7    r row_list;
  8
  9    bulk_failed exception;
 10    pragma exception_init(bulk_failed, -24381);
 11
 12    l_cnt int;
 13  begin
 14    select search_name bulk collect into r from SEARCH_TERMS;
 15
 16    forall i in 1 .. r.count save exceptions
 17       insert into SEARCH_RESULTS
 18       select r(i)
 19       from dual
 20       where  (
 21         select count(*)
 22         from SOURCE_DATA b
 23         where contains(b.name, '%' || r(i) || '%') > 0
 24         and rownum <= 2 ) = 1;
 25
 26  exception
 27   when bulk_failed then
 28      l_cnt := sql%bulk_exceptions.count;
 29      for i in 1 .. l_cnt loop
 30        dbms_output.put_line('failed: ' || i || ' value: ' ||
 31            r(sql%bulk_exceptions(i).error_index) ||' : ' ||
 32            sqlerrm(-sql%bulk_exceptions(i).error_code));
 33      end loop;
 34  end;
 35  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
failed: 1 value: n : ORA-29902: error in executing ODCIIndexStart() routine

PL/SQL procedure successfully completed.

SQL>

So don’t be too quick to dismiss the opportunities to use bulk binding in your applications. If you can code the SQL in PL/SQL, you can probably (re)code it to use array processing.
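As an aside, the same principle applies outside PL/SQL as well. Here is a minimal sketch of the row-by-row versus array-binding idea using Python’s standard-library sqlite3 module. This is purely for illustration (the table name and terms echo the demo above but are my own stand-ins, not part of the original AskTOM scenario):

```python
import sqlite3

# stand-in schema for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("create table search_results (search_name text)")

terms = ["xyz1", "xyz10000", "n"]

# row-by-row ("slow-by-slow"): one execute call per term
for t in terms:
    conn.execute("insert into search_results values (?)", (t,))

# array binding: one call carries the whole batch of bind values
conn.executemany("insert into search_results values (?)", [(t,) for t in terms])

count = conn.execute("select count(*) from search_results").fetchone()[0]
print(count)  # 6
```

The batched executemany call passes the entire array of binds in a single call, which is the same win that FORALL gives us inside PL/SQL.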

Determined on Determinism

I’m feeling very determined on this one. Yes, I have a lot of determination to inform blog readers about determinism, and yes, I have run out of words that sound like DETERMINISTIC. But one of the most common misconceptions I see about PL/SQL functions is that developers treat them as if they were “extending” the existing database kernel. By this I mean that developers often assume that wherever an existing in-built function (for example, TO_NUMBER or SUBSTR) could be used, a PL/SQL function of their own creation will work in exactly the same way.

Often that will be the case, but the scenario I most commonly see tripping people up is using PL/SQL functions within SQL statements. Consider the following simple example, where a PL/SQL function uses the in-built SYSTIMESTAMP and TO_CHAR functions.


SQL> create or replace
  2  function f(i varchar2) return varchar2 is
  3  begin
  4    return i||'-'||to_char(systimestamp,'HH24MISS:FF');
  5    --dbms_lock.sleep(0.5);
  6  end;
  7  /

Function created.

Let us compare the output from the function when used within a SQL statement with the results from the same built-in functions used directly in the SQL statement.


SQL> select rownum, to_char(systimestamp,'HH24MISS:FF') x1, f(rownum) x2
  2  from   dual
  3  connect by level <= 9;

    ROWNUM X1                             X2
---------- ------------------------------ ------------------------------
         1 181557:351000                  1-181557:351000000
         2 181557:351000                  2-181557:361000000
         3 181557:351000                  3-181557:361000000
         4 181557:351000                  4-181557:364000000
         5 181557:351000                  5-181557:364000000
         6 181557:351000                  6-181557:366000000
         7 181557:351000                  7-181557:366000000
         8 181557:351000                  8-181557:372000000
         9 181557:351000                  9-181557:372000000

9 rows selected.

A direct call to SYSTIMESTAMP is fixed for the duration of the execution of a SQL statement, but this is NOT the case for the SYSTIMESTAMP call made within the PL/SQL function. The PL/SQL function is being called multiple times during the single execution of the SQL statement, and hence each execution is totally entitled to return a “fresh” result from SYSTIMESTAMP.

Moreover, the database makes no guarantees that a PL/SQL function will be called once per row encountered in a SQL statement, so if your PL/SQL function changes session state in some way (for example, a package variable) then you can never assume that there will be a 1-to-1 relationship between rows processed and PL/SQL function executions.

The only way to be sure that you won’t get unexpected results from PL/SQL function calls within SQL is for those functions to be deterministic, and responsibility for that lies entirely with the developer, not with the database. So please don’t think the solution is simply throwing in the DETERMINISTIC keyword. You need to inspect your code and ensure you won’t get spurious results from that PL/SQL function when it is used from SQL.
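The same hazard exists in any language, not just PL/SQL. As a sketch (my own illustration, not Oracle behaviour): a function whose result depends on hidden state such as the clock is not deterministic, and caching it as if it were simply replays a stale first answer, much as wrongly marking such a PL/SQL function DETERMINISTIC can:

```python
import time
from functools import lru_cache

def f(i):
    # result depends on hidden state (the clock) -> not deterministic
    return f"{i}-{time.time_ns()}"

@lru_cache(maxsize=None)  # "declared deterministic" by caching
def f_cached(i):
    return f"{i}-{time.time_ns()}"

a = f(1)
time.sleep(0.001)
b = f(1)
print(a == b)  # False: each call returns a fresh result

c = f_cached(1)
time.sleep(0.001)
d = f_cached(1)
print(c == d)  # True: the cache silently repeats the stale first result
```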

Long running scheduler jobs

One of the nice things about the job scheduler in the Oracle database is the easily interpreted interval settings you can apply for job frequency. The days of cryptic strings like “sysdate+0.000694444” when all you really wanted to say was “just run this job every minute” are a thing of the past. I covered how to get the database to convert interval strings into real execution dates here.
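For anyone who never met the old DBMS_JOB style of interval, that cryptic constant is just one minute expressed in days, because those intervals were plain date arithmetic. A quick sanity check:

```python
# old-style job intervals were in days, so "every minute" was 1/1440 of a day
one_minute_in_days = 1 / (24 * 60)
print(round(one_minute_in_days, 9))  # 0.000694444
```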

But it raises the question: What if I have a job that is scheduled to run every minute, but it takes more than 1 minute to run? Will the scheduler just crank out more and more concurrent executions of that job? Will I swamp my system with ever more background jobs? So I thought I’d find out with a simple test.

I created a table which will record the start and end time for executions of a procedure, and then crafted that procedure to always run for at least 2 minutes using dbms_lock.sleep. (If you are on 18c, you can replace this with dbms_session.sleep to avoid the need for an explicit grant.) Then I set this procedure to be run every minute via dbms_scheduler.


SQL> create table t ( tag varchar2(10), d date);

Table created.

SQL>
SQL> create or replace
  2  procedure P is
  3  begin
  4    insert into t values ('start',sysdate);
  5    commit;
  6    dbms_lock.sleep(120);
  7    insert into t values ('end',sysdate);
  8    commit;
  9  end;
 10  /

Procedure created.


SQL> begin
  2    dbms_scheduler.create_job (
  3      job_name        => 'JOB1',
  4      job_type        => 'PLSQL_BLOCK',
  5      job_action      => 'begin p; end;',
  6      start_date      => systimestamp,
  7      repeat_interval => 'freq=minutely;bysecond=0;',
  8      enabled         => true);
  9  end;
 10  /

PL/SQL procedure successfully completed.

I waited 20 minutes and then looked at both my table and the scheduler logs to see how many concurrent executions were recorded.


SQL> select * from t order by d, tag;

TAG        D
---------- -------------------
start      25/03/2019 21:36:00
end        25/03/2019 21:38:00
start      25/03/2019 21:38:00
end        25/03/2019 21:40:00
start      25/03/2019 21:40:00
end        25/03/2019 21:42:00
start      25/03/2019 21:42:01
end        25/03/2019 21:44:01
start      25/03/2019 21:44:01
end        25/03/2019 21:46:01
start      25/03/2019 21:46:01
end        25/03/2019 21:48:01
start      25/03/2019 21:48:01
end        25/03/2019 21:50:01
start      25/03/2019 21:50:01
end        25/03/2019 21:52:01
start      25/03/2019 21:52:01
end        25/03/2019 21:54:01

18 rows selected.

SQL> select log_date
  2  from   dba_scheduler_job_log
  3  where job_name = 'JOB1'
  4  order by log_date;

LOG_DATE
-----------------------------------------------
25-MAR-19 09.38.00.866000 PM +08:00
25-MAR-19 09.40.00.920000 PM +08:00
25-MAR-19 09.42.00.998000 PM +08:00
25-MAR-19 09.44.01.037000 PM +08:00
25-MAR-19 09.46.01.078000 PM +08:00
25-MAR-19 09.48.01.143000 PM +08:00
25-MAR-19 09.50.01.171000 PM +08:00
25-MAR-19 09.52.01.206000 PM +08:00
25-MAR-19 09.54.01.272000 PM +08:00

9 rows selected.

As you can see, the scheduler is not going to swamp your system. It will not run the “next” occurrence of your submitted job until the current execution has completed. So even though we requested an execution every minute, we are bound by the run-time duration of the job itself. One additional nice thing is that once the job has finished, the scheduler immediately sees that the next execution is overdue and launches it straight away. So there is no need to worry about an excessive number of jobs all running together.
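The behaviour observed above can be sketched as a toy simulation. This is my own model for illustration, not the scheduler’s actual algorithm: each run begins at the later of its due time and the previous run’s end, so a 120-second job on a 60-second schedule ends up starting every 120 seconds:

```python
# toy model of serialized job runs: an overdue run starts as soon as the
# previous run finishes; times are in seconds from the first start
def run_windows(interval, duration, n):
    windows, prev_end = [], 0
    for k in range(n):
        begin = max(k * interval, prev_end)  # due time, or wait for prior run
        prev_end = begin + duration
        windows.append((begin, prev_end))
    return windows

print(run_windows(60, 120, 4))
# [(0, 120), (120, 240), (240, 360), (360, 480)]
```

The 120-second gaps between consecutive start times match the two-minute gaps in the table T above.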

Just as an aside, when you have a scheduler job that is “always” in a running state, you need to take care when dropping the job, because by default you cannot drop a running job. In such instances, you can add the FORCE parameter to kill the current execution and remove the job from the scheduler.


SQL> exec dbms_scheduler.drop_job('JOB1')
BEGIN dbms_scheduler.drop_job('JOB1'); END;

*
ERROR at line 1:
ORA-27478: job "MCDONAC"."JOB1" is running
ORA-06512: at "SYS.DBMS_ISCHED", line 274
ORA-06512: at "SYS.DBMS_SCHEDULER", line 753
ORA-06512: at line 1


SQL> exec dbms_scheduler.drop_job('JOB1',force=>true)

PL/SQL procedure successfully completed.

 

Oracle Magazine

Generally my blog is just snippets of tech content that take my interest as I encounter them (most commonly when looking at AskTOM). If I think they’ll be useful, I’ll just plonk them out right there and then. If you prefer your content in longer (and more structured) form, I also publish longer-form articles in Oracle Magazine every couple of months. Below is a consolidated list of my articles. I’ll try to keep this list updated as I add new ones.

Old Dog, New Tricks, Part 2
Here’s a new SQL syntax for hierarchy processing.

Improved Defaults in Oracle Database 12c
The new DEFAULT clause provides better values for getting started and better performance.

Excellent Extensions
New features in Oracle Database 12c Release 2 make external tables even more flexible.

Tighter PL/SQL and SQL Integration
PL/SQL functions perform better in SQL in Oracle Database 12c.

All Aboard the SQL*Loader Express
A new processing mode takes the hassle out of dealing with flat file loading.

Long and Overflowing
LISTAGG in Oracle Database 12c Release 2 solves the problem of excessively long lists.

Assume the Best; Plan for the Worst
Here’s a technique for delivering better performance through optimistic code programming.

A Fresh Look at Auditing Row Changes
Triggers can provide auditing information, but there’s a future in flashback.

Better Tools for Better Data
New functions in Oracle Database 12c Release 2 solve data validation challenges.

Unintended Side Effects
Ensure that the code you write does not create problems elsewhere in your applications.

Write in a Read-Only Database
Run reports and DML against your standby database with Oracle Active Data Guard.

Open for Exchange
FOR EXCHANGE in Oracle Database 12c Release 2 takes the detective work out of partition exchanges.

Are We All on the Same Page?
Pagination of data can make (or break) your database.

Old Dog, New Tricks
Take advantage of SQL extensions for hierarchy processing.

A Higher-Level Perspective on SQL Tuning
The commonly missed first steps of tuning a SQL statement.

A Higher-Level Perspective on SQL Tuning, Part 2
Next in the series on SQL tuning: how to find problematic SQL.

External table preprocessor on Windows

There are plenty of blog posts about using the pre-processor facility in external tables to get OS level information available from inside the database. Here’s a simple example of getting a directory listing:


run_os.bat
==========
@echo off
cd \oracle
dir /b

SQL> create table fs_size (
  2   disk varchar2(64)
  3  )
  4  organization external
  5  (  type oracle_loader
  6     default directory temp
  7     access parameters
  8     ( records delimited by newline
  9       preprocessor  temp:'run_os.bat'
 10     )
 11     location ( temp:'empty.txt' )
 12  ) reject limit unlimited ;

Table created.

SQL>
SQL> select * from fs_size;

DISK
------------------------------------------------
18c_cloud
admin
adw_cloud
atp_cloud
audit
cfgtoollogs
checkpoints
datamodeler
dbsat
diag
instantclient
jdbc183
ords.184
ords122
ords181
ords184
ordsconf
product
sql
sqlcl
sqldeveloper
swingbench
wallet

So far so good. But Windows can be quite particular about the runtime environment. For example, if I fire up a command prompt as a logged-in user, I can easily get a listing of the disk partitions on my machine:


C:\>wmic logicaldisk get caption
Caption
C:
D:
E:
F:
G:
M:
P:
X:
Z:

It would seem logical that this would be a simple drop-in replacement for the directory-listing batch file I used before. I’ll amend the batch file to list out disk partitions:


run_os.bat
==========
@echo off
wmic logicaldisk get caption

and now I’ll run my query again.


SQL> select * from fs_size;

no rows selected

Hmmm… that didn’t go as well as planned. Looking in the log file, there are no clues about the cause.


 LOG file opened at 03/06/19 14:33:42

KUP-05007:   Warning: Intra source concurrency disabled because the preprocessor option is being used.

Field Definitions for table FS_SIZE
  Record format DELIMITED BY NEWLINE
  Data in file has same endianness as the platform
  Rows with all null fields are accepted

  Fields in Data Source: 

    DISK                            CHAR (255)
      Terminated by ","
      Trim whitespace same as SQL Loader

The issue here is that many Windows programs require a minimum environment configuration before they will run. In this case, I need to let Windows know the location of the OS installation:


run_os.bat
==========
@echo off
set SystemRoot=C:\WINDOWS
wmic logicaldisk get caption

And voila! I now get access to many more Windows commands to probe information from the operating system.


SQL> select * from fs_size;

DISK
--------------------------------
Caption
C:
D:
E:
F:
G:
M:
P:
X:
Z:

Connections with a wallet – redux

Wow… it is nearly 4 years now since I wrote an article on connecting to the database via a wallet to avoid having to hard-code passwords into scripts. That article is here:

https://connor-mcdonald.com/2015/09/21/connection-shortcuts-with-a-wallet/

So I went to do a similar exercise on my new 18c Windows database today, and to my surprise things went pear-shaped at the very first step.


c:\oracle\product\18\bin>mkstore -create -wrl c:\temp\wallet
The syntax of the command is incorrect.

Not a lot of feedback there.

Unfortunately it’s just a small error in the script (see MOS note 2406404.1). All you need to do is copy the mkstore.bat from a 12.1 client and you’ll be fine. I called mine mkstore121.bat.


c:\oracle\product\18\bin>mkstore121.bat -create -wrl c:\temp\wallet
Oracle Secret Store Tool Release 18.0.0.0.0 - Production
Version 18.1.0.0.0
Copyright (c) 2004, 2017, Oracle and/or its affiliates. All rights reserved.

Enter password: *********
Enter password again: *********

Disabled EZCONNECT

Just a normal start to the day today…I had my coffee


and then started working on some AskTOM questions. Naturally, pretty much the first thing I needed to do was connect to my database, and then this happened:


C:\oracle\product\18\bin>sqlplus scott/tiger@//gtx:1518/pdb1

SQL*Plus: Release 18.0.0.0.0 - Production on Wed Mar 6 09:23:09 2019
Version 18.5.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

ERROR:
ORA-12154: TNS:could not resolve the connect identifier specified

I made the standard assumptions:

  • I had set ORACLE_HOME wrong….checked that, nope!
  • I had my TNS_ADMIN set wrong….nope!
  • Listener not configured correctly….nope!
  • Database not started or not registered with listener….nope!
  • Had my good friends at Microsoft restarted the PC with Windows Update without telling me?…nope!

So once I’d ticked off the normal culprits, I tried other software installations on the same machine:


C:\oracle\instantclient>sqlplus scott/tiger@//gtx:1518/pdb1

SQL*Plus: Release 12.1.0.2.0 Production on Wed Mar 6 09:21:44 2019

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Wed Mar 06 2019 09:13:31 +08:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

C:\oracle\product\xe\18\dbhomeXE\bin>sqlplus scott/tiger@//gtx:1518/pdb1

SQL*Plus: Release 18.0.0.0.0 - Production on Wed Mar 6 09:22:38 2019
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

Last Successful login time: Wed Mar 06 2019 09:21:44 +08:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.5.0.0.0

and voila! Obviously everything is OK from the listener and database perspective.

So back to the original installation, and then I had the light-bulb moment. I had been tinkering with sqlnet.ora to help out with a different question yesterday, and I had isolated the connection options down to TNSNAMES only:


NAMES.DIRECTORY_PATH= (TNSNAMES)

In such a scenario, you need to explicitly allow the EZCONNECT method for the connection to work:


NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)

and then everything was fine back in the original installation directory.


C:\oracle\product\18\bin>sqlplus scott/tiger@//gtx:1518/pdb1

SQL*Plus: Release 18.0.0.0.0 - Production on Wed Mar 6 09:50:52 2019
Version 18.5.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

Last Successful login time: Wed Mar 06 2019 09:45:06 +08:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.5.0.0.0

If you have never touched your sqlnet.ora, it probably looks like this:


NAMES.DIRECTORY_PATH= (TNSNAMES, ONAMES, HOSTNAME)

but there is no need to panic. Those default values allow EZCONNECT as well.