PL/SQL – Don’t mix and match scope

Here’s a simple little PL/SQL block where we call an inner procedure PARAMETER_TESTER from its parent block. Pay particular attention to the parameter we pass to the procedure, and its value throughout the execution of that procedure.


SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15     parameter_tester(100);
 16  end;
 17  /
Param came in as: 100
Param left as   : 100
glob_var is now : 1

PL/SQL procedure successfully completed.

Now I’ll slowly extend the code, and just by eyeballing it, see if you can predict what the output will be before looking past the end of the PL/SQL block.



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15     local_var := glob_var;
 16     parameter_tester(local_var);
 17  end;
 18  /
Param came in as: 0
Param left as   : 0
glob_var is now : 1

PL/SQL procedure successfully completed.

So far so good, I imagine. The parameter came in as zero, and we incremented the global variable, which of course had no impact on the parameter. Let’s now up the ante a little.



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15     parameter_tester(glob_var);
 16  end;
 17  /
Param came in as: 0
Param left as   : 1
glob_var is now : 1

PL/SQL procedure successfully completed.

This is perhaps the first one that you might find a little unexpected. Notice that the value of the parameter passed to the procedure changed within the inner procedure, even though it was passed (implicitly) as an IN parameter. People often assume that if you pass anything to a procedure without the IN OUT or OUT specification, the parameter is “read only” and cannot be touched by code. This is true to the extent that you cannot perform an assignment to that parameter, as you can see below:



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         param := param + 1;
  9     end;
 10
 11  begin
 12     parameter_tester(glob_var);
 13  end;
 14  /
       param := param + 1;
       *
ERROR at line 8:
ORA-06550: line 8, column 8:
PLS-00363: expression 'PARAM' cannot be used as an assignment target
ORA-06550: line 8, column 8:
PL/SQL: Statement ignored

but that is not the same as saying that the parameter is fixed in value throughout the duration of the call. This behaviour is documented in the PL/SQL language manual: an IN parameter may be passed by reference rather than by value.
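The underlying idea, that a parameter name can be an alias for external storage rather than a private copy, can be sketched in Python. This is purely an analogy (Python’s object model is not PL/SQL’s parameter-passing mechanism); a one-element list stands in for the global variable so that the aliasing is visible:

```python
# Analogy only: glob_var is a one-element list so the "parameter"
# and the "global" can refer to the same underlying storage.
glob_var = [0]

def parameter_tester(param):
    before = param[0]
    glob_var[0] += 1          # increment the "global" mid-call
    after = param[0]
    print('came in as', before, '/ left as', after)

parameter_tester(glob_var)        # aliased: came in as 0 / left as 1
parameter_tester([glob_var[0]])   # a copy:  came in as 1 / left as 1
```

The first call aliases the global, so the parameter appears to change mid-call; the second passes a copy, mirroring the expression-argument cases that follow.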


Now I’ll explore some other examples of how you might get caught out by this. I’ll modify the example slightly so that I’m passing an expression rather than just “glob_var”.



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15     parameter_tester(glob_var+1);
 16  end;
 17  /
Param came in as: 1
Param left as   : 1
glob_var is now : 1

PL/SQL procedure successfully completed.

Now “normal” service has been resumed, in that the expression is evaluated first and hence passed by value to the procedure, leaving the parameter value unchanged throughout the procedure.

So it would appear that an expression disables the “pass by reference” mechanism? Well, let’s try two more examples.



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15    parameter_tester(to_char(glob_var));
 16  end;
 17  /
Param came in as: 0
Param left as   : 0
glob_var is now : 1

PL/SQL procedure successfully completed.

SQL>
SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15    parameter_tester(to_number(glob_var));
 16  end;
 17  /
Param came in as: 0
Param left as   : 1
glob_var is now : 1

PL/SQL procedure successfully completed.

Notice the last one in particular. Even though I had a TO_NUMBER expression around “glob_var”, it was still passed by reference and hence the parameter value changed throughout the execution of the inner procedure. This is because the PL/SQL compiler detected that the TO_NUMBER function was redundant and optimized it out during compilation. That left just a parameter input of “glob_var”, which could then be passed by reference.

Bottom line – it’s generally considered poor programming practice in any language to mix and match the scopes of variables, in this case using glob_var both within and outside the inner procedure. Trying to subvert the behaviour of the PL/SQL engine with expressions such as to_char(glob_var) is a recipe for disaster. Who knows what additional optimizations the next release of the PL/SQL compiler will have? Perhaps it will optimize out “glob_var+1” or to_char(glob_var) and hence pass them by reference too. If your functionality really depends on coding with these blurred scopes, then make sure you perform an assignment to a local variable and pass that, to avoid unexpected side-effects.



SQL> set serverout on
SQL> declare
  2
  3     glob_var  int := 0;
  4     local_var int;
  5
  6     procedure PARAMETER_TESTER(param int) is
  7     begin
  8         dbms_output.put_line('Param came in as: '||param);
  9         glob_var := glob_var + 1;
 10         dbms_output.put_line('Param left as   : '||param);
 11         dbms_output.put_line('glob_var is now : '||glob_var);
 12     end;
 13
 14  begin
 15     local_var := glob_var;
 16     parameter_tester(local_var);
 17  end;
 18  /
Param came in as: 0
Param left as   : 0
glob_var is now : 1

PL/SQL procedure successfully completed.

Quick and easy masking

I had a request from a client a while back regarding masking of data. They had an application with sensitive data in the Production environment (where access and audit were very tightly controlled), but the issue was how to respect that sensitivity in non-Production environments whilst still preserving full-size data volumes for application testing.

After some conversations about requirements, it turned out that since (even in non-Production environments) all access to application components was logged and audited, the issue was simply protecting against “inadvertent” access to sensitive data. For example, in the application, if I searched for “males with black hair with an interest in technology” I should never see the name “Connor McDonald” on screen in the results, because simply viewing that data could be treated as a breach of privacy.

Addendum: Anyone adding a comment to this post with “Shouldn’t that be males with black hair with a hint of grey” will be banned forever.

I suggested Data Redaction as a potential solution, but they wanted the data actually obfuscated within the database. Then I suggested Data Masking, which definitely does the job, but they felt it offered far more than their simple requirement: just obfuscate the data, while retaining its distribution to preserve performance characteristics as much as possible.

So ultimately, here is the basis of a little routine I created for them to obfuscate their data, satisfying their requirement of not viewing meaningful data in their non-Production environments. Please read the footnote at the end of this post before you consider using this code. The logic here is straightforward:

  • Take the alphanumeric characters in a logical order as a single string.
  • Use DBMS_RANDOM to come up with 255 variations of that string and store them in an array.
  • Use the TRANSLATE function to perform a simple obfuscation of the input string.

That translation could then be used during the process of copying data from Production, or as a post-copy task.

First I’ll demo the concept of coming up with randomised strings. Here’s a simple anonymous block to create 10 random strings based on a base string of: ABC….Zabc…z0123456789


SQL> set serverout on
SQL> declare
  2    type rand_list is table of varchar2(62) index by pls_integer;
  3    source varchar2(62) := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  4    perms  rand_list;
  5    source_variant varchar2(62);
  6  begin
  7    for i in 1 .. 10 loop
  8      source_variant := null;
  9      for j in ( select rownum r from dual connect by level <= 52 order by dbms_random.value )
 10      loop
 11        source_variant := source_variant || substr(source,j.r,1);
 12      end loop;
 13      for j in ( select rownum r from dual connect by level <= 10 order by dbms_random.value )
 14      loop
 15        source_variant := source_variant || substr(source,52+j.r,1);
 16      end loop;
 17      perms(i) := source_variant;
 18      dbms_output.put_line(source_variant);
 19    end loop;
 20  end;
 21  /
xrLCuowbNfvXlMJSBKctPyheZWTnQUjYAdizkFVOIHGsmDRpaEqg5637918402
OkVxApdsLqTlyvgYXQGaSeNPMCjZEJhHRtbcIBrowKUunifzDFmW8013742695
gcVxtNeMdBuRJhvYrnkCIWzoyFwfLpqSATabQGUZKmDOEilPHsXj5601892743
vOUuisITyJjbaQLqfnpFVBoCrYtzHeDAmWExPkNwgScXKRlZGMhd3104879256
VLZkXndmytCWRwKeNgYifvTESzGFhHcqaMJPoljxUsAQrubOpBDI7302189564
VfPLKmObZzBMCpgrjUaQYieRNJykDwtnuWXFxdTEIlScqvohAGsH0138924756
ABMkfozebljnPGqHJhpNwxYEUCuQLyDrOasIgtFZKVmciTXSdvRW5702139864
iNjrmVTlyzFSLswfBvaWKUAHIZOcgMPYXCGqeoDJbhpktERdQunx1976045328
hiYGzxMEZFPcmkutbDlrCeynSITKHNJaXAwfpRsdqLjvQUoWVOgB0479632158
uZpPXOHLAvzricdjtqSQBKVhYoFfmUxEDNCwWlIJsbRgkMeGynaT6451328790

So a simple obfuscation could be:


SQL> select
  2    translate(
  3      'Connor McDonald',
  4      'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789',
  5      'ehZAFjmtYwWilTRDnCcLrvQqEpakKGPMxXyudVUNsoJOBfzSIbHg2163095784')
  6  from dual;

TRANSLATE('CONN
---------------
ZsNNsO lKAsNaVG
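The same shuffle-and-translate idea ports readily to other languages. Here is a sketch in Python, purely illustrative (`str.maketrans`/`str.translate` play the role of Oracle’s TRANSLATE, and the two separate shuffles mirror the letters-then-digits loops in the anonymous block above):

```python
import random
import string

# 52 letters followed by 10 digits, matching the PL/SQL source string
source = string.ascii_uppercase + string.ascii_lowercase + string.digits

def random_variant(seed=None):
    # Shuffle letters and digits separately so letters map to letters
    # and digits map to digits, preserving the "shape" of the data.
    rnd = random.Random(seed)
    letters, digits = list(source[:52]), list(source[52:])
    rnd.shuffle(letters)
    rnd.shuffle(digits)
    return ''.join(letters + digits)

def mask(text, variant):
    # Characters outside the source set (spaces, punctuation) pass through
    return text.translate(str.maketrans(source, variant))

variant = random_variant(seed=0)
print(mask('Connor McDonald', variant))
```

Because the mapping is one-to-one and class-preserving, the masked output keeps the length and letter/digit layout of the input, which is the property the client wanted for performance testing.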

Armed with this, I can create a package that extends the idea to choose from 255 random strings and use them as a simple obfuscator for source data.


SQL> create or replace
  2  package masker is
  3    function rand_masks(p_size int default 255, p_refresh varchar2 default 'N',p_seed number default 0) return  sys.odcivarchar2list pipelined;
  4    function standard_source return varchar2;
  5    function idx_entry(p_idx int, p_size int default 255, p_refresh varchar2 default 'N',p_seed number default 0) return varchar2;
  6    function mask(p_input varchar2) return varchar2;
  7  end;
  8  /

Package created.

SQL> create or replace
  2  package body masker is
  3    perms sys.odcivarchar2list := sys.odcivarchar2list();
  4
  5  procedure init(p_size int,p_seed int) is
  6    source varchar2(62) := 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  7    source_variant varchar2(62);
  8  begin
  9    dbms_random.seed(p_seed);
 10    perms.delete;
 11    perms.extend(p_size);
 12    for i in 1 .. p_size loop
 13      source_variant := null;
 14      for j in ( select rownum r from dual connect by level <= 52 order by dbms_random.value )
 15      loop
 16        source_variant := source_variant || substr(source,j.r,1);
 17      end loop;
 18      for j in ( select rownum r from dual connect by level <= 10 order by dbms_random.value )
 19      loop
 20        source_variant := source_variant || substr(source,52+j.r,1);
 21      end loop;
 22      perms(i) := source_variant;
 23  --      dbms_output.put_line(source_variant);
 24    end loop;
 25  end;
 26
 27  function rand_masks(p_size int default 255, p_refresh varchar2 default 'N',p_seed number default 0) return sys.odcivarchar2list pipelined is
 28  begin
 29    if perms.count < p_size or p_refresh in ('Y','y') then
 30      init(p_size,p_seed);
 31    end if;
 32
 33    for i in 1 .. p_size
 34    loop
 35      pipe row ( perms(i));
 36    end loop;
 37
 38    return;
 39  end;
 40
 41  function standard_source return varchar2 is
 42  begin
 43     return 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
 44  end;
 45
 46  function idx_entry(p_idx int, p_size int default 255, p_refresh varchar2 default 'N',p_seed number default 0) return varchar2 is
 47  begin
 48    if perms.count < p_size or p_refresh in ('Y','y') then
 49      init(p_size,p_seed);
 50    end if;
 51    return perms(p_idx);
 52  end;
 53
 54  function mask(p_input varchar2) return varchar2 is
 55  begin
 56    return translate(p_input,masker.standard_source,masker.idx_entry(ascii(p_input)));
 57  end;
 58
 59  end;
 60  /

Package body created.

Now I’ll test this with some sample data


SQL> create table t ( pk int, n varchar2(50));

Table created.

SQL>
SQL> insert into t values (1,'Connor McDonald');

1 row created.

SQL> insert into t values (2,'Chris Saxon');

1 row created.

SQL> insert into t values (3,'Maria Colgan');

1 row created.

SQL> insert into t values (4,'Bobby Curtis');

1 row created.

SQL> insert into t values (5,'Joel Kallman');

1 row created.

SQL> insert into t values (6,'Steven Feuerstein');

1 row created.

SQL> insert into t values (7,'Connor McDonald');

1 row created.

SQL> select pk, n, masker.mask(n) diddled
  2  from t;

        PK N                                                  DIDDLED
---------- -------------------------------------------------- ------------------------------
         1 Connor McDonald                                    sAJJAe CvnAJjWt
         2 Chris Saxon                                        sweOy RjrAJ
         3 Maria Colgan                                       czEJz BhMbzm
         4 Bobby Curtis                                       nkjjI EpzLBS
         5 Joel Kallman                                       oYfi luiiIuj
         6 Steven Feuerstein                                  CyUrUE SUtUWQyUXE
         7 Connor McDonald                                    sAJJAe CvnAJjWt

7 rows selected.

There we go! A simple obfuscator that runs quite efficiently on source data whilst preserving the distribution of the data.

Footnote: Remember that any masking scheme that does not completely randomize the obfuscation of source data is not a complete encryption or security solution. Most mechanisms for deducing source data from a manipulated variant of it involve distribution analysis of letters, n-grams, words and phrases. The moment you preserve the distribution of source data for (say) performance and/or optimizer plan preservation, you allow the source data to be derived if there is enough of it available for that analysis. This is not a substitute for appropriate security and encryption protections.
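That warning is easy to demonstrate: any one-to-one substitution (which is exactly what TRANSLATE performs) preserves the frequency profile of the source text, and frequency profiles are what classical cryptanalysis exploits. A quick Python illustration:

```python
from collections import Counter

plain  = 'the quick brown fox jumps over the lazy dog the end'
# A fixed one-to-one letter substitution (a toy stand-in for TRANSLATE)
cipher = plain.translate(str.maketrans('abcdefghijklmnopqrstuvwxyz',
                                       'qwertyuiopasdfghjklzxcvbnm'))

# The sorted letter counts survive the substitution intact
plain_profile  = sorted(Counter(plain.replace(' ', '')).values())
cipher_profile = sorted(Counter(cipher.replace(' ', '')).values())
print(plain_profile == cipher_profile)  # True
```

Given enough masked text, matching that profile against the known frequencies of names or words narrows down the substitution, which is why this is obfuscation, not encryption.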

APEX Upgrade redux

I posted about my APEX upgrade to 19 yesterday, and someone was quick to point out to me that they believed I hadn’t covered all of the steps.

“What if your APEX instance needs to call web services?” they said. “You need to update your Access Control Lists.”

I hadn’t thought of that, so I logged onto one of my other APEX instances that was still at version 18, and checked the current ACLs



SYS> select acl,
  2         principal,
  3         privilege
  4  from   dba_network_acl_privileges
  5  order by acl, principal, privilege;

ACL                                                PRINCIPAL            PRIVILEGE
-------------------------------------------------- -------------------- ----------
/sys/acls/oracle-sysman-ocm-Resolve-Access.xml     ORACLE_OCM           resolve
NETWORK_ACL_192DBA180AEB40AD98A73ACCD309FF8F       APEX_180200          http
NETWORK_ACL_296C00CF7F2744BAB526D4C4E85FE189       GGSYS                resolve
NETWORK_ACL_296C00CF7F2744BAB526D4C4E85FE189       GSMADMIN_INTERNAL    resolve

4 rows selected.

I can see the potential issue here. On upgrade, I’ll have a new schema which would need the same ACLs granted as the existing APEX 18 schema. Then I went ahead and upgraded this instance to version 19, and lo and behold, check out this nice little touch in the upgrade.


SYS> select acl,
  2         principal,
  3         privilege
  4  from   dba_network_acl_privileges
  5  order by acl, principal, privilege;

ACL                                                PRINCIPAL            PRIVILEGE
-------------------------------------------------- -------------------- ----------
/sys/acls/oracle-sysman-ocm-Resolve-Access.xml     ORACLE_OCM           resolve
NETWORK_ACL_0F93A8F653EC43DC9D90457B1151A330       APEX_190100          http
NETWORK_ACL_192DBA180AEB40AD98A73ACCD309FF8F       APEX_180200          http
NETWORK_ACL_296C00CF7F2744BAB526D4C4E85FE189       GGSYS                resolve
NETWORK_ACL_296C00CF7F2744BAB526D4C4E85FE189       GSMADMIN_INTERNAL    resolve

5 rows selected.

The upgrade took care of the ACLs for me! That’s pretty cool.

Application Express 19.1

AskTOM moved to Application Express 19.1 without any major issues last weekend. That in itself is a nice endorsement for APEX, given that the AskTOM application dates back nearly 20 years to 2001, and predates even the existence of APEX.

The only fix we had to make concerned the static CDN files that Joel Kallman blogged about, which AskTOM uses to make it nice and snappy wherever in the world it is used. The references to those files contain a hard-coded version number, so that needed to be updated. AskTOM also has a plugin using some jQuery elements that went pear-shaped when referencing the old version 18 files, but after a quick fix to that reference all was well.


Given that AskTOM is running on APEX 19, I figured I had best upgrade my local PC installation as well. I find the standard documentation a little cryptic when I want to perform an upgrade, because the docs have to cover all of the possible scenarios (CDB vs non-CDB, web listener vs EPG, etc), and as such, when you click on the section on Upgrading you don’t appear to get a step-by-step guide on what to do.


But the secret is in the second paragraph:

“following any of the installations scenarios in the guide upgrades your Oracle Application Express instance…”

so here’s what I did to upgrade my local installation, which is installed within a pluggable database and running with a standalone ORDS listener. (I stress: always consult the standard documentation set before commencing your upgrade.)

  • Downloaded APEX and unzipped it into an appropriate location
  • Connected as SYSDBA in the pluggable database where APEX is installed
  • Ran the standard installation script (which automatically detects that this is an upgrade):

@apexins.sql SYSAUX SYSAUX TEMP /i/

  • Let ORDS know that an upgrade had been done:

java -jar ords.war validate

  • And then reset the REST services component post-upgrade:

@apex_rest_config.sql

and I was done! No dramas encountered, and APEX on my PC is now on 19.1.

Note: I moved my original APEX installation to a backup area and placed the new version 19 installation in the old location, so I did not have to move or copy any images or static files around. Your approach may be different, so just to reiterate – consult the standard docs before diving into your upgrade.

Of course, all the dark mode fan bois will be telling me to switch to Dark Mode,


but for me, I think it would be wrong to maintain a classic 2000’s application such as AskTOM in anything but brilliant white.

More chances to bulk process

I’m sure most of us have read or heard at a conference the benefits of array fetching and array binding when it comes to passing data back and forth to the database. And we’ve all seen the numerous demo scripts in PL/SQL along the lines of:


FORALL i in 1 .. n 
   INSERT ...

As such, there is a misconception out there that you are only going to be able to use bulk binding for basic DML operations. So I thought I’d share this example that came in via AskTOM recently. We were asked if there was any means of improving the performance of a row-by-row operation where the DML was a complex Text index search, with the additional complication that on a row-by-row basis the DML may fail, but this was an anticipated outcome that needed to be handled and moved past. The scenario presented was as follows:

  • A table SOURCE_DATA containing rows of free-format text,
  • A table SEARCH_TERMS, populated by users and applications, containing a list of words/phrases to be searched for within SOURCE_DATA,
  • A table SEARCH_RESULTS that would contain one row per search term, indicating the phrase was found in SOURCE_DATA.

To enable text searching, a Text index is created on SOURCE_DATA; otherwise the setup below is straightforward.


SQL> create table SOURCE_DATA
  2  ( id number,
  3    name varchar2(50) );

Table created.

SQL>
SQL> create sequence SOURCE_SEQ start with 1 increment by 1;

Sequence created.

SQL>
SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'abc'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'pqr'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'xyz'||rownum from dual
  3  connect by rownum<=10000;

10000 rows created.

SQL> insert into SOURCE_DATA
  2  select SOURCE_SEQ.nextval,'name'||rownum from dual
  3  connect by rownum<=50000;

50000 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> exec ctx_ddl.create_stoplist('keep_all_stopwords', 'BASIC_STOPLIST');

PL/SQL procedure successfully completed.

SQL>
SQL> create index SOURCE_IDX_TEXT on SOURCE_DATA(name)
  2  indextype is ctxsys.context
  3  parameters ('stoplist keep_all_stopwords sync (on commit)');

Index created.

SQL>
SQL> create table SEARCH_TERMS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> create table SEARCH_RESULTS
  2  ( search_name varchar2(50) );

Table created.

SQL>
SQL> insert into SEARCH_TERMS values ('xyz1');

1 row created.

SQL> insert into SEARCH_TERMS values ('xyz10000');

1 row created.

SQL> insert into SEARCH_TERMS values ('n');

1 row created.

SQL> commit;

Commit complete.

With the data above, the intent is to do a wildcard text search in SOURCE_DATA for the value “xyz1”, then a wildcard text search for “xyz10000”, and so forth for each row in SEARCH_TERMS. Here is the first cut of the code provided by the poster on AskTOM. We loop around for each entry in SEARCH_TERMS and perform an INSERT-WHERE-EXISTS query. However, because this is a wildcard search, it is possible for errors to be returned from a Text query, which necessitates the exception handler in the code. We can see how this manifests itself with a sample run.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode NUMBER;
  4    v_errm VARCHAR2(200);
  5  begin
  6    for a in ( select * from SEARCH_TERMS )
  7    loop
  8      dbms_output.put_line('Loading-' || a.search_name);
  9      begin
 10       insert into SEARCH_RESULTS
 11       select a.search_name
 12       from dual
 13       where  (
 14         select count(*)
 15         from SOURCE_DATA b
 16         where contains(b.name, '%' || a.search_name || '%') > 0
 17         and rownum <= 2 ) = 1;
 18
 19     exception
 20       when others then
 21          v_errcode := sqlcode;
 22          v_errm := substr(sqlerrm, 1, 200);
 23          dbms_output.put_line('Error code ' || v_errcode || ': ' || v_errm);
 24     end;
 25    end loop;
 26
 27  end;
 28  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
Loading-xyz1
Loading-xyz10000
Loading-n
Error code -29902: ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms

PL/SQL procedure successfully completed.

For the first two search terms, the check works fine, but the search term “n” is deemed too vague by the Text engine, which returns the error “DRG-51030: wildcard query expansion resulted in too many terms”. Since this is an expected error (the search phrases come from an arbitrary source), we catch it and move on to the next phrase. The poster on AskTOM was looking for a mechanism to speed this up, since with a large number of search phrases the row-by-row approach became the familiar cliché: slow-by-slow.

But even in a scenario like this, array processing via bulk binding can be utilised. I’ve recoded the example to use bulk binding. Even though we have a complex SQL statement with a Text query, along with the need for an error handler, we can still take advantage of array processing. Using the SQL%BULK_EXCEPTIONS structure, we still get access to the rows that failed.


SQL>
SQL> create or replace
  2  procedure testInsert as
  3    v_errcode number;
  4    v_errm varchar2(200);
  5
  6    type row_list is table of varchar2(100) index by pls_integer;
  7    r row_list;
  8
  9    bulk_failed exception;
 10    pragma exception_init(bulk_failed, -24381);
 11
 12    l_cnt int;
 13  begin
 14    select search_name bulk collect into r from SEARCH_TERMS;
 15
 16    forall i in 1 .. r.count save exceptions
 17       insert into SEARCH_RESULTS
 18       select r(i)
 19       from dual
 20       where  (
 21         select count(*)
 22         from SOURCE_DATA b
 23         where contains(b.name, '%' || r(i) || '%') > 0
 24         and rownum <= 2 ) = 1;
 25
 26  exception
 27   when bulk_failed then
 28      l_cnt := sql%bulk_exceptions.count;
 29      for i in 1 .. l_cnt loop
 30        dbms_output.put_line('failed: ' || i || ' value: ' ||
 31            r(sql%bulk_exceptions(i).error_index) ||' : ' ||
 32            sqlerrm(-sql%bulk_exceptions(i).error_code));
 33      end loop;
 34  end;
 35  /

Procedure created.

SQL>
SQL> set serverout on
SQL> exec testInsert
failed: 1 value: n : ORA-29902: error in executing ODCIIndexStart() routine

PL/SQL procedure successfully completed.

SQL>

So don’t be too quick to dismiss the opportunities to use bulk binding in your applications. If you can code the SQL in PL/SQL, you can probably (re)code to use array processing.

Determined on Determinism

I’m feeling very determined on this one. Yes, I have a lot of determination to inform blog readers about determinism, and yes, I have run out of words that sound like DETERMINISTIC. But one of the most common misconceptions I see around PL/SQL functions is that developers treat them as if they were “extending” the database kernel. By this I mean that developers often assume that wherever an existing in-built function (for example TO_NUMBER or SUBSTR) could be used, a PL/SQL function of their own creation will work in exactly the same way.

Often that will be the case, but the most common scenario I see tripping people up is using PL/SQL functions within SQL statements. Consider the following simple example, where a PL/SQL function utilizes the in-built SYSTIMESTAMP and TO_CHAR functions.


SQL> create or replace
  2  function f(i varchar2) return varchar2 is
  3  begin
  4    return i||'-'||to_char(systimestamp,'HH24MISS:FF');
  5    --dbms_lock.sleep(0.5);
  6  end;
  7  /

Function created.

Let us compare the output from the function when used within a SQL statement, with the results from same built-in functions used directly from the SQL statement.


SQL> select rownum, to_char(systimestamp,'HH24MISS:FF') x1, f(rownum) x2
  2  from   dual
  3  connect by level <= 9;

    ROWNUM X1                             X2
---------- ------------------------------ ------------------------------
         1 181557:351000                  1-181557:351000000
         2 181557:351000                  2-181557:361000000
         3 181557:351000                  3-181557:361000000
         4 181557:351000                  4-181557:364000000
         5 181557:351000                  5-181557:364000000
         6 181557:351000                  6-181557:366000000
         7 181557:351000                  7-181557:366000000
         8 181557:351000                  8-181557:372000000
         9 181557:351000                  9-181557:372000000

9 rows selected.

A direct call to SYSTIMESTAMP is fixed for the duration of the execution of a SQL statement, but this is NOT the case for the SYSTIMESTAMP call made within the PL/SQL function. The PL/SQL function is being called multiple times during the single execution of the SQL statement, and hence each execution is totally entitled to return a “fresh” result from SYSTIMESTAMP.

Moreover, the database makes no guarantees that a PL/SQL function will be called once per row encountered in a SQL statement, so if your PL/SQL function changes session state in some way (for example, a package variable) then you can never assume that there will be a 1-to-1 relationship between rows processed and PL/SQL function executions.

The only way to be sure that you won’t get unexpected results from PL/SQL function calls within SQL is for those functions to be deterministic, and responsibility for that lies entirely with the developer not with the database. So please don’t think that the solution to this is just throwing in the DETERMINISTIC keyword. You need to inspect your code and ensure you won’t get spurious results from that PL/SQL function when used from SQL.
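The same trap exists in any language that lets you declare a function cacheable: the declaration doesn’t make the function pure, it merely licenses the runtime to skip calls. A Python sketch, with `lru_cache` standing in for the DETERMINISTIC hint and a counter standing in for SYSTIMESTAMP:

```python
from functools import lru_cache
import itertools

counter = itertools.count()

def stamped(x):
    # Not deterministic: the result depends on hidden state,
    # just as a function calling SYSTIMESTAMP does.
    return f'{x}-{next(counter)}'

# Declaring it cacheable is the moral equivalent of adding
# DETERMINISTIC to a function that isn't deterministic.
cached = lru_cache(maxsize=None)(stamped)

print(stamped(1), stamped(1))   # two different results
print(cached(1), cached(1))     # one stale result, repeated
```

The caching layer happily serves the stale first result; nothing checks your claim of purity, which is exactly the situation with DETERMINISTIC in the database.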

Long running scheduler jobs

One of the nice things about the job scheduler in the Oracle database is the easily interpreted interval settings you can apply for job frequency. The days of cryptic strings like “sysdate+0.000694444”, when all you really wanted to say was “just run this job every minute”, are a thing of the past. I covered how to get the database to convert interval strings into real execution dates here.
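(For the record, that magic number is simply one minute expressed as a fraction of a day; a quick sanity check:)

```python
one_minute = 1 / (24 * 60)   # one minute as a fraction of a day
print(f'{one_minute:.9f}')   # 0.000694444
```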

But it raises the question: What if I have a job that is scheduled to run every minute, but it takes more than 1 minute to run? Will the scheduler just crank out more and more concurrent executions of that job? Will I swamp my system with ever more background jobs? So I thought I’d find out with a simple test.

I created a table to record the start and end times of executions of a procedure, and then crafted that procedure to always run for at least 2 minutes using dbms_lock.sleep. (If you are on 18c, you can replace this with dbms_session.sleep to avoid the need for an explicit grant.) Then I set this procedure to run every minute via dbms_scheduler.


SQL> create table t ( tag varchar2(10), d date);

Table created.

SQL>
SQL> create or replace
  2  procedure P is
  3  begin
  4    insert into t values ('start',sysdate);
  5    commit;
  6    dbms_lock.sleep(120);
  7    insert into t values ('end',sysdate);
  8    commit;
  9  end;
 10  /

Procedure created.


SQL> begin
  2    dbms_scheduler.create_job (
  3      job_name        => 'JOB1',
  4      job_type        => 'PLSQL_BLOCK',
  5      job_action      => 'begin p; end;',
  6      start_date      => systimestamp,
  7      repeat_interval => 'freq=minutely;bysecond=0;',
  8      enabled         => true);
  9  end;
 10  /

PL/SQL procedure successfully completed.

I waited 20 minutes and then looked at both my table and the scheduler logs to see how many concurrent executions were recorded.


SQL> select * from t order by d, tag;

TAG        D
---------- -------------------
start      25/03/2019 21:36:00
end        25/03/2019 21:38:00
start      25/03/2019 21:38:00
end        25/03/2019 21:40:00
start      25/03/2019 21:40:00
end        25/03/2019 21:42:00
start      25/03/2019 21:42:01
end        25/03/2019 21:44:01
start      25/03/2019 21:44:01
end        25/03/2019 21:46:01
start      25/03/2019 21:46:01
end        25/03/2019 21:48:01
start      25/03/2019 21:48:01
end        25/03/2019 21:50:01
start      25/03/2019 21:50:01
end        25/03/2019 21:52:01
start      25/03/2019 21:52:01
end        25/03/2019 21:54:01

18 rows selected.

SQL> select log_date
  2  from   dba_scheduler_job_log
  3  where job_name = 'JOB1'
  4  order by log_date;

LOG_DATE
-----------------------------------------------
25-MAR-19 09.38.00.866000 PM +08:00
25-MAR-19 09.40.00.920000 PM +08:00
25-MAR-19 09.42.00.998000 PM +08:00
25-MAR-19 09.44.01.037000 PM +08:00
25-MAR-19 09.46.01.078000 PM +08:00
25-MAR-19 09.48.01.143000 PM +08:00
25-MAR-19 09.50.01.171000 PM +08:00
25-MAR-19 09.52.01.206000 PM +08:00
25-MAR-19 09.54.01.272000 PM +08:00

9 rows selected.

As you can see, the scheduler is not going to swamp your system. It will not run the “next” occurrence of your submitted job until the current execution has completed. So even though we requested an execution each minute, we are bound by the run-time duration of the job itself. One additional nice thing is that once the job has finished, the scheduler immediately sees that the next execution is overdue and launches it straight away. So there is no need to worry about an excessive number of jobs all running together.

Just as an aside, when you have a scheduler job that is “always” in a running state, you need to take care when dropping the job, because by default you cannot drop a running job. In such instances, you can add the FORCE parameter to kill the current execution and remove the job from the scheduler.


SQL> exec dbms_scheduler.drop_job('JOB1')
BEGIN dbms_scheduler.drop_job('JOB1'); END;

*
ERROR at line 1:
ORA-27478: job "MCDONAC"."JOB1" is running
ORA-06512: at "SYS.DBMS_ISCHED", line 274
ORA-06512: at "SYS.DBMS_SCHEDULER", line 753
ORA-06512: at line 1


SQL> exec dbms_scheduler.drop_job('JOB1',force=>true)

PL/SQL procedure successfully completed.