Complex materialized views and fast refresh

Just a quick discovery that came across the AskTOM “desk” recently. We have an outstanding bug in some instances of fast refresh materialized views when the definition of the materialized view references a standard view.

Here’s a simple demo of the issue – I’ll use a simplified version of the EMP and DEPT tables, linked by a foreign key in the usual way:


SQL> create table dept(dept_id number(10) primary key, dname varchar2(20));
 
Table created.
 
SQL> create table emp(empid number(20) primary key, ename varchar2(20), sal number(10,2), dept_id number(10) references dept(dept_id));
 
Table created.
 
SQL> insert into dept values(10,'IT');
 
1 row created.
 
SQL> insert into dept values(20,'HR');
 
1 row created.
 
SQL> insert into dept values(30,'MAT');
 
1 row created.
 
SQL> insert into emp values(1,'MIKE',20000,10);
 
1 row created.
 
SQL> insert into emp values(2,'JOHN',30000,20);
 
1 row created.
 
SQL> insert into emp values(3,'SUE',20000,20);
 
1 row created.
 
SQL> insert into emp values(4,'TOM',40000,30);
 
1 row created.
 
SQL>
SQL> commit;
 
Commit complete.

One of the cool things with materialized views is that even with complicated SQL definitions (such as joins), the materialized view can still be fast refreshable as long as the materialized view logs and database constraints are correctly defined.


SQL> create materialized view log on dept
  2  with rowid, primary key, sequence
  3  including new values;
 
Materialized view log created.
 
SQL>
SQL> create materialized view log on emp
  2  with rowid, primary key, sequence
  3  including new values;
 
Materialized view log created.
 
SQL> create materialized view mv
  2  --build deferred
  3  refresh fast on demand
  4  with primary key
  5  enable query rewrite
  6  as
  7  select a.rowid erowid,b.rowid drowid ,b.dname, a.*
  8  from emp a, dept b
  9  where a.dept_id=b.dept_id;
 
Materialized view created.
 

Note: See the documentation for DBMS_MVIEW.EXPLAIN_MVIEW for how to check on the refresh characteristics of a materialized view (or potential materialized view).
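If you want to try that yourself, here is a minimal sketch. It assumes the MV_CAPABILITIES_TABLE exists in your schema (typically created by running the utlxmv.sql script shipped in the rdbms/admin directory of your database installation):


-- populate MV_CAPABILITIES_TABLE with the capabilities of the existing MV
exec dbms_mview.explain_mview('MV');

-- POSSIBLE shows whether each capability (including fast refresh) is available,
-- and MSG_TXT explains why not when it is not
select capability_name, possible, msg_txt
from   mv_capabilities_table
where  capability_name like 'REFRESH_FAST%'
order  by seq;
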

Now I’ll repeat the same experiment, but I’ll wrap the SQL that joins EMP and DEPT within a standard view called VW. Since a view is just stored SQL text, and the previous usage of the same SQL worked fine, we’d expect no difference in functionality. However, the results do not meet that expectation.


SQL> drop materialized view mv ;
 
Materialized view dropped.
 
SQL>
SQL> create or replace view vw
  2  as
  3  select a.rowid erowid,b.rowid drowid ,b.dname, a.*
  4  from emp a, dept b
  5  where a.dept_id=b.dept_id;
 
View created.
 
SQL>
SQL> create materialized view mv
  2  --build deferred
  3  refresh fast on demand
  4  with primary key
  5  enable query rewrite
  6  as
  7  select * from vw;
select * from vw
              *
ERROR at line 7:
ORA-12015: cannot create a fast refresh materialized view from a complex query

This is not a flaw in our DDL – it is a bug in the database that will be fixed in due course. So if you have standard views being used within your materialized view definitions, and you are getting unexpected restrictions on whether the materialized views can be fast refreshed, try a simple workaround of in-lining the view text directly.  You might have hit this bug.
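If you do hit it, the in-lined version is simply the original formulation from earlier in this post – the join text goes directly into the materialized view definition instead of being referenced via the view:


create materialized view mv
refresh fast on demand
with primary key
enable query rewrite
as
select a.rowid erowid, b.rowid drowid, b.dname, a.*
from   emp a, dept b
where  a.dept_id = b.dept_id;
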

Standard Edition – different optimizer but still cool

One cool technique that the optimizer can employ is the BITMAP CONVERSION TO ROWIDS method, which takes advantage of B-tree indexes in a manner we would normally associate with a bitmap index. This can be particularly useful with multiple predicates on individually indexed columns, because it lets us establish the rows of interest before having to visit the heap blocks. Below is a quick generic sketch of the idea, followed by an example of it in action even when the indexes in question are Text indexes.
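First the sketch – a hedged illustration using plain B-tree indexes (the table and column names here are invented purely for this example). With suitably selective predicates on two separately indexed columns, the Enterprise Edition optimizer may combine the two index scans via BITMAP CONVERSION FROM ROWIDS and a BITMAP AND (or BITMAP OR) before ever visiting the table:


-- two conventional B-tree indexes on separate columns
create table orders ( cust_id number, status varchar2(10), padding varchar2(200) );
create index orders_ix1 on orders ( cust_id );
create index orders_ix2 on orders ( status );

-- after loading data and gathering statistics, a query with predicates on both
-- columns may show BITMAP CONVERSION operations in its Enterprise Edition plan
select *
from   orders
where  cust_id = 12345
and    status  = 'SHIPPED';
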

Enterprise Edition plan


SQL> create table MY_TAB ( col1 varchar2(50), col2 varchar2(50));

Table created.

SQL>
SQL> insert into MY_TAB
  2  select dbms_random.string('x', 30), dbms_random.string('x', 30)
  3  from dual
  4  connect by level <= 100000;

100000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> create index MY_TAB_IX1 on MY_TAB(col1) indextype is ctxsys.context;

Index created.

SQL> create index MY_TAB_IX2 on MY_TAB(col2) indextype is ctxsys.context;

Index created.

SQL>
SQL> exec  dbms_stats.gather_table_stats('','MY_TAB');

PL/SQL procedure successfully completed.

SQL> set autotrace traceonly explain
SQL> select * from MY_TAB where contains(col1, 'string1') > 0 or contains(col2, 'string2') > 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 4174159475

---------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |     1 |    62 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| MY_TAB      |     1 |    62 |     2   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS       |             |       |       |            |          |
|   3 |    BITMAP OR                        |             |       |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS   |             |       |       |            |          |
|   5 |      SORT ORDER BY                  |             |       |       |            |          |
|*  6 |       DOMAIN INDEX                  | MY_TAB_IX1  |       |       |     1   (0)| 00:00:01 |
|   7 |     BITMAP CONVERSION FROM ROWIDS   |             |       |       |            |          |
|   8 |      SORT ORDER BY                  |             |       |       |            |          |
|*  9 |       DOMAIN INDEX                  | MY_TAB_IX2  |       |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("CTXSYS"."CONTAINS"("COL1",'string1')>0)
   9 - access("CTXSYS"."CONTAINS"("COL2",'string2')>0)

But a quick trip to the licensing guide reveals the following about the optimizer in Standard Edition:

The following methods are not available in SE:

Bitmapped index, bitmapped join index, and bitmap plan conversions

which makes sense given that you do not have access to bitmap indexes in Standard Edition.

But all is not lost. Even though the bitmap conversion optimization is not available, the optimizer can still come up with alternative and intelligent mechanisms to extract the required rows. While you might think that an OR condition on two different columns would force a full table scan, here is the same example as above running on Standard Edition.

Standard Edition plan


SQL> set autotrace traceonly explain
SQL> select * from MY_TAB where contains(col1, 'string1') > 0 or contains(col2, 'string2') > 0;

Execution Plan
----------------------------------------------------------
Plan hash value: 1568130183

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     2 |   108 |     2   (0)| 00:00:01 |
|   1 |  VIEW                         | VW_ORE_A5827389 |     2 |   108 |     2   (0)| 00:00:01 |
|   2 |   UNION-ALL                   |                 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID| MY_TAB          |     1 |    62 |     1   (0)| 00:00:01 |
|*  4 |     DOMAIN INDEX              | MY_TAB_IX1      |       |       |     1   (0)| 00:00:01 |
|*  5 |    TABLE ACCESS BY INDEX ROWID| MY_TAB          |     1 |    62 |     1   (0)| 00:00:01 |
|*  6 |     DOMAIN INDEX              | MY_TAB_IX2      |       |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CTXSYS"."CONTAINS"("COL1",'string1')>0)
   5 - filter(LNNVL("CTXSYS"."CONTAINS"("COL1",'string1')>0))
   6 - access("CTXSYS"."CONTAINS"("COL2",'string2')>0)

SQL> set autotrace off

A “UNION ALL” suggests the two separate table access paths could potentially return duplicate rows. But notice the nice touch at line 5 of the plan – when probing the rows via MY_TAB_IX2, the optimizer threw in an additional filter, LNNVL("CTXSYS"."CONTAINS"("COL1",'string1')>0), to remove the rows that will already be returned by the “partnering” half of the UNION ALL. In this way, we avoid a potentially expensive sort to remove duplicate results.
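If you have not come across LNNVL before, it can only appear in a WHERE clause and returns TRUE when the condition handed to it evaluates to FALSE or UNKNOWN, which is exactly what is needed to exclude the rows already picked up by the first branch. A quick illustration of my own (not part of the plan above):


select * from dual where lnnvl( 1 = 2 );     -- condition is FALSE   => row returned
select * from dual where lnnvl( null = 2 );  -- condition is UNKNOWN => row returned
select * from dual where lnnvl( 2 = 2 );     -- condition is TRUE    => no row returned
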

So you might see the occasional difference between optimizer plans for Enterprise Edition versus Standard Edition – but both should do a fine job of executing your queries.

(Both tests done on 12.2)

UTL_FILE_DIR and 18c

I wrote a blog post called The Death of UTL_FILE which attracted a comment from a reader:

“There is NO chance to stay at UTL_FILE as it is DESUPPORTED starting with database Version 18c”

This is not the case, but since I wanted to clarify what has changed in 18c, it warrants this small but separate blog post. When UTL_FILE first came into existence back in Oracle 7, the concept of a directory object did not apply to UTL_FILE. Clearly we could not just let UTL_FILE write to any destination, otherwise a malicious person could write a little PL/SQL block like this:


declare
  f utl_file.file_type;
begin
  for i in ( select 
                regexp_substr(name,'(.*)\\(.*)', 1, 1, 'i', 1) path,
                regexp_substr(name,'(.*)\\(.*)', 1, 1, 'i', 2) name
             from v$datafile 
             order by file# desc ) 
  loop
    f := utl_file.fopen(i.path,i.name,'W');
  end loop;
  utl_file.fclose_all;
end;
/

and voila! No more database. So an initialization parameter was created to nominate which directories UTL_FILE was allowed to access. This parameter was called UTL_FILE_DIR, and the above code is hopefully justification enough to show that you should never, ever set UTL_FILE_DIR to “*”, which lets it write to wherever the OS permissions of the Oracle software account allow. Bye bye datafiles…bye bye database.

In more recent versions, UTL_FILE was improved so that the directory parameter could be supplied as a directory object. This is a much tighter implementation because read and write privileges on directory objects can be controlled from within the database.
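As a quick sketch of what that looks like (the directory name, path and grantee below are purely illustrative):


-- as a suitably privileged user
create directory tmp_dir as '/u01/app/tmp';
grant read, write on directory tmp_dir to app_user;

-- then UTL_FILE is given the directory OBJECT name, not the OS path
declare
  f utl_file.file_type;
begin
  f := utl_file.fopen('TMP_DIR', 'demo.dat', 'w');
  utl_file.put_line(f, 'hello');
  utl_file.fclose(f);
end;
/
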

In 18c, it is not UTL_FILE that has been de-supported – it is the older-style convention of hard-coded path names for the directory that is no longer allowed. You have to use directory objects. If you set the traditional UTL_FILE_DIR path in the spfile, it will be ignored and you’ll get a warning on startup.


SQL> alter system set utl_file_dir = 'c:\temp' scope=spfile;

System altered.


SQL> startup force
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.


--
-- alert log
--
Obsolete system parameters with specified values:
  utl_file_dir
End of obsolete system parameter listing

But even if you set UTL_FILE_DIR in the spfile, you will not be allowed to use OS directory paths as the location parameter in UTL_FILE calls.


SQL> declare
  2    f utl_file.file_type;
  3  begin
  4    f := utl_file.fopen('c:\temp','demo.dat','W');
  5    utl_file.fclose_all;
  6  end;
  7  /
declare
*
ERROR at line 1:
ORA-29280: invalid directory object
ORA-06512: at "SYS.UTL_FILE", line 41
ORA-06512: at "SYS.UTL_FILE", line 478
ORA-06512: at line 4

If you prefer a video version of this explanation, I had some fun riding my bike while talking about this when 18c first came out earlier this year.

More triggers are better

Yes, you heard me correctly. If you have got one trigger on a table, then you might be surprised to find that having a second one could be a better option. Then again, I also love the sweet scent of a clickbaity, inflammatory blog post title to draw the readers in, so you’ll just have to read on to see which is true.

As much as I am not a fan of using triggers to populate metadata in tables, I also concede that it is a popular approach taken by many of us as we code up our applications. Sure, we’d like to have our application code store that all-important “who did this change” information with each database row, but more often than not, this gets shoe-horned into the codebase via triggers as an afterthought to the development process.

But our well-read developer will often be thinking: “I’m a savvy developer – I’ve been reading blogs and websites that tell me that fewer triggers are better, so I’ll use as few as I can”, which leads to a scenario like the one I’ll present now.

I’ll start with a standard parent and child table setup, connected via an obvious foreign key:


SQL> create table par (
  2    id           number(10)    not null primary key,
  3    description  varchar2(20)  not null,
  4    created_by   varchar2(20)  not null,
  5    updated_by   varchar2(20)  not null
  6    );

Table created.

SQL>
SQL> create sequence par_seq;

Sequence created.

SQL>
SQL> create sequence chd_seq;

Sequence created.

SQL>
SQL> create table chd (
  2    id           number(10)    not null primary key,
  3    par_id       number(10)    not null,
  4    constraint  fk foreign key ( par_id ) references par ( id )
  5    );

Table created.

SQL>
SQL>
SQL> create index chd_fk_idx on chd (par_id);

Index created.

I’ve got a sequence for each table, so in order to “bind” that sequence to the primary key for each table I’ll create a trigger. I’ll also use the trigger code to implement some other common application requirements at the same time:

  • Populate the primary key with a sequence value,
  • Populate the CREATED_BY, UPDATED_BY columns on insert of a new row
  • Amend the UPDATED_BY column when I later update that row

Here is a trigger to implement that.


SQL> create or replace trigger trg_par
  2  before insert or update on par
  3  for each row
  4  begin
  5      if inserting then
  6         :new.id := par_seq.nextval;
  7         :new.created_by := user;
  8         :new.updated_by := user;
  9      end if;
 10
 11      if updating then
 12         :new.updated_by := user;
 13      end if;
 14  end;
 15  /

Trigger created.

Now my application is ready to go. I’ll insert some data to verify that my trigger has not broken any functionality.


SQL> insert into par (description) values ('test');

1 row created.

SQL> insert into par (description) values ('test2');

1 row created.

SQL> select * from par;

  ID DESCRIPTION          CREATED_BY           UPDATED_BY
---- -------------------- -------------------- --------------------
   1 test                 MCDONAC              MCDONAC
   2 test2                MCDONAC              MCDONAC

2 rows selected.

SQL>
SQL> insert into chd (id,par_id) values (chd_seq.nextval,1);

1 row created.

SQL> insert into chd (id,par_id) values (chd_seq.nextval,2);

1 row created.

SQL> select * from chd;

  ID     PAR_ID
---- ----------
   1          1
   2          2

2 rows selected.

SQL>
SQL> commit;

Commit complete.

So far so good. Now let’s see what happens when I do a simple update on the DESCRIPTION column of the parent table. Note that this column is not involved in any primary key, index or foreign key relationship – it’s just a simple string column.


SQL> update par set description = 'anything' where id = 1;

1 row updated.

SQL> select locked_mode, object_name
  2  from v$locked_object l, all_objects ob
  3  where ob.object_id =l.object_id;

LOCKED_MODE OBJECT_NAME
----------- ----------------------------------------
          3 PAR
          3 CHD

2 rows selected.

Notice the locks that have been taken. Understandably, I had to take a lock on the PAR table because I’ve just updated a row in that table, so I need to make sure that no-one does anything like drop it or modify its structure whilst I’ve got an outstanding transaction. But here is the interesting part – we also took a lock on the child table CHD. That seems superfluous because, as I said, we are not updating anything that could possibly impact the child table in any way.

Why did the database choose to add this extra lock?

The cause lies in the way I coded the trigger. Notice that the trigger has a reference to the primary key in the trigger body:


  5      if inserting then
  6         :new.id := par_seq.nextval;

Even though that part of the trigger body is not executed (because I am performing an update, not an insert), the mere presence of it is enough for the database to cautiously add that extra lock on the CHD table, because we “just might” be messing around with the parent table’s primary key here.

So now I’ll roll back that update and convert the single trigger into two triggers – one for insert, and one dedicated to update.


SQL> rollback;

Rollback complete.

SQL>
SQL> drop trigger trg_par ;

Trigger dropped.

SQL>
SQL> create or replace trigger trg1_par
  2  before insert on par
  3  for each row
  4  begin
  5      :new.id := par_seq.nextval;
  6      :new.created_by := user;
  7      :new.updated_by := user;
  8  end;
  9  /

Trigger created.

SQL>
SQL> create or replace trigger trg2_par
  2  before update on par
  3  for each row
  4  begin
  5      if updating then
  6          :new.updated_by := user;
  7      end if;
  8  end;
  9  /

Trigger created.

Now I’ll run the same update


SQL> update par set description = 'anything' where id = 1;

1 row updated.

SQL> select locked_mode, object_name
  2  from v$locked_object l, all_objects ob
  3  where ob.object_id =l.object_id;

LOCKED_MODE OBJECT_NAME
----------- ----------------------------------------
          3 PAR

1 row selected.

Because the reference to the primary key column of table PAR is no longer present in the firing (update) trigger, I no longer see the lock on the CHD table.

In practice, it is unlikely that this additional lock is going to cause you a lot of harm – it will only block operations on the child table such as a truncate or DDL to make a structural change, and it is reasonable to assume that these are rare occurrences. But I’m a fan of the philosophy of: have as few locks as possible, but always as many as required. So in this case, you might want to consider opting for two triggers instead of one.

And of course, perhaps getting to zero triggers might be the best option.

DDL for constraints – subtle things

The DBMS_METADATA package is very cool. I remember the days of hand-crafting DDL statements based on queries to the data dictionary, and many a DBA will be familiar with running “imp show=y” or “imp indexfile=…” and then laboriously extracting the required DDL from the import log file. DBMS_METADATA removed all of those annoyances to give us a simple API to get the true and complete DDL for a database object.

But when extracting DDL from the database using the DBMS_METADATA package, you need to be aware of some subtleties especially if you plan on executing that DDL in the database.

Consider this example – I have a few tables and I want to extract the referential integrity constraints for one of them. Being a good, cautious developer, I’ll just output the DDL first before attempting to execute any of the statements:


SQL> create table tab1(id number, name varchar2(100),
  2                      constraint pk_tab1_id primary key(id));

Table created.

SQL> create table tab2(id number, name varchar2(100),
  2                      constraint pk_tab2_id primary key(id));

Table created.

SQL> create table tab3(id number, name varchar2(100), int_id number,
  2                      constraint pk_tab3_id primary key(id),
  3                      constraint fk_tab1_id foreign key(int_id) references tab1(id),
  4                      constraint fk_tab2_id foreign key(int_id) references tab2(id));

Table created.

SQL>
SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4          dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line(i.ddl);
 10            --execute immediate i.ddl;
 11    end loop;
 12  end;
 13  /

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

That all looks fine – I have my two ALTER statements ready to go.  So now I’ll comment back in the ‘execute immediate’ command and all should be fine.


SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4             dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line(i.ddl);
 10            execute immediate i.ddl;
 11    end loop;
 12  end;
 13  /

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE
begin
*
ERROR at line 1:
ORA-01735: invalid ALTER TABLE option
ORA-06512: at line 10
ORA-06512: at line 10

That seems an odd result. Since in this simple example I’m just running the commands straight back into the same database, I might have expected a “Constraint already exists” or “Object name exists” style of error, but this is different. This error is telling us that the statement is invalid – which obviously should not be the case if it came straight out of DBMS_METADATA. But a simple amendment to my anonymous block will reveal the answer: I will output a separator line each time I cycle through the cursor loop.



SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4             dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line('========================================');
 10            dbms_output.put_line(i.ddl);
 11            --execute immediate i.ddl;
 12    end loop;
 13  end;
 14  /
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

And therein lies the issue. I got two ALTER commands back on screen, but in reality they both came from a single row fetched from the cursor. The ALTER commands were separated by a carriage return, but if I try to execute that single row, the statement is invalid because it is an attempt to run two ALTER commands in a single statement. The output looked like two rows from the cursor, but it was not.

That is a problem if I want to store that DDL in a table or a file, because I can’t use it as it currently stands, and I don’t want to have to write some scripts to parse that DDL to add semi-colons or split it into multiple commands, because one of the motivations for DBMS_METADATA in the first place was to avoid all that irritation.
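As an aside, and purely as a sketch of a partial option: DBMS_METADATA has a SQLTERMINATOR transform parameter that asks it to append statement terminators to the DDL it generates, which can help if the output is only destined for a script file. It still comes back as a single row containing both statements, so it does not solve the execute immediate problem above.


-- ask DBMS_METADATA to append terminators to generated DDL for this session
begin
  dbms_metadata.set_transform_param(
    dbms_metadata.session_transform, 'SQLTERMINATOR', true );
end;
/
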

There is an easy fix to this. Rather than getting the dependent DDL for a table, we can get the “direct” DDL for the constraints themselves.  In this way, you’ll get a row from the cursor for each constraint, and hence one DDL statement per constraint as well.



SQL> set serverout on
SQL> begin
  2    for i in (
  3         select t.table_name,
  4                dbms_metadata.get_ddl('REF_CONSTRAINT', constraint_name) ddl
  5         from user_constraints t
  6         where table_name = 'TAB3'
  7         and constraint_type = 'R'
  8      )
  9    loop
 10            dbms_output.put_line('========================================');
 11            dbms_output.put_line(i.ddl);
 12            --execute immediate i.ddl;
 13    end loop;
 14  end;
 15  /
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

The death of UTL_FILE

In a previous post I covered a technique to improve the performance of UTL_FILE, but concluded the post with a teaser: “you probably don’t need to use UTL_FILE ever again”.


Time for me to back that statement up with some concrete evidence.

UTL_FILE can read and write files. This blog post will cover the writing functionality of UTL_FILE and why I think you probably don’t need UTL_FILE for this. I’ll come back to UTL_FILE to read files in a future post.

There are two possibilities when it comes to writing a file:

  • The file is being requested by a client program and the results should be stored on a client machine. In that case, you do not use UTL_FILE anyway – you use the tools available on the client, for example, SQL Developer, SQLcl, SQL*Plus or a myriad of other tools, OR
  • the file is to be written by the database server to a location on the database server itself, or to a location that is accessible to the database server. This is where UTL_FILE has been used, the main motivations being that it can be part of existing database-resident PL/SQL code, and does not require clients to have direct SQL or OS level access to the database server.

So here is my assertion – most of the time, UTL_FILE is used to write out a file because we can’t make use of an easier tool like those mentioned above directly on the database server. After all, who wants to write code like:


SQL> declare
  2    f utl_file.file_type;
  3    line varchar2(255);
  4    s timestamp;
  5  begin
  6    f := utl_file.fopen('TMP','demo.dat','w');
  7    s := systimestamp;
  8    for i in ( select * from t )
  9    loop
 10      line := i.object_id||','||i.object_name||','||i.object_type||','||i.data_object_id;
 11      utl_file.put_line(f,line);
 12    end loop;
 13    dbms_output.put_line(systimestamp-s);
 14    utl_file.fclose_all;
 15  end;
 16  /

PL/SQL procedure successfully completed.

when most of the time if I was doing this in a tool it would be simply:


SQL> set markup csv on
SQL> spool c:\tmp\demo.dat
SQL> select * from t;
SQL> spool off

But in recent versions of the database, you can do exactly this! The database scheduler has been enhanced to run SQL*Plus style scripts directly out of the database, without needing to give OS access to the database server or to the SQL*Plus executable. In the previous post I unloaded 40 million rows to a file in CSV format, and the responsibility for formatting the data into CSV fell to me – I had to do all of the “heavy lifting”. Using the scheduler and the SQL_SCRIPT job type, it is as simple as writing a SQL*Plus script and submitting it as a job.


SQL> declare
  2    l_script   VARCHAR2(32767) :=
  3  'conn /@db
  4  set markup csv on
  5  set arraysize 500
  6  set pages 0
  7  set lines 200
  8  set trimspool on
  9  spool c:\tmp\demo_sched.dat
 10  select * from t;
 11  spool off';
 12  begin
 13    dbms_scheduler.create_job(
 14      job_name        => 'UNLOAD_DATA',
 15      job_type        => 'SQL_SCRIPT',
 16      job_action      => l_script,
 17      credential_name => 'MY_ACCOUNT',
 18      enabled         => true
 19    );
 20  end;
 21  /

PL/SQL procedure successfully completed.

You will want to ensure that you have some controls over the usage of this feature, and over which credentials the scripts will run under. Also, in my example I’ve got a connection wallet set up so that I do not have to code any passwords into the connection string for my scheduler job. But suddenly it has become easy to get access to the scripting tools we are used to on our own client machines, and utilize them on the database server.
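If you have not used scheduler credentials before, here is a hedged sketch of creating one – the credential name matches the one in my job above, but the OS user name and password are obviously placeholders. The credential maps to an operating system account on the database server under which the SQL_SCRIPT job will run:


-- placeholder OS account details for illustration only
begin
  dbms_credential.create_credential(
    credential_name => 'MY_ACCOUNT',
    username        => 'some_os_account',
    password        => 'some_password' );
end;
/
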

“Slow down there cowboy…” I hear you exclaim. “…That might be fine for simple SELECT * scripts, but my UTL_FILE procedures have a lot of complex logic to construct the file data”.

Even if you are using UTL_FILE because you are performing some complicated algorithms to generate the data that will be written to file, you can still utilize this scheduler facility. After your complicated logic is completed, you will typically end up with a line of data that needs to be written to a file. And if you have a line of data, then it is trivial to port that procedure to become a pipelined function. And once you have a pipelined function, we have a simple query mechanism that can be used to spool the output. For example, if my original procedure is:


SQL> create or replace
  2  procedure my_procedure is
  3  begin
  4    for each_row in ( [some set] )
  5    loop
  6    ...
  7    ... my complex logic
  8    ...
  9
 10    ...
 11    ... writing each line to file with UTL_FILE
 12    ...
 13    end loop;
 14  end;
 15  /

then we change none of the logic – we only need to replace all of that messy UTL_FILE code with a simple PIPE ROW call, which lets us query the function as if it were a source of rows, and then spool the output to a file in the usual way.


SQL> create or replace
  2  function my_function return sys.odcivarchar2list pipelined is
  3  begin
  4    for each_row in ( [some set] )
  5    loop
  6    ...
  7    ... my complex logic
  8    ...
  9
 10    ...
 11    ... pipe row ( the_line );
 12    ...
 13    end loop;
 14    return;
 15  end;
 16  /

SQL> spool c:\tmp\demo.dat
SQL> select * from my_function();
SQL> spool off

So next time you’re cranking out some cumbersome UTL_FILE code to write a file, take a moment to see if the scheduler can look after some of the work for you. I’ll talk about (not) using UTL_FILE to read files in a future post.

When WHEN went faster

Yeah…try saying that blog post title 10 times in a row as fast as you can.

But since we’re talking about doing things fast, this is just a quick post about a conversation I had on Twitter yesterday about whether the WHEN clause of a trigger performs better than the equivalent test inside the trigger body.

 


That is an easy claim to benchmark – I just need a couple of tables, each with a simple trigger, differing only in their usage of the WHEN clause. Here is my setup:


SQL> create table t1 ( x int );

Table created.

SQL> create table t2 ( x int );

Table created.

SQL> create or replace
  2  trigger TRG1
  3  before insert on t1
  4  for each row
  5  begin
  6    if sys_context('USERENV','SID') = 0 then
  7       :new.x := 0;
  8    end if;
  9  end;
 10  /

Trigger created.

SQL> create or replace
  2  trigger TRG2
  3  before insert on t2
  4  for each row
  5  when ( sys_context('USERENV','SID') = 0 )
  6  begin
  7     :new.x := 0;
  8  end;
  9  /

Trigger created.

I’m using an always-false condition so the body of the trigger will never actually do anything – only the WHEN clause or the IF test will be evaluated. Now I’m ready to slam a bunch of inserts into each table and measure the performance.


SQL> set timing on
SQL> insert into t1
  2  select 1 from dual
  3  connect by level <= 1000000;

1000000 rows created.

Elapsed: 00:00:02.52

SQL> insert into t2
  2  select 1 from dual
  3  connect by level <= 1000000;

1000000 rows created.

Elapsed: 00:00:00.41

There is an immediate urge here to go screaming from the rooftops that WHEN is the all-conquering hero of this benchmark. And yes, it is indeed faster, but make sure you keep the results in perspective. We gained back about 2 seconds for 1 million inserts. For me, once I’m inserting millions of rows into a table, I’m starting to think not about trigger performance but about whether this is the kind of table I want to be having triggers on at all.