More triggers are better

Yes, you read that correctly. If you have one trigger on a table, you might be surprised to find that having a second one can be the better option. Then again, I also love the sweet scent of a clickbaity, inflammatory blog post title to draw readers in 🙂 so you’ll just have to read on to see which is true.

As much as I am not a fan of using triggers to populate metadata in tables, I concede that it is a popular approach taken by many of us as we code up our applications. Sure, we’d like our application code to store that all-important “who did this change” information with each database row, but more often than not, this gets shoe-horned into the codebase via triggers as an afterthought to the development process.

But our well-read developer will often be thinking: “I’m a savvy developer – I’ve been reading blogs and websites that tell me fewer triggers are better, so I’ll use as few as I can”, which leads to a scenario like the one I’ll present now.

I’ll start with a standard parent and child table setup, connected via an obvious foreign key:


SQL> create table par (
  2    id           number(10)    not null primary key,
  3    description  varchar2(20)  not null,
  4    created_by   varchar2(20)  not null,
  5    updated_by   varchar2(20)  not null
  6    );

Table created.

SQL>
SQL> create sequence par_seq;

Sequence created.

SQL>
SQL> create sequence chd_seq;

Sequence created.

SQL>
SQL> create table chd (
  2    id           number(10)    not null primary key,
  3    par_id       number(10)    not null,
  4    constraint  fk foreign key ( par_id ) references par ( id )
  5    );

Table created.

SQL>
SQL>
SQL> create index chd_fk_idx on chd (par_id);

Index created.

I’ve got a sequence for each table, so in order to “bind” each sequence to its table’s primary key I’ll create a trigger. I’ll use that trigger to implement some other common application requirements at the same time:

  • Populate the primary key with a sequence value,
  • Populate the CREATED_BY and UPDATED_BY columns on insert of a new row,
  • Amend the UPDATED_BY column when I later update that row.

Here is a trigger to implement that.


SQL> create or replace trigger trg_par
  2  before insert or update on par
  3  for each row
  4  begin
  5      if inserting then
  6         :new.id := par_seq.nextval;
  7         :new.created_by := user;
  8         :new.updated_by := user;
  9      end if;
 10
 11      if updating then
 12         :new.updated_by := user;
 13      end if;
 14  end;
 15  /

Trigger created.

Now my application is ready to go. I’ll insert some data to verify that my trigger has not broken any functionality.


SQL> insert into par (description) values ('test');

1 row created.

SQL> insert into par (description) values ('test2');

1 row created.

SQL> select * from par;

  ID DESCRIPTION          CREATED_BY           UPDATED_BY
---- -------------------- -------------------- --------------------
   1 test                 MCDONAC              MCDONAC
   2 test2                MCDONAC              MCDONAC

2 rows selected.

SQL>
SQL> insert into chd (id,par_id) values (chd_seq.nextval,1);

1 row created.

SQL> insert into chd (id,par_id) values (chd_seq.nextval,2);

1 row created.

SQL> select * from chd;

  ID     PAR_ID
---- ----------
   1          1
   2          2

2 rows selected.

SQL>
SQL> commit;

Commit complete.

So far so good. Now let’s see what happens when I do a simple update of the DESCRIPTION column on the parent table. Note that this column is not involved in any primary key, index or foreign key relationship – it’s just a simple string column.


SQL> update par set description = 'anything' where id = 1;

1 row updated.

SQL> select locked_mode, object_name
  2  from v$locked_object l, all_objects ob
  3  where ob.object_id =l.object_id;

LOCKED_MODE OBJECT_NAME
----------- ----------------------------------------
          3 PAR
          3 CHD

2 rows selected.

Notice the locks that have been taken. Understandably, a lock was taken on the PAR table, because I’ve just updated a row there, so the database needs to make sure that no one drops the table or modifies its structure whilst I’ve got an outstanding transaction. But here is the interesting part – we also took a lock on the child table CHD. That seems superfluous because, as I said, we are not updating anything that could possibly impact the child table in any way.
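If you would rather see the lock mode spelled out than decipher the number, a decode over the documented V$LOCK mode values does the trick – mode 3 is a row exclusive (SX) table lock:

select decode(l.locked_mode, 0,'NONE', 1,'NULL', 2,'ROW SHARE (SS)',
              3,'ROW EXCLUSIVE (SX)', 4,'SHARE (S)',
              5,'SHARE ROW EXCLUSIVE (SSX)', 6,'EXCLUSIVE (X)') lock_mode,
       ob.object_name
from   v$locked_object l, all_objects ob
where  ob.object_id = l.object_id;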

Why did the database choose to add this extra lock?

The cause is the way I coded the trigger. Notice that the trigger references the primary key column in its body:


  5      if inserting then
  6         :new.id := par_seq.nextval;

Even though that part of the trigger body is not executed (because I am performing an update, not an insert), its mere presence is enough for the database to cautiously take that extra lock on the CHD table, because we “just might” be messing around with the parent table’s primary key here.

So now I’ll roll back that update and convert the single trigger into two triggers – one for insert, and one dedicated to update.


SQL> rollback;

Rollback complete.

SQL>
SQL> drop trigger trg_par ;

Trigger dropped.

SQL>
SQL> create or replace trigger trg1_par
  2  before insert on par
  3  for each row
  4  begin
  5      :new.id := par_seq.nextval;
  6      :new.created_by := user;
  7      :new.updated_by := user;
  8  end;
  9  /

Trigger created.

SQL>
SQL> create or replace trigger trg2_par
  2  before update on par
  3  for each row
  4  begin
  5      if updating then
  6          :new.updated_by := user;
  7      end if;
  8  end;
  9  /

Trigger created.

Now I’ll run the same update:


SQL> update par set description = 'anything' where id = 1;

1 row updated.

SQL> select locked_mode, object_name
  2  from v$locked_object l, all_objects ob
  3  where ob.object_id =l.object_id;

LOCKED_MODE OBJECT_NAME
----------- ----------------------------------------
          3 PAR

1 row selected.

Because the reference to the primary key column of table PAR is no longer present in the firing (update) trigger, I no longer see the lock on the CHD table.

In practice, it is unlikely that this additional lock will cause you a lot of harm – it will block some operations on the child table, such as a truncate or DDL to make a structural change, but it is reasonable to assume that these are rare occurrences. Still, I’m a fan of the philosophy of: have as few locks as possible, but always as many as required. So in this case, you might want to consider opting for two triggers instead of one.
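To see that blocking in action, with the original single-trigger update still uncommitted, a truncate of the child table from another session would be expected to fail along these lines (a sketch, not a captured transcript):

SQL> truncate table chd;
truncate table chd
               *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired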

And of course, perhaps getting to zero triggers might be the best option 🙂
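For the primary key and CREATED_BY at least, that is achievable from 12c onwards, where a sequence value (or an expression such as USER) can be used as a column default. A minimal sketch, which leaves only the on-update maintenance of UPDATED_BY to the application code:

create table par (
  id           number(10)    default par_seq.nextval not null primary key,
  description  varchar2(20)  not null,
  created_by   varchar2(20)  default user not null,
  updated_by   varchar2(20)  default user not null
  );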

Concurrency … the path to success and the path to failure

Let’s face it. Concurrency is a good thing when it comes to database applications. After all, if there is only a single user of your application, then chances are it is not a successful application 🙂. Of course there are exceptions to this rule, but by and large, most of the applications we build are going to be used by large populations of users. And given the recent publicity around user data and privacy, we can also be pretty confident that we want the data in our applications to be correct at all times.

As a developer, it is absolutely critical to keep concurrency in mind when building applications. Here is a simple demonstration to drive home the point.

The task assigned to the developer here is simple – move all of the rows satisfying a particular condition from table T1 to table T2. Let’s create our database objects first:


SQL> create table t1 ( x int, y int, z int );

Table created.

SQL> create table t2 as select * from t1 where 1=0;

Table created.

SQL>
SQL> insert into t1
  2  select rownum, rownum, rownum
  3  from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

So table T1 has 10,000 rows and T2 is empty, and our condition for determining which rows to transfer will be X being non-zero. That happens to be all of the rows, but of course it could be any filtering rule on the existing data. I’ve chosen all of the rows so I can keep the numbers easy to digest in the upcoming examples.

There is no such thing as a MOVE command in SQL, so we can implement this with two statements: 

  • INSERT-SELECT to copy the rows from T1 to T2
  • DELETE to remove the rows we just copied.

Here is a simple PL/SQL routine to do that, and a first test run of the code.


SQL>
SQL> set serverout on
SQL> begin
  2      insert into t2 select * from t1 where x != 0;
  3      delete from t1 where x != 0;
  4      commit;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from t1;

  COUNT(*)
----------
         0

1 row selected.

SQL> select count(*) from t2;

  COUNT(*)
----------
     10000

1 row selected.

SQL>

Everything looks fine, doesn’t it? But it’s not. In fact, I’ll do what any good developer should do, and add some instrumentation to the code to get more information about its operation.


SQL> create table t1 ( x int, y int, z int );

Table created.

SQL> create table t2 as select * from t1 where 1=0;

Table created.

SQL>
SQL> insert into t1
  2  select rownum, rownum, rownum
  3  from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL>
SQL> set serverout on
SQL> begin
  2      insert into t2 select * from t1 where x != 0;
  3      dbms_output.put_line(sql%rowcount);           
  4      delete from t1 where x != 0;
  5      dbms_output.put_line(sql%rowcount);
  6      commit;
  7  end;
  8  /
10000
10100      <=========== !!!!!

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from t1;

  COUNT(*)
----------
         0

1 row selected.

SQL> select count(*) from t2;

  COUNT(*)
----------
     10000

1 row selected.

SQL>
SQL>

Look at the output from the PL/SQL block. I copied (inserted) 10,000 rows from T1 to T2, but the deletion phase removed more than 10,000 rows. How was that possible? It is because databases are about concurrency. In another session, I had a simple script running that just adds new rows to the table continuously. As a developer, it is vital that you do not assume you are the only person accessing or manipulating a table. My code in its existing form has corrupted the database, because 100 rows that were never copied to T2 have been deleted. They’re gone forever!
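For reference, the script in the other session was doing something along these lines (a hypothetical sketch – any steady trickle of committed inserts that satisfy the X != 0 filter will reproduce the problem):

begin
  for i in 1 .. 1000 loop
    insert into t1 values (-i, -i, -i);  -- satisfies the x != 0 filter
    commit;
    dbms_session.sleep(0.05);            -- 18c onwards; dbms_lock.sleep in earlier releases
  end loop;
end;
/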

So… on to some potential solutions. Let’s try locking the rows first.


SQL> create table t1 ( x int, y int, z int );

Table created.

SQL> create table t2 as select * from t1 where 1=0;

Table created.

SQL>
SQL> insert into t1
  2  select rownum, rownum, rownum
  3  from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> set serverout on
SQL> declare
  2    cursor c is select * from t1 where x != 0 FOR UPDATE;
  3  begin
  4    open c;
  5    insert into t2 select * from t1 where x != 0;
  6    dbms_output.put_line(sql%rowcount);               
  7    delete from t1 where x != 0;
  8    dbms_output.put_line(sql%rowcount);
  9    commit;
 10    close c;
 11  end;
 12  /
10000
10100

PL/SQL procedure successfully completed.

SQL>
SQL> select count(*) from t1;

  COUNT(*)
----------
         0

1 row selected.

SQL> select count(*) from t2;

  COUNT(*)
----------
     10000

1 row selected.

As you can see, that has not yielded any benefit, because locking only applies to existing rows. The new rows arriving from a separate session are unhindered, and hence my DELETE statement still picks up new rows when it should not. I could work around this by locking the entire table, but what if my filtering condition matched only a small percentage of the rows? It seems overly aggressive to lock everyone out of the table just to copy a small subset of rows. In this case, I can use PL/SQL to help me out. Using the BULK COLLECT and FORALL (bulk binding) facilities, I can restrict my DML operations to just the ROWIDs of the rows I am copying.


SQL> create table t1 ( x int, y int, z int );

Table created.

SQL> create table t2 as select * from t1 where 1=0;

Table created.

SQL>
SQL> insert into t1
  2  select rownum, rownum, rownum
  3  from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> set serverout on
SQL> declare
  2    cursor c is select rowid rid, t1.* from t1 where x != 0 for update;
  3    type array is table of c%rowtype index by pls_integer;
  4    l_rows array;
  5    l_idx int;
  6  begin
  7    open c;
  8    loop
  9      fetch c bulk collect into l_rows limit 20000;
 10      exit when l_rows.count = 0;
 11
 12      forall i in 1 .. l_rows.count
 13         insert into t2 values (
 14            l_rows(i).x,
 15            l_rows(i).y,
 16            l_rows(i).z );
 17      dbms_output.put_line(sql%rowcount);       
 18      forall i in 1 .. l_rows.count
 19        delete from t1
 20        where rowid = l_rows(i).rid;
 21      dbms_output.put_line(sql%rowcount);
 22    end loop;
 23  end;
 24  /
10000
10000

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> select count(*) from t1;

  COUNT(*)
----------
       147

1 row selected.

SQL> select count(*) from t2;

  COUNT(*)
----------
     10000

1 row selected.

I am collecting the 10,000 ROWIDs of interest into an array, and then deleting only those rows. So even though new rows are arriving, I will never see them, because they will not be in the set of ROWIDs that I am moving to table T2. We can see that during the operation, 147 new rows arrived in T1 from other sessions.

(Note: normally you would choose a bulk collect size of perhaps 1000 or similar, and loop around repeatedly, to avoid excessive PGA use. I bulk collected the entire set in one pass to show the rowcounts more intuitively.)
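A more production-like shape for that loop is sketched below. Note that the commit belongs after the cursor is closed – committing mid-fetch releases the FOR UPDATE locks and raises ORA-01002 on the next fetch:

declare
  cursor c is select rowid rid, t1.* from t1 where x != 0 for update;
  type array is table of c%rowtype index by pls_integer;
  l_rows array;
begin
  open c;
  loop
    fetch c bulk collect into l_rows limit 1000;   -- modest batches to cap PGA use
    exit when l_rows.count = 0;
    forall i in 1 .. l_rows.count
      insert into t2 values (l_rows(i).x, l_rows(i).y, l_rows(i).z);
    forall i in 1 .. l_rows.count
      delete from t1 where rowid = l_rows(i).rid;
  end loop;
  close c;
  commit;   -- only after the cursor is closed, to preserve the FOR UPDATE locks
end;
/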

You might be thinking that perhaps we no longer need the “FOR UPDATE” clause on the cursor, but it is still a required element. Without it, it is possible that a different session could update a row in the candidate data set, and that updated row would still be deleted. Or a different session could even delete a row, commit, and insert a fresh row that happens to pick up the same ROWID as one from the original set.

Bottom line – keep concurrency front and centre in your mind when you are writing code to manipulate data.

Locked rows and lunch breaks? A simple fix

Even in the new world of stateless web applications, from time to time we still come across the “lunch break locking” problem. That is, someone or something commences a transaction which locks some rows, and then the session goes idle without committing or rolling back the transaction. The metaphor commonly used is “someone locks a row in a table and then goes out for lunch”, with them being totally oblivious to the havoc they have caused until they return.

With modern applications being generally stateless (or at least transactionally stateless), you would think the problem would have disappeared, but if anything, whilst the frequency of the problem has dropped, diagnosing it when it does occur is harder than ever. The anonymity of application server sessions, and the scale of web-based customer populations, mean that when a session does “lose track” of what it was doing and leaves a transaction active, it is very hard to detect until we have a major problem or outage on our hands.

But here’s a mitigation strategy you might want to consider.  Resource Manager can do some of the heavy lifting for you.  It can kill sessions that are idle but are blocking other sessions from completing their work.

Here’s a simple demo of how to put a Resource Manager plan in place to do that:


SQL> begin
  2    dbms_resource_manager.create_pending_area();
  3    --
  4
  5    dbms_resource_manager.create_consumer_group(
  6      CONSUMER_GROUP=>'CG_STOP_BLOCKERS',
  7      COMMENT=>'CG for stop blocking'
  8      );
  9
 10    dbms_resource_manager.create_plan(
 11      PLAN=> 'STOP_BLOCKERS',
 12      COMMENT=>'Plan for stop blocking'
 13    );
 14
 15    dbms_resource_manager.create_plan_directive(
 16      PLAN=> 'STOP_BLOCKERS',
 17      GROUP_OR_SUBPLAN=>'CG_STOP_BLOCKERS',
 18      COMMENT=>'Directive',
 19      MAX_IDLE_BLOCKER_TIME => 60
 20    );
 21
 22
 23    dbms_resource_manager.create_plan_directive(
 24      PLAN=> 'STOP_BLOCKERS',
 25      GROUP_OR_SUBPLAN=>'OTHER_GROUPS',
 26      COMMENT=>'leave others alone'
 27    );
 28
 29    DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
 30
 31    DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
 32
 33  end;
 34  /

PL/SQL procedure successfully completed.

The key element here is the plan directive, which sets MAX_IDLE_BLOCKER_TIME to 60 seconds. If an idle session blocks another session for that duration, Resource Manager will kill the idle session. Let’s assign the appropriate consumer group to SCOTT, enable the plan, and see it in action.


SQL> exec dbms_resource_manager_privs.grant_switch_consumer_group('SCOTT','CG_STOP_BLOCKERS',false);

PL/SQL procedure successfully completed.

SQL> exec dbms_resource_manager.set_initial_consumer_group('SCOTT','CG_STOP_BLOCKERS');

PL/SQL procedure successfully completed.

SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = STOP_BLOCKERS;

System altered.

SQL> conn scott/tiger
Connected.

SQL> create table t as select 1 x from dual;

Table created.

SQL> delete from t;

1 row deleted.

So this session now has a lock on the row, but is idle… Now we fire up another session, which will get blocked:


SQL> set timing on
SQL> delete from t;

[waiting]

After approximately 60 seconds, the second session did in fact respond:


SQL> set timing on
SQL> delete from t;

1 row deleted.

Elapsed: 00:00:58.09

So presumably the lock from session 1 has been released. If we return to session 1, we can see how the lock got released:


SQL> select * from dual;
select * from dual
            *
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 24708
Session ID: 593 Serial number: 59128


They got killed! So they can take their time at lunch, and come back with their double-choc muffins, chai latte and watercress salad… We don’t care, because our application is protected 🙂

Forever is a long time

It’s interesting that we still see lots of code out “in the wild” that lends itself to waiting forever. Now, I’m not talking here about poorly performing SQL, because (at least theoretically) no matter how bad the SQL is, presumably it will finish one day, even if that day is years into the future. What I’m referring to here is the default mechanism for locking in Oracle.

If you try to access a row that is locked by someone else, you will wait. Don’t get me wrong … that’s a very good thing. The alternative – letting two people change the same data at the same time – is equivalent to saying “I don’t care about my data at all”.

And how long will we wait? Well… we’ll wait forever. We will never stop waiting until that lock has been released.

I don’t know about you, but for me, forever is a long time. A really long time. I cannot recall ever seeing any kind of Service Level Agreement for an application’s response times stating “Yup… forever is cool. Take as long as you want” 🙂

So perhaps consider that when you’re building your applications in Oracle. Yes, the default is to wait forever, but a little code (which has been available since version 9) can really go a long way.

Rather than just coding:

select * from T where … for update

or

delete from T where …

You can opt to wait for a nominated amount of time to attempt to get that lock (and then report something sensible back to the user).


SQL> select * from T for update wait 60;

(60 seconds pass)

select * from T for update wait 60
*
ERROR at line 1:
ORA-30006: resource busy; acquire with WAIT timeout expired

Note that the error code is not the same as the one you would get with a NOWAIT clause, which is:

ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
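Within your application’s PL/SQL, you could map that error to a named exception and respond gracefully rather than waiting forever. A minimal sketch, reusing the single-column table T from the demos above:

declare
  l_x         t.x%type;
  row_locked  exception;
  pragma exception_init(row_locked, -30006);
begin
  select x into l_x
  from   t
  where  x = 1
  for    update wait 5;
  -- we got the lock; carry on with the real work here
exception
  when row_locked then
    dbms_output.put_line('Rows are locked by another session - please try again later');
end;
/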

But you don’t even need to stop there. Why not catch that error with a SERVERERROR trigger, and then do some session-level probing to get more information? You could even empower the users to take their own action 🙂


SQL> create or replace trigger lock_expiry
  2  after servererror on database
  3  declare
  4    l_err varchar2(4000);
  5  begin
  6    if ( is_servererror(30006) ) then
  7      for i in (
  8      select /*+ leading(lk sess) */ sess.sid, sess.username, sess.module
  9      from   v$lock lk,
 10             v$session sess,
 11             v$session s
 12      where  lk.id1 = s.row_wait_obj#
 13      and    lk.type = 'TM'
 14      and    lk.sid = sess.sid
 15      and    s.sid = sys_context('USERENV','SID')
 16      )
 17      loop
 18        l_err := l_err || 'Session '||i.sid||' who is '||i.username||' using '||i.module || chr(10);
 19      end loop;
 20      raise_application_error(-20000,l_err);
 21    end if;
 22  end;
 23  /

Trigger created.

SQL> select * from t for update wait 60;
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20000: Session 717 who is MCDONAC using SQL*Plus
ORA-06512: at line 18
ORA-30006: resource busy; acquire with WAIT timeout expired

Just for completeness, note that if you are having locking issues, V$SESSION gives you direct access to find out who is blocking you.


SQL> select sid, last_call_et,
  2         nvl2(lockwait,'BLOCKED',status) status,
  3         blocking_session
  4  from   v$session s;

       SID LAST_CALL_ET STATUS     BLOCKING_SESSION 
---------- ------------ ---------- ---------------- 
        39          376 INACTIVE                    
        40          412 INACTIVE
        41          412 INACTIVE
        44          421 INACTIVE
        46          340 BLOCKED                  39 
        49            4 ACTIVE
        50          453 INACTIVE
        51          453 INACTIVE


Can you lock part of a table?

I should stress that this is all well covered in the documentation, but it was something I stumbled upon just recently.

It’s common knowledge that you can lock an entire table:

SQL> create table T 
  2   ( x int ) 
  3  partition by range ( x ) 
  4  ( 
  5    partition p1 values less than (10), 
  6    partition p2 values less than (20), 
  7    partition p3 values less than (30) 
  8  );

Table created.

SQL> lock table T in exclusive mode;

Table(s) Locked.

However, you can also selectively lock just the partitions of a table, using the standard partition clause. For example:

Session 1

SQL> lock table T partition ( p1 ) in exclusive mode;

Table(s) Locked.

whilst the following is still possible in session 2:

SQL> insert into T values (20);

1 row created.
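The partition lock is still a real lock, of course – an insert that targets the locked partition should block until session 1 commits or rolls back. A sketch of what you would expect in session 2 (a value under 10 routes to the locked partition P1):

SQL> insert into T values (5);

[waiting]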