Even in the new world of stateless web applications, from time to time we still come across the “lunch break locking” problem. That is, someone or something, commences a transaction which locks some rows, and then their session goes idle, without committing or rolling back the transaction. The metaphor commonly used was the “someone locks a row in a table and then goes out for lunch”, with them being totally oblivious to the havoc they have caused until they return.
With modern applications being generally stateless (or transactionally stateless) you would think the problem would have disappeared, but if anything, whilst the frequency of the problem has dropped, the diagnosis of the problem when it occurs is harder than ever. The anonymity of application server sessions, and the scale of web-based customers means when a session does “lose track” of what it was doing and leaves a transaction active, it is very hard to detect until we have a major problem or outage on our hands.
But here’s a mitigation strategy you might want to consider. Resource Manager can do some of the heavy lifting for you. It can kill sessions that are idle but are blocking other sessions from completing their work.
Here’s a simple demo of how to put a resource manager plan in place to do that
SQL> begin 2 dbms_resource_manager.create_pending_area(); 3 -- 4 5 dbms_resource_manager.create_consumer_group( 6 CONSUMER_GROUP=>'CG_STOP_BLOCKERS', 7 COMMENT=>'CG for stop blocking' 8 ); 9 10 dbms_resource_manager.create_plan( 11 PLAN=> 'STOP_BLOCKERS', 12 COMMENT=>'Plan for stop blocking' 13 ); 14 15 dbms_resource_manager.create_plan_directive( 16 PLAN=> 'STOP_BLOCKERS', 17 GROUP_OR_SUBPLAN=>'CG_STOP_BLOCKERS', 18 COMMENT=>'Directive', 19 MAX_IDLE_BLOCKER_TIME => 60 20 ); 21 22 23 dbms_resource_manager.create_plan_directive( 24 PLAN=> 'STOP_BLOCKERS', 25 GROUP_OR_SUBPLAN=>'OTHER_GROUPS', 26 COMMENT=>'leave others alone' 27 ); 28 29 DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA; 30 31 DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA(); 32 33 end; 34 / PL/SQL procedure successfully completed.
The key element here is the plan directive, which sets MAX_IDLE_BLOCKER_TIME to 60 seconds. So if an idle session is blocking another for that duration, resource manage will kill the session. Let’s assign the the appropriate consumer group to SCOTT, enable the plan and see it in action
SQL> exec dbms_resource_manager_privs.grant_switch_consumer_group('SCOTT','CG_STOP_BLOCKERS',false); PL/SQL procedure successfully completed. SQL> exec dbms_resource_manager.set_initial_consumer_group('SCOTT','CG_STOP_BLOCKERS'); PL/SQL procedure successfully completed. SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = STOP_BLOCKERS; System altered. SQL> conn scott/tiger Connected. SQL> create table t as select 1 x from dual; Table created. SQL> delete from t; 1 row deleted.
So this session now has a lock on the row, but is now idle…Now we fire up another session who will get blocked
SQL> set timing on SQL> delete from t; [waiting]
After approximately 60 seconds, the second session did in fact respond
SQL> set timing on SQL> delete from t; 1 row deleted. Elapsed: 00:00:58.09
So presumably the lock from session 1 has been released. If we return to session 1, we can see how the lock get released
SQL> select * from dual; select * from dual * ERROR at line 1: ORA-03113: end-of-file on communication channel Process ID: 24708 Session ID: 593 Serial number: 59128
They got killed ! So they can take their time at lunch, and come back with their double-choc muffins, chai latte and watercress salad…We don’t care, because our application is protected
If all else fails bring out the horse’s head!
one of the problems that I regularly see at customer sites is that at least up to and including 126.96.36.199 the sessions get only “killed” using the regular internal “kill” procedure, no matter if killed via Resource Manager or traditional profiles idle time rules.
Although this frees up locks on transaction level etc., it doesn’t free up Parallel Slaves allocated to cursors that are still open (think of GUI tools like SQLDeveloper and only the first n rows fetched before leaving for lunch).
Hence these customers do have the problem that they might have many more allocated Parallel Slaves than necessary with all kinds of possible side effects (downgrades, or queuing with new Auto DOP).
These Parallel Slaves are only freed up when the killed session gets the confirmation “your session has been killed”, which requires another interaction of that client. As long as this doesn’t happen the Parallel Slaves are not freed up.
I notice now above that you get a “ORA-03113”, and not the “ORA-00028 your session has been killed”, so has this behaviour maybe changed in 12c, meaning a “hard” disconnect instead of graceful “kill”, which would be great, in the hope that Parallel Slaves are now also freed up as soon as the session gets killed.
Thanks for stopping by Randolf. I have to admit, I’ve never look at this with parallel in mind. I do wish that in SQL Dev that if a query was run in parallel then (by default) all rows are returned, because the concept of incremental fetch strikes me as counter intuitive to running something in parallel.
Thanks for this.
Very useful for one of our databases where a state-of-the-art (NOT!) Java application (could it possibly be anything else?) is hell bent on randomly creating row lock contention almost every day…
This is something I didn’t know about, and I think will be really useful. Thank you.
Only comment is that ‘set_initial_consumer_group’ is deprecated and ‘set_consumer_group_mapping’ is a much more flexible way of putting a session into a consumer group.
I appreciate that it’s not as simple for a demonstration though.
Thanks for the feedback Phil. I have to admit – it was laziness to use “set_initial_consumer_group” because I just re-hashed an old script for the demo. So it was totally valid to call me out on that 🙂
Reblogged this on Paolocastle's Blog and commented:
Thank you Connor McDonald , I finally found a very good reason to start using resource manager !
I like the idea of this, however unfortunately some applications such as Oracle E-Business Suite attempt to acquire locks with nowait specified, then they catch the exception and report to the user that the record is locked, rather than sitting and waiting on a lock. So you never actually see blocking locks on the database.
This is something I was looking for a longtime and never had time to put a fix to it even manually, let alone using resource manager. Thank you, this nice and neat demo made my day.
We do have a fairly large complex ERP system and I was wondering how I can implement this using resource manager to the minimum possible. In other words, while I want to implement the above solution – I do want to keep the scope as limited as possible in terms of implementing it.
Is there a way to assign the consumer group to all the users that way – I am implementing this feature globally, not by adding each user to the group?
Thank for your post, very useful.
I have one question, are there a log to see which session had blocked and a what time? Exist a view or log to check it?
Regards in advanced.
If you are looking for something has occurred in the past, then v$active_session_history is probably your best bet.
Thank you for your fast response,
I configured as you posted the Resource Manager “CG_STOP_BLOCKERS” and one session is blocked other, and the resource manager kill the blocker session; in that view “v$active_session_history” I can see the session blocked that was killed?
Thank you very much for your help.
The session that was blocked will have entries in ASH, and the BLOCKING_… columns should be populated to tell you *who* did the blocking.
Thank you for your support.