For almost as long as I can remember, the optimizer has had a nifty little trick when you (in effect) try to combine two different usage models within a single SQL. To explain that, I’m referring to the common scenario of: “If a bind value is provided by the caller, then use it, otherwise it shouldn’t limit the result set. So we commonly see queries like:
select *
from MY_TABLE
where COL1 = NVL(:mybindvar, COL1)
[For ease of discussion, we’ll assume COL1 is not nullable]
Anyway, the nice little optimizer trick was to optimize the query to handle the two separate use cases, so you see a CONCATENATION step in the execution plan, and two FILTER’s, one to handle the case when the bind variable is null, and one to handle the case where it is provided.
-----------------------------------------------
| Id | Operation |
-----------------------------------------------
| 0 | SELECT STATEMENT |
| 1 | CONCATENATION |
|* 2 | FILTER |
|* 5 | FILTER |
-----------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(:B1 IS NULL)
5 - filter(:B1 IS NOT NULL)
In effect, the optimizer has taken care of the old SQL tuning advice we used to give to the developers to rewrite the SQL as a UNION ALL, or even split it into two distinct SQL statements to cater for each case. With that in mind, I picked up a change to this behaviour in 12.1.0.2 (and some additional work by Jonathan Lewis suggests 11.2.0.4 as well), where bind peeking seems to create some confusion.
Let’s look at an example. I’ll create a table, populate it with approx 4million rows, where two cols are of interest:
- SEQ, 2m distinct values and hence highly selective
- PSEQ, only 2 distinct values, and hence not a great option for an index path
The primary key is the composite of these two columns, and a secondary index on PSEQ
SQL> create table T (
2 seq int, pseq int, blah char(30),
3 constraint T_PK primary key (seq,pseq)
4 )
5 /
Table created.
SQL> insert into T
2 select trunc(rownum/2) seq,
3 mod(rownum,2) pseq,
4 'x' blah
5 from
6 ( select 1 from dual connect by level < 1000 ),
7 ( select 1 from dual connect by level < 4000 )
8 /
3995001 rows created.
SQL> create index T_IX2 on T ( pseq );
Index created.
SQL> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
PL/SQL procedure successfully completed.
Now we’ll execute an SQL in the form previously mentioned, and take the case where the bind variable in the NVL is null.
SQL> variable b1 number
SQL> variable b2 number
SQL> exec :b1 := null;
SQL> exec :b2 := 1706496;
SQL> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = NVL(:B1 ,pseq)
5 /
SEQ PSEQ BLAH
---------- ---------- ------------------------------
1706496 0 x
1706496 1 x
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0caq50rvfkub5, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
Plan hash value: 3837764478
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 |00:00:00.01 | 6 |
| 1 | CONCATENATION | | 1 | | 2 |00:00:00.01 | 6 |
|* 2 | FILTER | | 1 | | 2 |00:00:00.01 | 6 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 2 | 2 |00:00:00.01 | 6 |
|* 4 | INDEX RANGE SCAN | T_PK | 1 | 2 | 2 |00:00:00.01 | 4 |
|* 5 | FILTER | | 1 | | 0 |00:00:00.01 | 0 |
|* 6 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 0 | 1 | 0 |00:00:00.01 | 0 |
|* 7 | INDEX RANGE SCAN | T_IX2 | 0 | 1 | 0 |00:00:00.01 | 0 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(:B1 IS NULL)
4 - access("SEQ"=:B2)
filter("PSEQ" IS NOT NULL)
5 - filter(:B1 IS NOT NULL)
6 - filter("SEQ"=:B2)
7 - access("PSEQ"=:B1)
30 rows selected.
Because :B1 is null, you can see from the Predicate Information, and from the Actual Rows information, that we took the execution path in lines 3-4. But take a peek (no pun intended) at lines 6-7. That path, whilst not used, is proposing the use of index T_IX2, which as we know is 2 distinct keys across 4 million rows. The problem is … we’ve now loaded that execution plan into our library cache. So let’s see what happens when we exercise that part of the plan when we this time specify both bind variables
SQL> exec :b1 := 1;
SQL> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = NVL(:B1 ,pseq)
5 /
SEQ PSEQ BLAH
---------- ---------- ------------------------------
1706496 1 x
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0caq50rvfkub5, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
Plan hash value: 3837764478
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.70 | 28124 | 178 |
| 1 | CONCATENATION | | 1 | | 1 |00:00:00.70 | 28124 | 178 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.01 | 0 | 0 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 0 | 2 | 0 |00:00:00.01 | 0 | 0 |
|* 4 | INDEX RANGE SCAN | T_PK | 0 | 2 | 0 |00:00:00.01 | 0 | 0 |
|* 5 | FILTER | | 1 | | 1 |00:00:00.70 | 28124 | 178 |
|* 6 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 1 | 1 |00:00:00.70 | 28124 | 178 |
|* 7 | INDEX RANGE SCAN | T_IX2 | 1 | 1 | 1997K|00:00:00.63 | 3898 | 178 |
----------------------------------------------------------
Ouch…2000 million rows scanned in the index. Since we provided both bind variables, the obvious access path should have been a simple primary key lookup. But the path derived from our first execution (with :B1 being null) has left behind a “nasty legacy”.
The troubling thing about this (in terms of application stability) is that the order in which we run queries now impacts the performance of how they run. Lets clear out the cursor by recalculating stats and then reverse the order of execution.
SQL> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
PL/SQL procedure successfully completed.
SQL> exec :b1 := 1;
SQL> exec :b2 := 1706496;
PL/SQL procedure successfully completed.
SQL> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = NVL(:B1 ,pseq)
5 /
SEQ PSEQ BLAH
---------- ---------- ------------------------------
1706496 1 x
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 0caq50rvfkub5, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
Plan hash value: 933468988
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 4 |
| 1 | CONCATENATION | | 1 | | 1 |00:00:00.01 | 4 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.01 | 0 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 0 | 2 | 0 |00:00:00.01 | 0 |
|* 4 | INDEX RANGE SCAN | T_PK | 0 | 2 | 0 |00:00:00.01 | 0 |
|* 5 | FILTER | | 1 | | 1 |00:00:00.01 | 4 |
| 6 | TABLE ACCESS BY INDEX ROWID | T | 1 | 1 | 1 |00:00:00.01 | 4 |
|* 7 | INDEX UNIQUE SCAN | T_PK | 1 | 1 | 1 |00:00:00.01 | 3 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(:B1 IS NULL)
4 - access("SEQ"=:B2)
filter("PSEQ" IS NOT NULL)
5 - filter(:B1 IS NOT NULL)
7 - access("SEQ"=:B2 AND "PSEQ"=:B1)
When we optimize the query for the initial case of both bind variables provided, you can see that both sides of the CONCATENATION have yielded a sensible path.
Hi Connor.
I hate bind peeking…
I think it’s worth mentioning that at least Adaptive Cursor Sharing fixes this for later executions.
So actually repeating your test WITHOUT clearing out the cursor (by recalculating stats) will yield the same results (but due to different reasons). In this case, after the first two executions we’ll have a single child cursor with IS_BIND_AWARE=’N’ and IS_SHAREABLE=’Y’. After the third execution this child will have IS_SHAREABLE=’N’ and a second child will be created (with the good plan) with IS_BIND_AWARE=’Y’ and IS_SHAREABLE=’Y’.
Thanks,
Oren.
Hi Oren, just a question.
Does the BIND_AWARE hint resolve this problem in this kind of statements ?
Thanks,
Atta
Hi Connor & Oren,
I just happened to found this post while reading another one …
Regardless of the “optimizer trick” of turning the NVL condition into a CONCATENATION operation,
the best execution plan for BOTH branches should still have been the scanning by the T_PK index,
with the only difference that one branch would have an INDEX UNIQUE SCAN and the other an INDEX RANGE SCAN.
The column statistics for the two columns should have been sufficient for this decision even on the first execution, where bind peeking saw :B1 as NULL.
In this case, since the query contains the SEQ=:B2 condition and SEQ is the much more selective column,
the optimizer could have even bypassed the step of converting the NVL condition into a CONCATENATION.
If the query had been
SELECT …
WHERE seq = NVL(:B1, seq) AND pseq = :B2
then we have indeed a case for having different plans used for the two concatenation branches,
but, again, optimizing the two cases, I would have expected an INDEX UNIQUE SCAN on one branch
and a FULL TABLE SCAN on the other one.
Am I missing something here ?
Thanksa a lot & Best Regards,
Iudith Mentzel
Hi Iudith.
The sub-plan that the optimizer generates for the second branch of the CONCATENATION is for the case where :B1 IS NOT NULL, so the actual predicate is PSEQ=:B1.
In Connor’s first example, :B1 := null, so when the optimizer peeks at :B1 it “realizes” that the selectivity of PSEQ=:B1 will be 0. On the other hand, the selectivity of SEQ=:B2 is very small, but still more than 0. So using T_IX2 may seem better than using T_PK.
Thanks,
Oren.
Hi Connor.
The change in the behavior is not related to the “nifty little trick”.
Actually, this is a bug fix:
Bug 14127824 Incorrecty selectivity for peeked binds when they are NULL
We can repeat your test with a simpler example and observe the same results:
ORA$BASE> variable b1 number
ORA$BASE> variable b2 number
ORA$BASE> exec :b1 := null;
ORA$BASE> exec :b2 := 1706496;
ORA$BASE>
ORA$BASE> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = :B1
5 /
no rows selected
ORA$BASE> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 86qkrwkq3mpcz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq = :B1
Plan hash value: 179449122
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 1 | 1 | 0 |00:00:00.01 |
|* 2 | INDEX RANGE SCAN | T_IX2 | 1 | 1 | 0 |00:00:00.01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("SEQ"=:B2)
2 - access("PSEQ"=:B1)
21 rows selected.
ORA$BASE> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
ORA$BASE> exec :b1 := 1;
PL/SQL procedure successfully completed.
ORA$BASE> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = :B1
5 /
SEQ PSEQ BLAH
---------- ---------- ------------------------------------------------------------------------------------------
1706496 1 x
ORA$BASE> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 86qkrwkq3mpcz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq = :B1
Plan hash value: 1303508680
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 4 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 1 |00:00:00.01 | 4 |
|* 2 | INDEX UNIQUE SCAN | T_PK | 1 | 1 | 1 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SEQ"=:B2 AND "PSEQ"=:B1)
20 rows selected.
And setting optimizer_features_enable to versions before 11.2.0.4 produces the “bad old” plan:
ORA$BASE> alter session set optimizer_features_enable='11.2.0.3';
Session altered.
ORA$BASE> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
PL/SQL procedure successfully completed.
ORA$BASE>
ORA$BASE> set serveroutput off
ORA$BASE> variable b1 number
ORA$BASE> variable b2 number
ORA$BASE> exec :b1 := null;
PL/SQL procedure successfully completed.
ORA$BASE> exec :b2 := 1706496;
PL/SQL procedure successfully completed.
ORA$BASE>
ORA$BASE> select /*+ gather_plan_statistics */ *
2 from T
3 WHERE seq = :B2
4 AND pseq = :B1
5 /
no rows selected
ORA$BASE> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 86qkrwkq3mpcz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= :B1
Plan hash value: 1303508680
------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 1 | 0 |00:00:00.01 |
|* 2 | INDEX UNIQUE SCAN | T_PK | 1 | 1 | 0 |00:00:00.01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SEQ"=:B2 AND "PSEQ"=:B1)
20 rows selected.
Thanks,
Oren.
Hi Oren,
Referring back to the original query, which has the NVL :
I think that there is an inconsistent approach used here by the optimizer, because,
after correctly expanding the NVL to CONCATENATION, it seems to optimize the two branches differently:
When B1 is NULL, the query becomes SEQ = :B2 and PSEQ = PSEQ,
which is the same as SEQ = :B2 and PSEQ IS NOT NULL,
so this is the query that should be optimized ( and not SEQ = :B2 and PSEQ = :B1 )
and the correct plan is chosen for the first branch, using an INDEX RANGE SCAN on T_PK.
When B1 is NOT NULL, the query to be optimized becomes SEQ = :B2 and PSEQ = :B1,
and here the wrong plan is chosen, using index T_IX2.
I don’t see how the optimizer arrived at all to evaluating predicate PSEQ = :B1 when B1 IS NULL,
and thus erroneously conclude that index T_IX2 should be used when B1 IS NOT NULL !
As by my understanding, the expansion of NVL to a CONCATENATION is an heuristic change,
which happens independently of the bind variables peeking.
At most, bind variable peeking can be used after this transformation, while separately optimizing each of the two branches, to choose the best plan for those variable values.
But here it seems to “mix up” the queries of the two branches, likely because of the bind variable peeking.
Regarding the simpler query in your last post, it looks to me that the current version behavior is THE BAD ONE, in comparison with what you call “the BAD OLD ONE”.
For query SEQ = :B2 and PSEQ = :B1, when Oracle peeks at the bind variables and discovers that
B1 is NULL, it can at most conclude heuristically that the query will not return any rows for this particular case, and, if it is basing its execution plan strictly on these bind variables values, it could generate
a plan that does not access the tables at all, but instead returns “no rows selected”,
which is of course wrong for being “adopted” as a shared plan.
In the description of BUG 14127824, the complaint is that the optimizer seems to ignore the fact
that B1 is null, and instead of choosing a cardinality of “1” ( as a rounding of 0 ), it chooses the
“mathematical” cardinality, which in fact is the correct one.
But, in the case of our query, with two predicates, the result of “correcting” that bug was that index T_IX2
was preferred upon index T_PK, which is at least equally wrong.
In my opinion, considering a NULL peeked bind variable value as being representative for a
“low cardinality of 1 row” value is a wrong approach, so it looks to me that the “correction” of that bug
just makes things worse.
I don’t think that a NULL value for a bind variable “does represent more” any low cardinality value
than it does a high cardinality one or any other value.
Even ignoring at all the value of that bind variable could have been a better approach.
Looking from this point of view, it seems that Oren is right, bind variable peeking is indeed a bad thing,
because no variable value can be representative for any other value.
By the way, I know that bind variable peeking could always be disabled, already since it was introduced first, by setting “_OPTIM_PEEK_USER_BINDS ” = FALSE.
With the introduction of adaptive cursor sharing, maybe this is less relevant.
As far as I am aware, using the BIND_AWARE hint ( as asked by a previous poster above )
could have corrected the problem of seeing a correct plan only on the 3rd execution.
Thanks a lot & Best Regards,
Iudith