You might be thinking that FIRST_ROWS is pretty much the same as FIRST_ROWS_1, since they are both after the first row. However, there are some important distinctions to make here. The traditional ‘FIRST_ROWS’ parameter is not a genuine cost-based option, more a hybrid between the rule and cost. This can be observed via the following example:
SQL> create table T
2 as select rownum x, rpad(rownum,100,'x') padding
3 from dual
4 connect by level <= 2000
5 order by dbms_random.value;
Table created.
SQL>
SQL> create index IX on T ( x ) ;
Index created.
SQL> set autotrace traceonly explain
SQL>
SQL> select * from T where x < 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 10395 | 12 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 99 | 10395 | 12 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"<100)
SQL> alter session set optimizer_mode = first_rows;
Session altered.
SQL>
SQL> set autotrace traceonly explain
SQL> select * from T where x < 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 3947747388
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 10395 | 99 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 99 | 10395 | 99 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IX | 99 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"<100)
Notice that the optimizer has chosen a path with a HIGHER cost, so in the case of first_rows, it is not solely the cost that plays a part as a determining factor. There are some heuristics in place to bias the optimizer toward index access.
You could quite naturally assume that the FIRST_ROWS_1 is the same as FIRST_ROWS, but this is NOT the case. The optimizer use the cost calculation and no other heuristics. You can see that the ‘n’ in FIRST_ROWS_n as an important driver in determining cardinalities for query result sets.
SQL> alter session set optimizer_mode = first_rows_1;
Session altered.
SQL> select * from T where x < 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 3947747388
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 210 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 2 | 210 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IX | | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"<100)
SQL> alter session set optimizer_mode = first_rows_10;
Session altered.
SQL> select * from T where x < 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 3947747388
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 1155 | 13 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T | 11 | 1155 | 13 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IX | | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("X"<100)
SQL>
SQL> alter session set optimizer_mode = first_rows_100;
Session altered.
SQL> select * from T where x < 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 10395 | 12 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T | 99 | 10395 | 12 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("X"<100)
SQL>
SQL>
So we can start to see the methodology at work here. The optimizer is “taking our word” in that, (independent of what the table statistics imply), it will only need to use the first ‘n’ rows in the relevant source. It is a true cost adjustment rather than a change to the optimization scheme (as the original FIRST_ROWS is)