One of the cool features in 12c is the automatic collection of optimizer statistics when a table is either created or loaded via direct path from empty. This makes a lot of sense because it saves us from what used to be the mandatory second step of gathering statistics whenever we loaded an empty table with data.
For example
11.2.0.4
SQL> create table t1 as
2 select * from dba_objects;
Table created.
SQL>
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
(null)
SQL> select count(*) from t1;
COUNT(*)
----------
86802
So at this point, we’d then have to scan the table we just loaded with a DBMS_STATS call in order to come up with some representative statistics. This was fixed in 12c with “statistics on load”.
12c
SQL> create table t1 as
2 select * from dba_objects;
Table created.
SQL> – no dbms_stats call
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
78170
But this facility has been blogged about extensively, so I wont labour the point. This post is more about just a couple of things to be aware of when taking advantage of the facility.
Partitioned tables
The 12c feature is for statistics on load of the table. So if you are creating/loading a partitioned table, whilst you will still get statistics collected, they are at the table level only. For example,
SQL> create table t1
2 partition by range ( object_id )
3 interval ( 20000 )
4 (
5 partition p1 values less than ( 20000 ),
6 partition p2 values less than ( 200000 )
7 )
8 as select d.* from dba_objects d
9 where object_id is not null;
Table created.
SQL>
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
78165
SQL>
SQL> select partition_name, num_rows
2 from user_tab_partitions
3 where table_name = 'T1';
PARTITION_ NUM_ROWS
---------- ----------
P1 (null)
P2 (null)
Notice that the partition level statistics are not collected. Interestingly, indexes however will be ok whether they are created during or after the load.
--
-- indexes created after the load on the table above
--
SQL> create index ix1 on t1 ( object_id ) local;
Index created.
SQL> create index ix2 on t1 ( object_name);
Index created.
SQL>
SQL> select index_name, leaf_blocks
2 from user_indexes
3 where index_name in ( 'IX1','IX2');
INDEX_NAME LEAF_BLOCKS
------------------------------ -----------
IX1 173
IX2 535
2 rows selected.
SQL>
SQL> select index_name, partition_name, leaf_blocks
2 from user_ind_partitions
3 where index_name in ( 'IX1','IX2');
INDEX_NAME PARTITION_ LEAF_BLOCKS
------------------------------ ---------- -----------
IX1 P1 43
IX1 P2 130
--
-- indexes created as part of the table creation process
--
SQL> create table t1 (
2 owner
3 ,object_name
4 ,subobject_name
5 ,object_id
6 ,data_object_id
7 ,object_type
8 ,created
9 ,last_ddl_time
10 ,timestamp
11 ,status
12 ,temporary
13 ,generated
14 ,secondary
15 ,namespace
16 ,edition_name
17 ,sharing
18 ,editionable
19 ,oracle_maintained
20 ,application
21 ,default_collation
22 ,duplicated
23 ,sharded
24 ,created_appid
25 ,created_vsnid
26 ,modified_appid
27 ,modified_vsnid
28 ,constraint pk primary key ( object_id ) using index local
29 ,constraint uq unique ( owner, object_id ) using index
30 )
31 partition by range ( object_id )
32 interval ( 20000 )
33 (
34 partition p1 values less than ( 20000 ),
35 partition p2 values less than ( 200000 )
36 )
37 as select d.* from dba_objects d
38 where object_id is not null;
Table created.
SQL>
SQL>
SQL> select index_name, leaf_blocks
2 from user_indexes
3 where index_name in ('PK','UQ');
INDEX_NAME LEAF_BLOCKS
------------------------------ -----------
PK 163
UQ 263
2 rows selected.
SQL>
SQL> select index_name, partition_name, leaf_blocks
2 from user_ind_partitions
3 where index_name in ('PK','UQ');
INDEX_NAME PARTITION_ LEAF_BLOCKS
------------------------------ ---------- -----------
PK P1 41
PK P2 122
2 rows selected.
CTAS empty is still OK
A common trick for creating an empty table whilst copying the definition of an existing table is to use a predicate with an always false condition. For example,
SQL> create table t1 as
2 select * from dba_objects
3 where 1=0;
Table created.
Statistics on load will still take place for such a table, ie
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
0
Now you might be worried that since there are now statistics on the table, that perhaps a direct load using INSERT /*+ APPEND */, which would normally perform statistics on load will no longer work. But there is no need to panic. We’ll still determine that the table is empty, and do another statistics on load operation when you perform that direct load.
SQL> insert /*+ APPEND */ into t1
2 select * from dba_objects;
78171 rows created.
SQL> commit;
Commit complete.
SQL>
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
78171
But this also has implications if you want to retain some existing statistics on the table. Because when we truncate the table, that same reset of statistics collection can occur.
SQL> create table t1 as
2 select * from dba_objects
3 where rownum
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
10
SQL>
SQL> truncate table t1;
Table truncated.
SQL>
SQL> insert /*+ APPEND */ into t1
2 select * from dba_objects;
78171 rows created.
SQL> commit;
Commit complete.
SQL>
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
78171
Similarly, all you need is one transaction that “brings the table to life” for automatic statistics collection to be no longer active. Even if that transactions rolls back. For example:
SQL> create table t1 as
2 select * from dba_objects
3 where 1=0;
Table created.
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
0
SQL> -- a normal mode insert of a row
SQL> insert into t1
2 select * from dba_objects where rownum = 1;
1 row created.
SQL>
SQL> rollback;
Rollback complete.
SQL>
SQL> insert /*+ APPEND */ into t1
2 select * from dba_objects;
78171 rows created.
SQL> commit;
Commit complete.
SQL>
SQL> select num_rows
2 from user_tables
3 where table_name = 'T1';
NUM_ROWS
----------
0
All of these behaviour are pretty much how I would expect them to work (your opinion may differ of course ), and I know that some people have opted to disable the automatic collection altogether to avoid any ambiguity but that strikes me as overkill.
If you are really concerned about it, it is easy enough to suffix your load routines with a simple wrapper function to check the statistics and issue the appropriate DBMS_STATS call to make up the shortfall. And it would be remiss of me not to mention the latest optimizer whitepaper which covers in detail other optimizer enhancements in 12c.
Happy stats collecting !
Not sure it’s worth mentioning, doing it just in case (I’m ok if you prefer to reject the comment 🙂
Stats are gathered just at the partition level in case you specify extended partition syntax (hopefully that’s the right name), assuming anybody does that.
create table tc (object_id number, object_name varchar2(30))
partition by range ( object_id )
interval ( 20000 )
(
partition p1 values less than ( 20000 ),
partition p2 values less than ( 200000 )
);
insert /*+ APPEND */ into tc partition (p1) select mod(rownum, 20000), lpad(‘x’,30,’x’) from dual connect by rownum select num_rows from user_tables where table_name =’TC’;
NUM_ROWS
———-
Elapsed: 00:00:00.19
SQL> select partition_name, num_rows from user_tab_partitions where table_name = ‘TC’;
PARTITION_NAME NUM_ROWS
—————————— ———-
P1 1000
P2
Cheers,
Mauro
Thanks for stopping by Mauro !
With respect to: “if you prefer to reject the comment”, conversely, I think that’s actually an important case, because to leave the table stats untouched could lead to nasty contradictions like:
SQL> create table tc (object_id number, object_name varchar2(30))
2 partition by range ( object_id )
3 interval ( 20000 )
4 (
5 partition p1 values less than ( 20000 ),
6 partition p2 values less than ( 200000 )
7 );
Table created.
SQL>
SQL> insert /*+ APPEND */ into tc
2 select 100000, ‘x’ from dual connect by level commit;
Commit complete.
SQL>
SQL> select num_rows from user_tables where table_name =’TC’;
NUM_ROWS
———-
100
1 row selected.
SQL> select partition_name, num_rows from user_tab_partitions where table_name = ‘TC’;
PARTITION_ NUM_ROWS
———- ———-
P1
P2
2 rows selected.
SQL> insert /*+ APPEND */ into tc partition (p1)
2 select mod(rownum, 20000), lpad(‘x’,30,’x’) from dual connect by rownum
SQL> select num_rows from user_tables where table_name =’TC’;
NUM_ROWS
———-
100
1 row selected.
SQL> select partition_name, num_rows from user_tab_partitions where table_name = ‘TC’;
PARTITION_ NUM_ROWS
———- ———-
P1 40000
P2
2 rows selected.
Cheers,
Connor