Kris Kringle the Database – Index Compression – Learning is not a spectator sport

There is an assortment of compression options available with the Advanced Compression option, but perhaps you are not in the position to invest in that additional license. However, did you know that you can use basic index compression in any version of the database,

create index emp_ix on emp ( deptno, empno) compress 1;

would use de-duplication on the leading DEPTNO column to get more “bang for your buck” out of your index blocks.

But what about a more complex index example?


SQL> create table t as select * from dba_objects;

Table created.

SQL> create index ix on t ( owner, subobject_name, object_type, namespace, sharing, object_id ) compress ????;

Index created.

What level of compression is best? First thing I can do is look at the number of distinct values for each column in USER_TAB_COLS



SQL> select column_name, num_distinct
  2  from user_tab_cols
  3  where table_name = 'T';

COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------
OWNER                                    48   <<==
OBJECT_NAME                           69560
SUBOBJECT_NAME                         5351   <<==
OBJECT_ID                             88503
DATA_OBJECT_ID                        15496
OBJECT_TYPE                              52   <<==
CREATED                                4744
LAST_DDL_TIME                          5068
TIMESTAMP                              5195
STATUS                                    2
TEMPORARY                                 2
GENERATED                                 2
SECONDARY                                 2
NAMESPACE                                25   <<==
EDITION_NAME                              0
SHARING                                   4   <<==
EDITIONABLE                               2
ORACLE_MAINTAINED                         2
APPLICATION                               1
DEFAULT_COLLATION                         1
DUPLICATED                                1
SHARDED                                   1
IMPORTED_OBJECT                           1
CREATED_APPID                             0
CREATED_VSNID                             0
MODIFIED_APPID                            0
MODIFIED_VSNID                            0

27 rows selected.

Should I compress all columns? OBJECT_ID looks like a unique values, so perhaps I omit that (because there will likely be no duplicates)? But what about SUBOBJECT_NAME? Is 5351 values too many (when combined with the other keys)? These are tough questions to answer.

Luckily the old ANALYZE INDEX command can help us out. I’ll create the index and then VALIDATE the index


SQL> create index ix on t ( owner, subobject_name, object_type, namespace, sharing, object_id );

Index created.

SQL> analyze index ix validate structure;

Index analyzed.

SQL>
SQL> select * from index_stats
  2  @pr
==============================
HEIGHT                        : 3
BLOCKS                        : 768
NAME                          : IX
PARTITION_NAME                :
LF_ROWS                       : 88520
LF_BLKS                       : 676
LF_ROWS_LEN                   : 4824277
LF_BLK_LEN                    : 8000
BR_ROWS                       : 675
BR_BLKS                       : 6
BR_ROWS_LEN                   : 34123
BR_BLK_LEN                    : 8032
DEL_LF_ROWS                   : 0
DEL_LF_ROWS_LEN               : 0
DISTINCT_KEYS                 : 88506
MOST_REPEATED_KEY             : 12
BTREE_SPACE                   : 5456192
USED_SPACE                    : 4858400
PCT_USED                      : 90
ROWS_PER_KEY                  : 1.00015818136623505751022529546019478905
BLKS_GETS_PER_ACCESS          : 4.00007909068311752875511264773009739453
PRE_ROWS                      : 0
PRE_ROWS_LEN                  : 0
OPT_CMPR_COUNT                : 5
OPT_CMPR_PCTSAVE              : 64
DEL_LF_CMP_ROWS               : 0
PRG_LF_CMP_ROWS               : 0
LF_CMP_ROWS                   : 0
LF_CMP_ROWS_LEN               : 0
LF_UNCMP_ROWS                 : 88520
LF_UNCMP_ROWS_LEN             : 4824277
LF_SUF_ROWS_LEN               : 0
LF_CMP_ROWS_UNCMP_LEN         : 0
LF_CMP_RECMP_COUNT            : 0
LF_CMP_LOCK_VEC_LEN           : 0
LF_CMP_BLKS                   : 0
LF_UNCMP_BLKS                 : 676

PL/SQL procedure successfully completed.

SQL>

Notice the columns OPT_CMPR_COUNT and OPT_CMPR_PCTSAVE. This represents the optimal number of columns to compress, and what percentage space savings I should expect. Hence in this case, I would recreate the index with:


SQL> create index ix on t ( owner, subobject_name, object_type, namespace, sharing, object_id ) compress 5;

Index created.

SQL> analyze index ix validate structure;

Index analyzed.

SQL> select * from index_stats
  2  @pr
==============================
HEIGHT                        : 3
BLOCKS                        : 256
NAME                          : IX
PARTITION_NAME                :
LF_ROWS                       : 88520
LF_BLKS                       : 239
LF_ROWS_LEN                   : 1710308
LF_BLK_LEN                    : 7996
BR_ROWS                       : 238
BR_BLKS                       : 3
BR_ROWS_LEN                   : 10707
BR_BLK_LEN                    : 8032
DEL_LF_ROWS                   : 0
DEL_LF_ROWS_LEN               : 0
DISTINCT_KEYS                 : 88506
MOST_REPEATED_KEY             : 12
BTREE_SPACE                   : 1935140
USED_SPACE                    : 1721015
PCT_USED                      : 89
ROWS_PER_KEY                  : 1.00015818136623505751022529546019478905
BLKS_GETS_PER_ACCESS          : 4.00007909068311752875511264773009739453
PRE_ROWS                      : 6301
PRE_ROWS_LEN                  : 305041
OPT_CMPR_COUNT                : 5
OPT_CMPR_PCTSAVE              : 0
DEL_LF_CMP_ROWS               : 0
PRG_LF_CMP_ROWS               : 0
LF_CMP_ROWS                   : 88520
LF_CMP_ROWS_LEN               : 305041
LF_UNCMP_ROWS                 : 0
LF_UNCMP_ROWS_LEN             : 0
LF_SUF_ROWS_LEN               : 1405267
LF_CMP_ROWS_UNCMP_LEN         : 4824277
LF_CMP_RECMP_COUNT            : 0
LF_CMP_LOCK_VEC_LEN           : 0
LF_CMP_BLKS                   : 239
LF_UNCMP_BLKS                 : 0

PL/SQL procedure successfully completed.

I went from 676 lead blocks to 239 leaf blocks, giving me about 65% compression benefit, which is very close to the 64% estimate.

Ho Ho Ho… Merry Christmas!

Footnote: Be aware the ANALYZE INDEX will lock the underlying object. Don’t do this on your live Production system.

5 responses to “Kris Kringle the Database – Index Compression”

Kaley Crum

December 6, 2024 at 5:13 pm

I’ve had some scenarios where ANALYZE INDEX thinks that adding compression will save space, but once key compression is applied, it actually ends up making the index bigger. It’s rare though.

It may be worth it to measure index size before compression vs after compression to make sure it “worked”

1. Connor McDonald
  
  December 6, 2024 at 7:08 pm
  
  Agreed- but didnt my post do exactly that?
  
  1. Kaley Crum
    
    December 10, 2024 at 7:51 pm
    
    It did. My comment was less “your post should have done this” and more “one can’t always trust the estimate provided to you by ANALYZE INDEX.” In your case, the estimate was close to the actual space savings (64% vs 65%) but I’m saying I’ve had some scenarios where it’s been waaay off, and even as far as expecting space savings but actually increasing the size of the index.
    
2. Rajeshwaran Jeyabal
  
  December 10, 2024 at 1:07 pm
  
  In that case we need to consider using Advance index compression, which enables Oracle database to automatically just compress only those index leaf blocks where compression is beneficial.
  
  1. Kaley Crum
    
    December 10, 2024 at 7:52 pm
    
    Extra license cost