Richard Foote did a series of informative posts on Index Compression which concludes that there is whole lot of positives about index compression, and very little negatives. But obviously one critical thing is choosing the right number of leading columns to compress. Is it just “take a guess?” . Luckily, Oracle has taken the guesswork out of it.

The ANALYZE command on an index can let you find the optimial compression count. When you do an ANALYZE INDEX command, two critical columns are populated:

SQL> desc index_stats
Name Null? Type
----------------------------- -------- --------------------
HEIGHT NUMBER
BLOCKS NUMBER
NAME VARCHAR2(128)
PARTITION_NAME VARCHAR2(128)
LF_ROWS NUMBER
LF_BLKS NUMBER
LF_ROWS_LEN NUMBER
LF_BLK_LEN NUMBER
BR_ROWS NUMBER
BR_BLKS NUMBER
BR_ROWS_LEN NUMBER
BR_BLK_LEN NUMBER
DEL_LF_ROWS NUMBER
DEL_LF_ROWS_LEN NUMBER
DISTINCT_KEYS NUMBER
MOST_REPEATED_KEY NUMBER
BTREE_SPACE NUMBER
USED_SPACE NUMBER
PCT_USED NUMBER
ROWS_PER_KEY NUMBER
BLKS_GETS_PER_ACCESS NUMBER
PRE_ROWS NUMBER
PRE_ROWS_LEN NUMBER
** OPT_CMPR_COUNT NUMBER
OPT_CMPR_PCTSAVE NUMBER**

Lets look at an example – we’ll create a table with 10 columns with all sorts of interesting distributions

SQL> drop table T purge;
Table dropped.
SQL> create table T as
2 select
3 rownum r01,
4 mod(rownum,10) r02,
5 mod(rownum,100) r03,
6 mod(rownum,1000) r04,
7 trunc(rownum/10) r05,
8 trunc(rownum/100) r06,
9 trunc(dbms_random.value(1,100)) r07,
10 trunc(dbms_random.value(1,1000)) r08 ,
11 trunc(mod(rownum,100)/10) r09,
12 mod(trunc(rownum/10),100) r10
13 from
14 ( select 1 from dual connect by level <= 1000 ),
15 ( select 1 from dual connect by level <= 1000 )
16 /
Table created.

We’ve used a technique described here https://www.youtube.com/watch?v=UonikfFgEyM to generate 1,000,000 rows.

Now with some SQL trickery, we can generate every possible combination of 3 column indexes for this table.

SQL> with inds as ( select rownum r from dual connect by level <= 10 )
2 select '(r'||i1.r||',r'||i2.r||',r'||i3.r||')' ix_cols
3 from inds i1, inds i2, inds i3
4 where i1.r != i2.r
5 and i2.r != i3.r
6 and i1.r != i3.r
7 /
IX_COLS
---------------------------------------------------------------
(r1,r2,r3)
(r1,r2,r4)
(r1,r2,r5)
(r1,r2,r6)
(r1,r2,r7)
(r10,r9,r4)
...
...
(r10,r9,r5)
(r10,r9,r6)
(r10,r9,r7)
(r10,r9,r8)

So, now we’ll create a table called IND_STATS to hold the results of our ANALYZE command, because every time you issue ANALYZE INDEX the INDEX_STATS view is cleared.

SQL> create table IND_STATS as select name, OPT_CMPR_COUNT, OPT_CMPR_PCTSAVE from index_stats where 1=0;
Table created.

And now we’ll just loop around all 720 potential indexes and work out the optimal compression.

SQL> begin
2 for i in (
3 with inds as ( select rownum r from dual connect by level <= 10 )
4 select 'IX_'||i1.r||'_'||i2.r||'_'||i3.r ix_name,
5 '(r'||to_char(i1.r,'fm00')||',r'||to_char(i2.r,'fm00')||',r'||to_char(i3.r,'fm00')||')' ix_cols
6 from inds i1, inds i2, inds i3
7 where i1.r != i2.r
8 and i2.r != i3.r
9 and i1.r != i3.r
10 )
11 loop
12 dbms_application_info.set_client_info(i.ix_name);
13 begin
14 execute immediate 'drop index IX';
15 exception
16 when others then null;
17 end;
18 execute immediate 'create index IX on t '||i.ix_cols;
19 execute immediate 'analyze index IX validate structure';
20 insert into IND_STATS select i.ix_cols, OPT_CMPR_COUNT, OPT_CMPR_PCTSAVE from index_stats;
21 commit;
22 end loop;
23 end;
24 /
PL/SQL procedure successfully completed.

So now we can take a look at the compression recommendations for all of our potential 3 column indexes. (Note: I’m not suggesting you would actually index all 720 combinations ! )

SQL> select * from ind_stats;
NAME OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
---------------------------------------- -------------- ----------------
(r01,r02,r03) 0 0
(r01,r02,r04) 0 0
(r01,r02,r05) 0 0
(r01,r02,r06) 0 0
(r01,r02,r07) 0 0
(r01,r02,r08) 0 0
(r01,r02,r09) 0 0
...
...
(r02,r06,r05) 2 24
(r02,r06,r07) 2 26
(r02,r06,r08) 2 25
(r02,r06,r09) 2 26
(r02,r06,r10) 2 26
(r02,r07,r01) 2 26
(r02,r07,r03) 3 43
(r02,r07,r04) 3 39
(r02,r07,r05) 2 26
(r02,r07,r06) 2 28
(r02,r07,r08) 2 28
(r02,r07,r09) 3 43
(r02,r07,r10) 3 37
(r02,r08,r01) 2 29
(r02,r08,r03) 3 39

You can see that some indexes probably wont benefit at all from compression, whereas for others, ALL columns become candidates for compression and might save nearly half the size of index.

So you dont need to guess at the compression for your indexes – you can work it out using ANALYZE.