Why indexes are so fast at finding a key

Posted by

Glance through any text book, blog post, video tutorial etc on the topic of database indexes and you’ll typically chance upon a pictorial representation of the index structure along the following lines:

image

There is one block at the top representing the root, which blossoms out to branches and ultimately to leaves at the bottom. There is that skeuomorphic appeal to presenting a structure called a “B-tree” as … well .. a tree Smile. When we think of a tree in nature, we naturally (see what I did there? Smile) think about the branches regularly splitting into smaller branches ad infinitum until we get to the leaves where our index data resides.

image

But whilst appealing to the eye, it does not reflect the reality of an index structure in the Oracle database. Rather think about a tree, you are better off thinking more like a shrub…a really really flat, wide shrub. Consider the following demo where I repeatedly load a table which is indexed on a simple ascending sequence number, with an ever increasing amount of rows, and then check the height of the index structure with each iteration:



SQL> create table t ( x int );

Table created.

SQL> create unique index ix on t ( x );

Index created.

SQL> set serverout on
SQL> declare
  2    lev int;
  3  begin
  4    for i in 0 .. 30
  5    loop
  6      execute immediate 'truncate table t';
  7      execute immediate 'drop index ix';
  8      insert /*+ APPEND */ into t
  9      select rownum
 10      from ( select 1 from dual connect by level <= 1000 ),
 11           ( select 1 from dual connect by level <= 1000 ),
 12           ( select 1 from dual connect by level <= 1000 )
 13      where rownum < power(2,i);
 14      execute immediate 'create unique index ix on t ( x ) ';
 15
 16      select blevel into lev from user_indexes
 17      where index_name = 'IX';
 18      dbms_output.put_line(power(2,i)||' '||lev);
 19      exit when lev=3;
 20    end loop;
 21  end;
 22  /
1 0
2 0
4 0
8 0
16 0
32 0
64 0
128 0
256 0
512 1
1024 1
2048 1
4096 1
8192 1
16384 1
32768 1
65536 1
131072 1
262144 1
524288 2
1048576 2
2097152 2
4194304 2
8388608 2
16777216 2
33554432 2
67108864 2
134217728 2
268435456 3

I’m measuring the BLEVEL which is always one less than what we would consider the “height” of an index, because BLEVEL is the number of branch levels. From the output, you can observe:

  • For less than 500 rows, the index is simply a leaf (which is also the root). There are no branches at all.
  • Just one level of branches can accommodate somewhere between 260,000 and 512,000 rows.
  • Two levels of branches gets me to over 130 million rows

Given that each level is going up by approximately 1000x, the third branch will probably support hundreds of billions of rows. (I was not prepared to wait for the script to find the upper bound!). It is not even possible to really illustrate that amount of “flatness” or “broadness” of an index, but here’s my best attempt Smile

image

This means that locating a single index key amongst billions of rows, the ultimate “needle in a haystack” will only require 3 hops before we land on the correct leaf. That is a very wide shrub!

But one parting thought: Notice I titled this post “finding a key” not “finding whatever data you want in a table”. Due to the flat structure, it will always be incredibly fast to find that first entry in an index for a nominated index key. That however is far cry from saying that indexes are always the best mechanism to find data in your table!

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.