Hyper-partitioned index avoidance thingamajig

As you can tell, I have no idea on a name for what I am about to describe. So let me start from the beginning, and set the scene for an idea I have to utilize a cool new 18c feature.

Often in a transactional-style system the busiest table (let us call it SALES for the sake of this discussion) is also

  • the biggest table, after all, it has all of our sales in it,
  • the most in-demand table, in that almost every query in our application wants to access it in some way, shape or form.

This is in effect the database version of the Pareto Principle. Everyone wants a slice of that SALES “pie”, and the piece of that pie that is in most demand is typically the most recent data. Your application may have pages that will be showing:

  • sales in the past hour,
  • sales in the past day,
  • products sold in the past hour,
  • largest purchase amount for sales in the past few hours,
  • suspicious activity for today,
  • a customer's own sales for the last few days

The list goes on and on, the common theme being that the data being requested is bound by a range of time in the recent past. Our SALES table will already have a primary key, probably some sort of unique transaction ID for each purchase, but to satisfy the style of requests in the list above, we will probably need:

  • an index on the time(stamp) of the sale,
  • an index on the product ID that was sold,
  • an index on the customer ID who made the purchase,
  • potentially even an index on the sale amount
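In DDL terms, that conventional approach might look something like the sketch below. The index names are hypothetical, and the column names (ts, product, customer, amt) are assumptions that happen to match the SALES table built later in this post.

```sql
-- Hypothetical index names; one index per style of request listed above.
create index sales_ts_ix       on sales ( ts );        -- time(stamp) of the sale
create index sales_product_ix  on sales ( product );   -- product ID that was sold
create index sales_customer_ix on sales ( customer );  -- customer who made the purchase
create index sales_amt_ix      on sales ( amt );       -- sale amount
```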

You can see the troubled waters into which we are sailing here. It is our biggest and busiest table, and here we are, adding index after index after index to improve query performance, whilst at the same time:

  1. adding overhead/contention to DML operations on the SALES table,
  2. increasing the size of the database,
  3. increasing the duration of the backups,
  4. increasing time for maintenance and copies to Development and Test

None of this is looking great but we might be thinking “What choice do we have?”

Here is perhaps an alternate strategy: Let’s not create any indexes.  The challenges (1) through (4) above evaporate to non-existence. But of course, now we have a remaining challenge in getting those queries to run efficiently.

I’ll tackle that in a different way – I will partition the SALES table very “aggressively”, hence my blog post title “hyper-partitioning”. Considering the typical query requirements I listed above, I will partition my SALES table down to as small as a partition for every hour.


SQL> create table sales
  2    ( ts   timestamp,
  3      id   int,
  4      amt  number,
  5      product int,
  6      customer int,
  7      item_cnt int,
  8      terminal int,
  9      operator int,
 10      credit_card int,
         ...
         ...
 17    )
 18  partition by range ( ts )
 19  interval ( numtodsinterval(1,'HOUR') )
 20  (
 21    partition p1 values less than ( timestamp '2018-07-01 00:00:00' )
 22  );

Table created.

SQL>
SQL> insert /*+ APPEND */ into sales (ts,id,amt,product,customer)
  2  select date '2018-07-01' + rownum / 240, rownum, dbms_random.value(1,100),
  3        dbms_random.value(1,100),dbms_random.value(1,100)
  4  from dual
  5  connect by level 
SQL> set serverout on
SQL> declare
  2    h varchar2(1000);
  3  begin
  4  for i in (
  5    select partition_name, high_value
  6    from user_tab_partitions
  7    where table_name = 'SALES'
  8    and   interval = 'YES'
  9    and   partition_name like 'SYS_P%'
 10    order by partition_position
 11  ) loop
 12    h := i.high_value;
 13    execute immediate 'select to_char('||h||'-0.01,''yyyymmdd_hh24'') from dual' into h;
 14    execute immediate 'alter table sales rename partition '||i.partition_name||' to p'||h;
 15  end loop;
 16  end;
 17  /

PL/SQL procedure successfully completed.

SQL>
SQL> col high_value format a60
SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES'
  4  order by partition_position;

PARTITION_NAME                 HIGH_VALUE
------------------------------ ------------------------------------------------------------
P1                             TIMESTAMP' 2018-07-01 00:00:00'
P20180701_00                   TIMESTAMP' 2018-07-01 01:00:00'
P20180701_01                   TIMESTAMP' 2018-07-01 02:00:00'
P20180701_02                   TIMESTAMP' 2018-07-01 03:00:00'
P20180701_03                   TIMESTAMP' 2018-07-01 04:00:00'
P20180701_04                   TIMESTAMP' 2018-07-01 05:00:00'
P20180701_05                   TIMESTAMP' 2018-07-01 06:00:00'
P20180701_06                   TIMESTAMP' 2018-07-01 07:00:00'
P20180701_07                   TIMESTAMP' 2018-07-01 08:00:00'
P20180701_08                   TIMESTAMP' 2018-07-01 09:00:00'
P20180701_09                   TIMESTAMP' 2018-07-01 10:00:00'
P20180701_10                   TIMESTAMP' 2018-07-01 11:00:00'
P20180701_11                   TIMESTAMP' 2018-07-01 12:00:00'
P20180701_12                   TIMESTAMP' 2018-07-01 13:00:00'
P20180701_13                   TIMESTAMP' 2018-07-01 14:00:00'
P20180701_14                   TIMESTAMP' 2018-07-01 15:00:00'
P20180701_15                   TIMESTAMP' 2018-07-01 16:00:00'
P20180701_16                   TIMESTAMP' 2018-07-01 17:00:00'
P20180701_17                   TIMESTAMP' 2018-07-01 18:00:00'
P20180701_18                   TIMESTAMP' 2018-07-01 19:00:00'
P20180701_19                   TIMESTAMP' 2018-07-01 20:00:00'
P20180701_20                   TIMESTAMP' 2018-07-01 21:00:00'
P20180701_21                   TIMESTAMP' 2018-07-01 22:00:00'
P20180701_22                   TIMESTAMP' 2018-07-01 23:00:00'
P20180701_23                   TIMESTAMP' 2018-07-02 00:00:00'
P20180702_00                   TIMESTAMP' 2018-07-02 01:00:00'
P20180702_01                   TIMESTAMP' 2018-07-02 02:00:00'
P20180702_02                   TIMESTAMP' 2018-07-02 03:00:00'
P20180702_03                   TIMESTAMP' 2018-07-02 04:00:00'
P20180702_04                   TIMESTAMP' 2018-07-02 05:00:00'
P20180702_05                   TIMESTAMP' 2018-07-02 06:00:00'
P20180702_06                   TIMESTAMP' 2018-07-02 07:00:00'
P20180702_07                   TIMESTAMP' 2018-07-02 08:00:00'
P20180702_08                   TIMESTAMP' 2018-07-02 09:00:00'
P20180702_09                   TIMESTAMP' 2018-07-02 10:00:00'
P20180702_10                   TIMESTAMP' 2018-07-02 11:00:00'
P20180702_11                   TIMESTAMP' 2018-07-02 12:00:00'
P20180702_12                   TIMESTAMP' 2018-07-02 13:00:00'
P20180702_13                   TIMESTAMP' 2018-07-02 14:00:00'
P20180702_14                   TIMESTAMP' 2018-07-02 15:00:00'
P20180702_15                   TIMESTAMP' 2018-07-02 16:00:00'

41 rows selected.

I’ve run a small anonymous block to rename the (system-named) interval partitions into some sensible names to reflect the date range the partition covers. Let’s now look at the typical queries we will now be performing on the SALES table:


SQL> set autotrace traceonly explain
SQL> select * from sales
  2  where ts > timestamp '2018-07-02 15:00:00';

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |     9 |   387 |     7   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       |     9 |   387 |     7   (0)| 00:00:01 |    41 |1048575|
|*  2 |   TABLE ACCESS FULL      | SALES |     9 |   387 |     7   (0)| 00:00:01 |    41 |1048575|
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("TS">TIMESTAMP' 2018-07-02 15:00:00.000000000')

SQL> select * from sales
  2  where ts > timestamp '2018-07-02 15:00:00'
  3  and product = 12;

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |     1 |    43 |     7   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       |     1 |    43 |     7   (0)| 00:00:01 |    41 |1048575|
|*  2 |   TABLE ACCESS FULL      | SALES |     1 |    43 |     7   (0)| 00:00:01 |    41 |1048575|
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("PRODUCT"=12 AND "TS">TIMESTAMP' 2018-07-02 15:00:00.000000000')

SQL>
SQL> select * from sales
  2  where ts > timestamp '2018-07-02 15:00:00'
  3  and customer = 25;

--------------------------------------------------------------------------------------------------
| Id  | Operation                | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |       |     1 |    43 |     7   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|       |     1 |    43 |     7   (0)| 00:00:01 |    41 |1048575|
|*  2 |   TABLE ACCESS FULL      | SALES |     1 |    43 |     7   (0)| 00:00:01 |    41 |1048575|
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("CUSTOMER"=25 AND "TS">TIMESTAMP' 2018-07-02 15:00:00.000000000')

SQL>
SQL> select max(amt) from sales
  2  where ts > timestamp '2018-07-02 15:00:00';

---------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       |     1 |    33 |     7   (0)| 00:00:01 |       |       |
|   1 |  SORT AGGREGATE           |       |     1 |    33 |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR|       |     9 |   297 |     7   (0)| 00:00:01 |    41 |1048575|
|*  3 |    TABLE ACCESS FULL      | SALES |     9 |   297 |     7   (0)| 00:00:01 |    41 |1048575|
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("TS">TIMESTAMP' 2018-07-02 15:00:00.000000000')

SQL> set autotrace off

All of them scan a tiny portion of the data, namely just the hours of sales data relevant to the query. Response times should also be relatively consistent no matter which customer, product or other predicate is passed, because the data to be scanned is always a fixed number of hours.
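If you want to convince yourself that only a sliver of the table is being touched, something like the following sketch may help. It assumes the SALES table and data from above. The first query addresses one hourly partition directly by a value that falls inside it. The second puts both a lower and an upper bound on TS, which should let the optimizer report a bounded Pstart/Pstop range, rather than the open-ended 1048575 (the maximum partition number) seen in the plans above.

```sql
-- Count the rows in the single hourly partition containing this timestamp.
select count(*)
from   sales partition for ( timestamp '2018-07-02 15:30:00' );

-- A bounded time range: both ends of the partition range are known at parse time.
set autotrace traceonly explain
select *
from   sales
where  ts >= timestamp '2018-07-02 14:00:00'
and    ts <  timestamp '2018-07-02 16:00:00';
set autotrace off
```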

But there’s a problem here. If I am partitioning to the hour, or even to the minute, then it won’t be long before I have a lot of partitions. In the latter case (minutes) I will have over 500,000 partitions in the first year of SALES alone! That is a lot of database metadata to store. There are the partitions themselves, plus optimizer statistics on them, plus historical optimizer statistics, plus potentially histograms on every column.  A lot of optimizer data might lead to expensive parse times because there is just so much information to wade through when optimizing queries.
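To put some rough numbers on that, here is a quick back-of-the-envelope query (assuming a 365-day year). The last column is the maximum number of partitions a single table can have, which is the same 1048575 that appears as Pstop in the plans above. At per-minute granularity, a table would reach that limit in a little under two years.

```sql
select 24 * 365      as hourly_partitions_per_year,   -- 8760
       24 * 60 * 365 as minute_partitions_per_year,   -- 525600
       1048575       as max_partitions_per_table
from   dual;

-- How many partitions does SALES have right now?
select count(*)
from   user_tab_partitions
where  table_name = 'SALES';
```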

But we only need the extreme granularity of partitions for the SALES table for today. Once today ticks over and becomes “yesterday”, then we might only need a partition for the entire day.  And once “yesterday” ticks over to “last week”, then maybe only weekly partitions are needed and so forth.

One of the cool things in 18c is the ability to do this style of maintenance with negligible disruption to service. Because I have named my partitions in a logical fashion, here is a simple routine to merge yesterday’s hourly partitions into a single one for the day.


SQL> set serverout on
SQL> declare
  2    d date := date '2018-07-01';
  3    ddl varchar2(4000);
  4  begin
  5    select listagg(partition_name||chr(10),',') within group ( order by partition_position )
  6    into   ddl
  7    from   user_tab_partitions
  8    where  table_name = 'SALES'
  9    and    partition_name like 'P'||to_char(d,'yyyymmdd')||'%';
 10
 11    ddl := 'alter table sales merge partitions '||ddl||' into partition p'||to_char(d,'yyyymmdd')||' online';
 12
 13    dbms_output.put_line(ddl);
 14    execute immediate ddl;
 15  end;
 16  /
alter table sales merge partitions
 P20180701_00
,P20180701_01
,P20180701_02
,P20180701_03
,P20180701_04
,P20180701_05
,P20180701_06
,P20180701_07
,P20180701_08
,P20180701_09
,P20180701_10
,P20180701_11
,P20180701_12
,P20180701_13
,P20180701_14
,P20180701_15
,P20180701_16
,P20180701_17
,P20180701_18
,P20180701_19
,P20180701_20
,P20180701_21
,P20180701_22
,P20180701_23
 into partition p20180701 online

PL/SQL procedure successfully completed.

Now I have a single partition for yesterday’s data, and hourly partitions for today’s data.


SQL> col high_value format a60
SQL> select partition_name, high_value
  2  from user_tab_partitions
  3  where table_name = 'SALES'
  4  order by partition_position;

PARTITION_NAME                 HIGH_VALUE
------------------------------ -------------------------------
P1                             TIMESTAMP' 2018-07-01 00:00:00'
P20180701                      TIMESTAMP' 2018-07-02 00:00:00'
P20180702_00                   TIMESTAMP' 2018-07-02 01:00:00'
P20180702_01                   TIMESTAMP' 2018-07-02 02:00:00'
P20180702_02                   TIMESTAMP' 2018-07-02 03:00:00'
P20180702_03                   TIMESTAMP' 2018-07-02 04:00:00'
P20180702_04                   TIMESTAMP' 2018-07-02 05:00:00'
P20180702_05                   TIMESTAMP' 2018-07-02 06:00:00'
P20180702_06                   TIMESTAMP' 2018-07-02 07:00:00'
P20180702_07                   TIMESTAMP' 2018-07-02 08:00:00'
P20180702_08                   TIMESTAMP' 2018-07-02 09:00:00'
P20180702_09                   TIMESTAMP' 2018-07-02 10:00:00'
P20180702_10                   TIMESTAMP' 2018-07-02 11:00:00'
P20180702_11                   TIMESTAMP' 2018-07-02 12:00:00'
P20180702_12                   TIMESTAMP' 2018-07-02 13:00:00'
P20180702_13                   TIMESTAMP' 2018-07-02 14:00:00'
P20180702_14                   TIMESTAMP' 2018-07-02 15:00:00'
P20180702_15                   TIMESTAMP' 2018-07-02 16:00:00'

18 rows selected.
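The same idea could be carried one step further once “yesterday” becomes “last week”. The statement below is only a sketch: it assumes seven adjacent daily partitions named P20180701 to P20180707 already exist (only the first of them has been created in this demo), and the target name PW20180701 is a hypothetical naming choice of mine.

```sql
-- Roll seven daily partitions up into one weekly partition.
-- Assumes the daily partitions exist and are adjacent; all names are illustrative.
alter table sales merge partitions
  p20180701, p20180702, p20180703, p20180704,
  p20180705, p20180706, p20180707
  into partition pw20180701 online;
```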

I stress – this is not my recommendation to race out and partition every transactional table you have and drop all of the indexes. But with so many online partitioning operations in 18c, some exciting new opportunities arise that were not available in previous releases. So start thinking about how you can exploit this to get advantages with the partitioning option that might sit “outside the box” of the standard usage.

18c merge partition online

One of the cool things in 18c is the ability to merge partitions without causing a service interruption.  Here’s a video demonstration of that in action:

This is just an accompanying blog post to let you grab the scripts for the demo so that you can try this yourself on livesql.oracle.com, or any of the Oracle Cloud services that will be running 18c in the near future.

But also, make sure that you watch to the end of the video, because I also discuss some of the opportunities that this new feature provides.  I would contend that 99% of people using partitioning in their real world applications rarely change the size or structure of their partitions.  With online merge, you now have much more freedom in those choices.  For analytics, you might now be able to run workloads against your live Production transaction processing systems by utilizing a fine-grained partitioning regime, and later merging those partitions to become larger granules over time.

So try to think “outside of the box” on this one.  There could be a lot of cool benefits for you that you might not have initially considered.  The true feature is not “just” merge online – it is that you have new partitioning designs and implementations at your fingertips.


 create table sales
   ( ts        date,
     id        int,
     amt       number,
     completed varchar2(1)
   )
 partition by range ( ts )
 interval ( numtodsinterval(1,'DAY') )
 (
   partition p1 values less than ( date '2017-01-01' ),
   partition p2 values less than ( date '2017-02-01' ),
   partition p3 values less than ( date '2017-03-01' ),
   partition p4 values less than ( date '2017-04-01' ),
   partition p5 values less than ( date '2017-05-01' )
 );

 insert /*+ APPEND */ into sales
 select date '2017-01-01' - 10 + rownum / 10000 ,rownum, dbms_random.value(1,100), null
 from dual
 connect by level < 10000*120;


 exec dbms_stats.gather_table_stats('','SALES') 

 select partition_name, num_rows
 from user_tab_partitions
 where table_name = 'SALES'
 order by partition_position;

 create index sales_ix on sales ( id );

 alter table sales merge partitions p3, p4 
     into partition p3a update indexes ONLINE;
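After the merge completes, a couple of dictionary queries along these lines could be used to check the outcome. This is a sketch against the demo objects above. With UPDATE INDEXES specified, I would expect SALES_IX to remain usable, but check the STATUS column on your own system rather than taking my word for it.

```sql
-- Is the index still usable after the merge?
select index_name, status
from   user_indexes
where  table_name = 'SALES';

-- P3 and P4 should be gone, replaced by P3A.
select partition_name, high_value
from   user_tab_partitions
where  table_name = 'SALES'
order  by partition_position;
```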

MERGE and IOT’s ….. unhappy bedfellows

Anyone who has used Oracle for a while will be familiar with the Parent/Child locking “issue” when it comes to tables and indexes on foreign keys. For many years you’d hear people crying “bug” etc but thankfully most now know the reason, and accept it as sensible behaviour.

But let’s take a look at a slight variation on that theme.

Let’s start with a table called “LOC” which will be our parent table in this example. Note that it is an IOT, and we’ll also have a child table “LOC_CHILD”, which is also an IOT.

SQL> CREATE TABLE LOC
  2  (
  3    LOC_ID              NUMBER(4)            NOT NULL,
  4    DATA                     NUMBER(6),
  5    CONSTRAINT LOC_PK
  6    PRIMARY KEY
  7    ( LOC_ID)
  8    ENABLE VALIDATE
  9  )
 10  ORGANIZATION INDEX
 11  /

Table created.

SQL> CREATE TABLE LOC_CHILD
  2  (
  3    CHILD_SEQ                  NUMBER(12)          NOT NULL,
  4    LOC_ID               NUMBER(4)           NOT NULL,
  5    CHILD_DATA  NUMBER(15,6),
  6    CONSTRAINT LOC_CHILD_PK
  7    PRIMARY KEY
  8    (CHILD_SEQ, LOC_ID)
  9    ENABLE VALIDATE
 10  )
 11  ORGANIZATION INDEX
 12  /

Table created.

SQL> insert into LOC
  2  select rownum,50
  3  from dual
  4  connect by level <= 5
  5  /

5 rows created.

Now being a good DBA 🙂 we’ve read all the “gloom and doom” nonsense about foreign keys being indexed, so just to be careful, we’ll add that index onto our child table before adding our foreign key back to LOC.

SQL> CREATE INDEX LOC_CHILD_IX ON LOC_CHILD
  2  (LOC_ID)
  3  /

Index created.

SQL> ALTER TABLE LOC_CHILD ADD (
  2    CONSTRAINT LOC_CHILD_FK
  3    FOREIGN KEY ( LOC_ID)
  4    REFERENCES LOC (LOC_ID)
  5    ENABLE VALIDATE)
  6  /

Table altered.

SQL> insert into LOC_CHILD
  2  select rownum,mod(rownum,5)+1,dbms_random.value(1000,5000)
  3  from dual
  4  connect by level  commit;

Commit complete.

So the scene is set. We’ve got our tables seeded with some data, and we’re ready to go.

Let’s update a row in the parent table LOC:

SQL> UPDATE loc
  2  SET    DATA = 99
  3  WHERE  LOC_ID = 2;

1 row updated.

Now we’ll pop into a new session and update the child table LOC_CHILD:

SQL> MERGE 
  2       INTO  LOC_CHILD
  3       USING (SELECT 500 CHILD_SEQ,
  4                     2 LOC_ID,
  5                     1000 CHILD_DATA
  6                FROM DUAL) M
  7          ON (    LOC_CHILD.CHILD_SEQ = M.CHILD_SEQ
  8              AND LOC_CHILD.LOC_ID = M.LOC_ID)
  9  WHEN MATCHED
 10  THEN
 11     UPDATE SET
 12        LOC_CHILD.CHILD_DATA =  NVL (LOC_CHILD.CHILD_DATA, 0) + M.CHILD_DATA
 13  WHEN NOT MATCHED
 14  THEN
 15     INSERT     (CHILD_SEQ,
 16                 LOC_ID,
 17                 CHILD_DATA)
 18         VALUES (M.CHILD_SEQ,
 19                 M.LOC_ID,
 20                 M.CHILD_DATA);

[stuck]

And splat…we’re stuck. (Without evidence to support it) I’d hypothesize it’s due to the flexibility of the merge command. A single merge can insert, update and even delete rows, so I’m guessing that to handle this flexibility (in particular, the DELETE option) the locking errs on the side of safety.
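To gather some of that missing evidence, you could look at the blocking from a third session while the MERGE is hung. The queries below are a sketch and assume your account has been granted access to the V$ views.

```sql
-- Who is waiting, on whom, and on what event?
select sid, serial#, blocking_session, event, seconds_in_wait
from   v$session
where  blocking_session is not null;

-- Table (TM) and transaction (TX) locks held and requested.
-- REQUEST > 0 marks a waiter; BLOCK = 1 marks a holder that is blocking someone.
select sid, type, id1, id2, lmode, request, block
from   v$lock
where  type in ('TM','TX')
order  by sid, type;
```

For TM locks, ID1 is the object ID of the locked table, so it can be joined to USER_OBJECTS to see whether the waiter is queued on LOC or LOC_CHILD.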

MERGE – concise syntax but not always fastest

A long time ago … a long long time ago (http://www.oracledba.co.uk/tips/merge_speed.htm) I wrote that MERGE works better than a separate update and insert.

This still holds true, but it’s also apparent that MERGE seems to be optimized for larger sets of rows. When you take it down to single row operations, then don’t be SO hasty to recast your updates-and-inserts into merge commands.

SQL> create table t1
  2   ( x int primary key,
  3     y int );

Table created.

SQL>
SQL> -- all inserts
SQL>
SQL> set timing on
SQL> begin
  2  for i in 1 .. 50000 loop
  3    merge into t1
  4    using ( select i x, i y
  5            from dual ) m
  6    on ( t1.x = m.x )
  7    when matched then update
  8    set t1.y = m.y
  9    when not matched then
 10    insert values (m.x, m.y );
 11  end loop;
 12  end;
 13  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.79
SQL>
SQL> -- 50/50
SQL>
SQL> set timing on
SQL> begin
  2  for i in 25000 .. 75000 loop
  3    merge into t1
  4    using ( select i x, i y
  5            from dual ) m
  6    on ( t1.x = m.x )
  7    when matched then update
  8    set t1.y = m.y
  9    when not matched then
 10    insert values (m.x, m.y );
 11  end loop;
 12  end;
 13  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.31
SQL>
SQL> -- all updates
SQL>
SQL> set timing on
SQL> begin
  2  for i in 25000 .. 75000 loop
  3    merge into t1
  4    using ( select i x, i+1 y
  5            from dual ) m
  6    on ( t1.x = m.x )
  7    when matched then update
  8    set t1.y = m.y
  9    when not matched then
 10    insert values (m.x, m.y );
 11  end loop;
 12  end;
 13  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.14
SQL>
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.35
SQL>
SQL> create table t1
  2   ( x int primary key,
  3     y int );

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> -- all inserts
SQL> set timing on
SQL> begin
  2  for i in 1 .. 50000 loop
  3    update t1 set y = i where x = i;
  4    if sql%notfound then insert into t1 values (i,i); end if;
  5  end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.65
SQL>
SQL> -- 50/50
SQL>
SQL> set timing on
SQL> begin
  2  for i in 25000 .. 75000 loop
  3    update t1 set y = i where x = i;
  4    if sql%notfound then insert into t1 values (i,i); end if;
  5  end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:09.20
SQL>
SQL> -- all updates
SQL>
SQL> set timing on
SQL> begin
  2  for i in 25000 .. 75000 loop
  3    update t1 set y = i+1 where x = i;
  4    if sql%notfound then insert into t1 values (i,i+1); end if;
  5  end loop;
  6  end;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:06.94
SQL>
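
For contrast with the row-by-row loops above, this is the kind of set-based statement where I would expect MERGE to be in its element. It is only a sketch against the same T1 table, with the 50000 source rows generated from DUAL, and I have not included timings for it here.

```sql
-- One MERGE for the whole set, instead of 50000 single-row MERGEs.
merge into t1
using ( select level x, level y
        from   dual
        connect by level <= 50000 ) m
on ( t1.x = m.x )
when matched then update
  set t1.y = m.y
when not matched then
  insert ( x, y ) values ( m.x, m.y );
```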