Blockchain tables … the usage model

When we hear the word “blockchain” we typically think of the properties associated with tamper-proof information, with terminology such as “immutability”, “cryptographic digests”, “cryptographic signatures” and so forth. Thus, when it comes to blockchain tables in the database, we need a mechanism to store that additional metadata alongside the rows in the blockchain table.

This is achieved by using the existing database functionality of hidden columns. If I create a simple blockchain table


SQL>
SQL> CREATE blockchain TABLE bc1 (c1 char(10), c2 char(10))
  2  no drop until 0 days idle no delete until 31 days after insert hashing using "sha2_512" version v1;

Table created.

SQL>
SQL> insert into bc1 values ('x','x');

1 row created.

SQL> select * from bc1;

C1         C2
---------- ----------
x          x

SQL> commit;

Commit complete.

then a DESCRIBE or a query against XXX_TAB_COLUMNS suggests nothing “untoward”.


SQL> desc bc1
 Name                          Null?    Type
 ----------------------------- -------- -------------
 C1                                     CHAR(10)
 C2                                     CHAR(10)

SQL> select column_name,data_type,data_length
  2  from   user_tab_columns
  3  where  table_name = 'BC1';

COLUMN_NAME                    DATA_TYPE                      DATA_LENGTH
------------------------------ ------------------------------ -----------
C1                             CHAR                                    10
C2                             CHAR                                    10

but the true list of columns for the blockchain table (and for that matter, any table with hidden columns) can be seen via USER_TAB_COLS.


SQL> select column_name,data_type,data_length
  2  from   user_tab_cols
  3  where  table_name = 'BC1';

COLUMN_NAME                    DATA_TYPE                      DATA_LENGTH
------------------------------ ------------------------------ -----------
C1                             CHAR                                    10
C2                             CHAR                                    10
ORABCTAB_INST_ID$              NUMBER                                  22
ORABCTAB_CHAIN_ID$             NUMBER                                  22
ORABCTAB_SEQ_NUM$              NUMBER                                  22
ORABCTAB_CREATION_TIME$        TIMESTAMP(6) WITH TIME ZONE             13
ORABCTAB_USER_NUMBER$          NUMBER                                  22
ORABCTAB_HASH$                 RAW                                   2000
ORABCTAB_SIGNATURE$            RAW                                   2000
ORABCTAB_SIGNATURE_ALG$        NUMBER                                  22
ORABCTAB_SIGNATURE_CERT$       RAW                                     16
ORABCTAB_SPARE$                RAW                                   2000

Out of the box, blockchain tables will give perfectly fine performance, just like any other heap table in the Oracle database. But like any other table, you might be able to squeeze out more performance by digging into the physical design properties. When it comes to the physical design of your schema, the true set of columns is the reference point for setting storage attributes on the table, and for blockchain tables there is an additional functional difference to take into account. Let's start with a fresh blockchain table and insert a new row:


SQL> CREATE blockchain TABLE bc1 (c1 char(10), c2 char(10))
  2  no drop until 0 days idle no delete until 31 days after insert hashing using "sha2_512" version v1;

Table created.

SQL> insert into bc1 values ('x','y');

1 row created.

Unlike traditional columns (hidden or otherwise), the blockchain metadata columns are not populated at insert time:


SQL> select
  2   ORABCTAB_SEQ_NUM$
  3  ,ORABCTAB_SPARE$
  4  ,ORABCTAB_USER_NUMBER$
  5  ,ORABCTAB_HASH$
  6  ,ORABCTAB_SIGNATURE$
  7  ,ORABCTAB_SIGNATURE_ALG$
  8  ,ORABCTAB_SIGNATURE_CERT$
  9  ,ORABCTAB_CHAIN_ID$
 10  ,ORABCTAB_INST_ID$
 11  ,ORABCTAB_CREATION_TIME$
 12  from bc1
 13  @pr
==============================
ORABCTAB_SEQ_NUM$             :
ORABCTAB_SPARE$               :
ORABCTAB_USER_NUMBER$         :
ORABCTAB_HASH$                :
ORABCTAB_SIGNATURE$           :
ORABCTAB_SIGNATURE_ALG$       :
ORABCTAB_SIGNATURE_CERT$      :
ORABCTAB_CHAIN_ID$            :
ORABCTAB_INST_ID$             :
ORABCTAB_CREATION_TIME$       :

PL/SQL procedure successfully completed.

The work to populate the metadata comes at the conclusion of the transaction:


SQL> commit;

Commit complete.

SQL> select
  2   ORABCTAB_SEQ_NUM$
  3  ,ORABCTAB_SPARE$
  4  ,ORABCTAB_USER_NUMBER$
  5  ,ORABCTAB_HASH$
  6  ,ORABCTAB_SIGNATURE$
  7  ,ORABCTAB_SIGNATURE_ALG$
  8  ,ORABCTAB_SIGNATURE_CERT$
  9  ,ORABCTAB_CHAIN_ID$
 10  ,ORABCTAB_INST_ID$
 11  ,ORABCTAB_CREATION_TIME$
 12  from bc1
 13  @pr
==============================
ORABCTAB_SEQ_NUM$             : 1
ORABCTAB_SPARE$               :
ORABCTAB_USER_NUMBER$         : 107
ORABCTAB_HASH$                :
0924C163458DE1FA3A7FFEA38ADA1852A409ADBF859A453E83A429BE3DB9F119C28828FFDB7BD101C5695C9CFCD82364F0B933B1E442B3CFCDE037D20CFDB631
ORABCTAB_SIGNATURE$           :
ORABCTAB_SIGNATURE_ALG$       :
ORABCTAB_SIGNATURE_CERT$      :
ORABCTAB_CHAIN_ID$            : 18
ORABCTAB_INST_ID$             : 1
ORABCTAB_CREATION_TIME$       : 30-JUL-21 06.04.12.738166 AM +00:00

PL/SQL procedure successfully completed.

This is something that you will probably never notice, because the vast majority of use cases for blockchain tables are the typical OLTP style of activity, where transactions are short in duration and high in frequency. However, like all tables in the database, there are benefits to be had if you spend a little effort on physical design. Let's explore that here.

I’ll start with a blockchain table that mirrors the familiar EMP table.


SQL> create blockchain table bc1 (
  2   empno                         number(4)
  3  ,ename                         varchar2(10)
  4  ,job                           varchar2(9)
  5  ,mgr                           number(4)
  6  ,hiredate                      date
  7  ,sal                           number(7,2)
  8  ,comm                          number(7,2)
  9  ,deptno                        number(2)
 10  )
 11  no drop until 0 days idle no delete until 31 days after insert hashing using "sha2_512" version v1;

Table created.

To gauge the typical size of each row in the EMP table, I’ll populate it and gather some stats. In order to keep the numbers simple, I’ll duplicate the same row many times over so every row is the same size.


SQL> create table emp as
  2  select s.* from
  3    ( select * from scott.emp where rownum = 1 ) s,
  4    ( select 1 from dual connect by level <= 180);

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats('','EMP')

PL/SQL procedure successfully completed.

SQL> select avg_row_len from user_tables
  2  where table_name = 'EMP';

AVG_ROW_LEN
-----------
         34

Keep that number in mind, and also notice that those 180 rows all fit nicely into a single block in the database.


SQL> select dbms_rowid.rowid_block_number(rowid) blk, count(*)
  2  from emp
  3  group by dbms_rowid.rowid_block_number(rowid);

       BLK   COUNT(*)
---------- ----------
       129        180

Now I am going to load 180 rows in the same way into my blockchain table, which has the same columns as EMP.


SQL> insert into bc1
  2  select s.* from
  3    ( select * from scott.emp where rownum = 1 ) s,
  4    ( select 1 from dual connect by level <= 180);

180 rows created.

As we saw previously, the internal columns are not populated whilst the transaction is still active, but once the commit is performed, the relevant metadata is calculated for each row and stored alongside the base data.


SQL> select count(ORABCTAB_SEQ_NUM$) from bc1;

COUNT(ORABCTAB_SEQ_NUM$)
------------------------
                       0   -- nothing populated

SQL> commit;

Commit complete.

SQL> select count(ORABCTAB_SEQ_NUM$) from bc1;

COUNT(ORABCTAB_SEQ_NUM$)
------------------------
                     180  -- now fully populated

But let's step back and think about the process here. At the time of insert, we loaded 180 rows of 34 bytes each into our blockchain table. Like the EMP table, we would expect these to fit nicely into a single block. But the moment we issued a commit, every one of those rows had to be extended to hold the extra metadata. This is analogous to returning to the rows in a later transaction and updating each of them to a larger size. Does row migration occur in the normal way? Time to dig a little deeper.


SQL> @?/rdbms/admin/utlchain

Table created.

SQL> analyze table bc1 list chained rows into chained_rows;

Table analyzed.

SQL> select head_rowid from chained_rows;

HEAD_ROWID
------------------
AAAS4KAATAAAAEBAAJ
AAAS4KAATAAAAEBAAK
AAAS4KAATAAAAEBAAM
AAAS4KAATAAAAEBAAN
AAAS4KAATAAAAEBAAO
AAAS4KAATAAAAEBAAQ
...
...
AAAS4KAATAAAAEBACt
AAAS4KAATAAAAEBACv
AAAS4KAATAAAAEBACw
AAAS4KAATAAAAEBACx
AAAS4KAATAAAAEBACz

130 rows selected.

A subset of the rows has now been migrated, which might have some performance overhead for index access to those rows. This is a consequence of many blockchain rows being created in a single transaction (which is not the normal use case here). If I perform the same 180-row load, but row at a time to mimic a single-row transaction application, then because each row is extended at commit time before the next row is inserted, the probability of row migration drops to zero, even though the rows have grown from 34 bytes to 125 bytes over the course of each transaction.


SQL> create blockchain table bc1 (
  2   empno                         number(4)
  3  ,ename                         varchar2(10)
  4  ,job                           varchar2(9)
  5  ,mgr                           number(4)
  6  ,hiredate                      date
  7  ,sal                           number(7,2)
  8  ,comm                          number(7,2)
  9  ,deptno                        number(2)
 10  )
 11  no drop until 0 days idle no delete until 31 days after insert hashing using "sha2_512" version v1;

Table created.

SQL>
SQL> begin
  2  for i in 1 .. 180 loop
  3    insert into bc1
  4    select * from scott.emp where rownum = 1;
  5    commit;
  6  end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL> truncate table chained_rows;

Table truncated.

SQL> analyze table bc1 list chained rows into chained_rows;

Table analyzed.

SQL>
SQL> select head_rowid from chained_rows;

no rows selected

SQL> exec dbms_stats.gather_table_stats('','BC1')

PL/SQL procedure successfully completed.

SQL> select avg_row_len from user_tables
  2  where table_name = 'BC1';

AVG_ROW_LEN
-----------
        125


As I mentioned, it is unlikely that INSERT-SELECT is going to be a common use case for blockchain tables, but if you do need to do this (for example, an initial data load), you probably want to make some adjustments to your PCTFREE settings, manipulate the Hakan factor, or perform an ALTER TABLE MOVE once the initial load is done. (See the video at the tail of this post.)
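
As a rough sketch of those options (the PCTFREE value of 70 is purely illustrative and should come from your own row-growth analysis, the table name bc2 is just a stand-in, and the clause ordering reflects the 21c syntax, so check it against your release):


-- reserve generous free space in each block up front, so that the
-- metadata added at commit time fits without migrating the row
create blockchain table bc2 (c1 char(10), c2 char(10))
no drop until 0 days idle no delete until 31 days after insert
hashing using "sha2_512" version v1
pctfree 70;

-- or, once an initial bulk load has been committed, repack the rows
alter table bc2 move;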

But it is not just a single session loading 180 rows via INSERT-SELECT that creates the scenario of multiple uncommitted rows sitting in a database block. If you have a high transaction volume application, then multiple sessions all performing a few inserts each can create a similar situation. I ran a benchmark with 60 concurrent sessions, each performing 4 inserts before committing, to see if any row migration occurred. Of the 240 rows, approximately 80 ended up being chained.
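
For reference, a minimal sketch of what each of those 60 sessions ran against the same BC1 table (the 4-inserts-then-commit shape is taken from the description above; the harness that launches the 60 concurrent sessions is not shown):


-- run concurrently from 60 sessions: each transaction inserts 4 rows,
-- then commits, so all 4 rows are extended with metadata at commit time
begin
  for i in 1 .. 4 loop
    insert into bc1
    select * from scott.emp where rownum = 1;
  end loop;
  commit;
end;
/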

This is probably an extreme case because the hidden metadata is a large proportion of the total size of the row. (Modern applications typically have a little more data than what you find in the EMP table :-)) Moreover, if you also start to take advantage of the row signing features of blockchain tables, the metadata per row is larger still, so I'd recommend always performing some analysis on the data in the table, and on the method by which the table will be populated, in order to determine an appropriate PCTFREE setting for the blockchain table. Physical design is an often forgotten art, and it is another tool in your tool chest for getting peak performance from the Oracle database.
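
As a back-of-the-envelope starting point for that analysis, you can reuse the row sizes measured above: the base row was 34 bytes and grew to roughly 125 bytes once the hash metadata arrived at commit time, so a bulk load would want roughly that growth fraction left free in each block (signed rows would need more again):


-- growth that arrives only at commit time, as a fraction of the final row size
select round( (125 - 34) / 125 * 100 ) as suggested_pctfree from dual;

SUGGESTED_PCTFREE
-----------------
               73

A figure like that only matters when many rows share a block within a single transaction; as shown earlier, with row-at-a-time commits the default PCTFREE caused no migration at all.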

If this all sounds like jibber-jabber to you, then here is a primer on the PCTFREE setting

and for the more advanced practitioner, some considerations on using the Hakan factor.
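
If you want a feel for the mechanics before watching it, here is the classic technique for setting the Hakan factor, sketched on an ordinary heap table (the table T and the 50-row cap are just illustrative, and restrictions apply to some table types, so test it on your own release):


-- seed an empty table with the maximum number of rows you want per block
create table t ( x varchar2(100) );

insert into t select rownum from dual connect by level <= 50;
commit;

-- record that cap as the table's Hakan factor
alter table t minimize records_per_block;

-- remove the seed rows; future loads will not exceed 50 rows per block
delete from t;
commit;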

2 comments

  1. Hi Connor,

    Just an idea, since I cannot currently check it:

    Couldn’t it be possible to define DEFAULT values for the hidden columns that caused the rows to be migrated,
    for having them “already long enough” during the INSERT itself:
    For example, a default value for ORABCTAB_HASH$ could be something like SYS_GUID() concatenated 4 times,
    for ORABCTAB_HASH$ it could be SYSTIMESTAMP, a.s.o. ?

    Cheers & Best Regards,
    Iudith Mentzel
