Databases are slow, right?

Man, it gets my goat when people start pontificating nonsense on Twitter.

Yeah, I know, I know, I shouldn’t get upset, because 99% of everything on social media is nonsense anyway, but it’s when people roll out claims about database tech that I tend to get tetchy 🙂

Today it’s that familiar old chestnut: “Constraints in the database make it slow“. And… this one even came with some numbers to add some apparent weight!

[screenshot of the tweet]

Now truth be told, I’m not sure what “QPS” is a reference to here, because if it is “queries per second”, then the number of foreign keys and check constraints is not going to impact the rate at which you can retrieve data from the database. If anything, it might make retrieval faster, because you are giving the optimizer more information.
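To make that concrete (a quick aside, not part of the benchmark below): give the optimizer a check constraint, and it can prove some predicates are impossible without touching the table at all. A minimal sketch:

SQL> create table t ( flag varchar2(1) not null check ( flag in ('Y','N') ) );

Table created.

SQL> explain plan for select * from t where flag = 'X';

Explained.

SQL> select * from table(dbms_xplan.display());

In the resulting plan you will typically see a filter(NULL IS NOT NULL) step, which is the optimizer’s way of saying “this can never return any rows”, so the table is never visited.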

So let’s assume we’re talking about transactions per second here. Let’s explore this claim with a real demo. I’ll create a table to hold some sales transactions, and each sale will have a number of attributes all relating back to parent tables.



--
-- Some parent tables
--
SQL> create table product as select rownum pid, 'product '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table product add primary key ( pid );

Table altered.

SQL>
SQL> create table customer as select rownum cid, 'customer '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table customer add primary key ( cid );

Table altered.

SQL>
SQL> create table store as select rownum sid, 'store '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table store add primary key ( sid );

Table altered.

SQL>
SQL> create table taxcode as select rownum tid, 'taxcode '||rownum descr
  2  from dual connect by level <= 100000;

Table created.

SQL> alter table taxcode add primary key ( tid );

Table altered.

--
-- A sequence for the transaction ids
--
SQL> create sequence txnseq;

Sequence created.

SQL>
--
-- And my sales table that refers back to them
--
SQL> create table sales
  2  ( txn_id int default txnseq.nextval not null ,
  3    pid int not null references product(pid),
  4    cid int not null references customer(cid),
  5    sid int not null references store(sid),
  6    tid int not null references taxcode(tid),
  7    discount varchar2(1) not null,
  8    amount number(10,2) not null,
  9    qty   number(10) not null,
 10    constraint chk1 check ( discount in ('Y','N')),
 11    constraint chk2 check ( amount > 0),
 12    constraint chk3 check ( qty > 0)
 13  );

Table created.

SQL>
SQL> alter table sales add primary key ( txn_id )
  2  using index global partition by hash ( txn_id ) partitions 8;

Table altered.
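(That last statement has nothing to do with constraints, by the way. With sequence-generated keys and a dozen sessions inserting concurrently, a conventional primary key index develops a hot spot at its right-hand edge; hash partitioning it into 8 pieces simply spreads those inserts across 8 index structures.)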

By my count, I’ve got

  • 8 constraints to check for null-ness
  • 3 check value constraints
  • 1 primary key constraint
  • 4 foreign key constraints

Wow! This thing is constrained to kingdom come! I really should not have bothered proceeding any further, because surely that many constraints make this database totally unusable for anything. (Yes, I know sarcasm is a sign of poor character… but well, I never claimed to have good character 🙂)
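Before I flog it, a quick sanity check that all this red tape actually earns its keep (the schema name in the error message is just illustrative):

SQL> insert into sales (pid,cid,sid,tid,discount,amount,qty)
  2  values (1,1,1,1,'X',10,1);
insert into sales (pid,cid,sid,tid,discount,amount,qty)
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.CHK1) violated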

Anyway, I’ll persist with what must surely be a hopeless cause. I’ll create a little routine that picks products, customers, stores and tax codes at random, and uses those values to hammer transactions at my SALES table.



SQL> create table hammer_log ( job int, ela interval day to second );

Table created.

SQL>
SQL> create or replace
  2  procedure hammer_time is
  3    p sys.odcinumberlist := sys.odcinumberlist();
  4    c sys.odcinumberlist := sys.odcinumberlist();
  5    s sys.odcinumberlist := sys.odcinumberlist();
  6    t sys.odcinumberlist := sys.odcinumberlist();
  7
  8    timer timestamp;
  9  begin
 10    select trunc(dbms_random.value(1,100000))
 11    bulk collect into p
 12    from dual connect by level <= 32000;
 13
 14    select trunc(dbms_random.value(1,100000))
 15    bulk collect into c
 16    from dual connect by level <= 32000;
 17
 18    select trunc(dbms_random.value(1,100000))
 19    bulk collect into s
 20    from dual connect by level <= 32000;
 21
 22    select trunc(dbms_random.value(1,100000))
 23    bulk collect into t
 24    from dual connect by level <= 32000;
 25
 26    timer := localtimestamp;
 27    for j in 1 .. 3 loop
 28
 29      for i in 1 .. 32000
 30      loop
 31        insert into sales ( pid,cid,sid,tid,discount,amount,qty)
 32        values (p(i), c(i), s(i), t(i), 'Y',i,i);
 33        commit;
 34      end loop;
 35    end loop;
 36
 37    insert into hammer_log values ( sys_context('USERENV','BG_JOB_ID'), localtimestamp-timer);
 38    commit;
 39  end;
 40  /

Procedure created.

Then I’ll use the job queue to simulate a connection pool throwing transactions at my table. We’ll record the elapsed time for 96,000 transactions per session in our HAMMER_LOG table and see how this poor suffering bucket of constraints performs.



SQL> declare
  2    j int;
  3  begin
  4    for i in 1 .. 12
  5    loop
  6       dbms_job.submit(j,'hammer_time;');
  7    end loop;
  8    commit;
  9  end;
 10  /

PL/SQL procedure successfully completed.
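(If you want to watch the jobs drain, the legacy DBMS_JOB queue is visible in USER_JOBS; jobs disappear from the view once they complete successfully, so when this query hits zero, all 12 runs are done.)

SQL> select count(*) from user_jobs;

  COUNT(*)
----------
         0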


SQL> select * from hammer_log;

       JOB ELA
---------- -----------------------
     80999 +00 00:00:19.013000
     81005 +00 00:00:19.005000
     81000 +00 00:00:19.461000
     80998 +00 00:00:19.036000
     81003 +00 00:00:19.248000
     81004 +00 00:00:18.375000
     80996 +00 00:00:19.306000
     81002 +00 00:00:19.286000
     80997 +00 00:00:18.825000
     80995 +00 00:00:18.987000
     81006 +00 00:00:19.018000
     81001 +00 00:00:19.180000

12 rows selected.

SQL> select count(*)*96000 / extract(second from max(ela))
  2  from hammer_log;

COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
                                59195.3137

Hmmm… not 500 “qps”. No, not 5000 “qps” either.

In fact, with all that constraint “baggage”, I’m still lobbing nearly 60,000 transactions per second at this table!
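(The arithmetic, for the sceptical: 12 sessions × 96,000 single-row transactions, each with its own commit, is 1,152,000 inserts. Divide that by the slowest session’s ~19.5 seconds and you get just under 60,000 per second.)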

I know what you’re thinking… “He’s using a 48-core Exadata machine”.

Well, I didn’t mention that… because I’m not!

I’m using a 2016 Dell XPS laptop. According to eBay, it’s worth about $600. (That’s not my exact laptop, but it’s the first one I saw that looked close 🙂)

[screenshot of the eBay laptop listing]

But this isn’t about price… this is about putting to rest fictional claims about databases on Twitter.

“Hold on!” comes your rebuttal… “What about indexes?”

Contrary to popular belief, just because you have foreign keys defined on columns of a table does not mean that you must index those columns. Sometimes you might need to, but those are generally a fairly niche set of circumstances (the classic one is sketched just below). But I don’t want my readers thinking I’m pulling a fast one here, so let’s take our SALES table, index those foreign keys up the wazoo, and then redo the benchmark.
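That niche case, for the record, is parent-side DML rather than anything this benchmark exercises. With no index on SALES.PID, a statement like

SQL> delete from product where pid = 12345;

has to scan SALES to verify that no child rows reference the product, and it takes a share lock on SALES while doing so, blocking concurrent inserts. None of that happens when you are only ever inserting child rows with valid parents.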



SQL> create index sales_ix1 on sales ( pid);

Index created.

SQL> create index sales_ix2 on sales ( sid);

Index created.

SQL> create index sales_ix3 on sales ( cid);

Index created.

SQL> create index sales_ix4 on sales ( tid);

Index created.

[truncate hammer_log, then re-run the 12 jobs]
SQL> select count(*)*96000 / extract(second from max(ela))
  2  from hammer_log;

COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
                                36080.5342

1 row selected.

What a bummer… only 36,000 transactions per second now on my 4-year-old laptop, with every insert maintaining four extra indexes.

So yes, I will fully concede that

  • if you’ve chosen not to educate yourself on foreign key indexing, AND
  • you bought your database server off eBay from an Aussie trying to offload a 4-year-old laptop, AND
  • you have constraints everywhere, AND
  • you need more than 36,000 transactions per second…

then yes, you might need to look at options. But probably the first option to explore is not to jump onto Twitter and claim that databases don’t scale.

Case closed.

Comments

  1. Hi,
     nice one, btw same code (with indexes) on free tier AD:

     COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
     ------------------------------------------
                                     32852.6935

     Regards.
     G

    1. Just last week: SOA developer. I don’t want check constraints as it makes my SOA program complex, as records would fail and error handling makes it complex. Where do they find these people?

  2. Just tried the same code on ATP on Free Tier and getting 19,238 for before and after indexes are added.
    Not bad for something that is free, no hardware, no license, etc.

    [BTW missing create sequence statement before the create table]
