Man, it gets my goat when people start pontificating nonsense on Twitter.
Yeah, I know, I know, I shouldn’t get upset, because 99% of everything on social media is nonsense anyway, but it’s when people roll out claims about database tech that I tend to get tetchy.
Today it’s that familiar old chestnut: “Constraints in the database make it slow”. And…this one even came with some numbers to add some apparent weight!
Now truth be told, I’m not sure what “QPS” is a reference to here, because if it is “queries per second”, then the number of foreign keys and check constraints is not going to impact the rate at which you can retrieve data from the database. If anything, it might make retrieval faster, because you are giving the optimizer more information.
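(A quick sketch of what I mean, using a hypothetical table rather than the demo below: if a predicate contradicts a check constraint, the optimizer can prove the query returns no rows and skip the table access altogether.)

--
-- hypothetical example: a check constraint lets the optimizer
-- prove a contradictory predicate can never be true
--
create table t ( flag varchar2(1) check ( flag in ('Y','N') ) );

select * from t where flag = 'X';

-- the plan for this query typically contains a filter(NULL IS NOT NULL)
-- step, meaning the table is never even visited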
So let’s assume we’re talking about transactions per second here. Let’s explore this claim with a real demo. I’ll create a table to hold some sales transactions, and each sale will have a number of attributes all relating back to parent tables.
--
-- Some parent tables
--
SQL> create table product as select rownum pid, 'product '||rownum descr
2 from dual connect by level <= 100000;
Table created.
SQL> alter table product add primary key ( pid) ;
Table altered.
SQL>
SQL> create table customer as select rownum cid, 'customer '||rownum descr
2 from dual connect by level <= 100000;
Table created.
SQL> alter table customer add primary key ( cid) ;
Table altered.
SQL>
SQL> create table store as select rownum sid, 'store '||rownum descr
2 from dual connect by level <= 100000;
Table created.
SQL> alter table store add primary key ( sid) ;
Table altered.
SQL>
SQL> create table taxcode as select rownum tid, 'taxcode '||rownum descr
2 from dual connect by level <= 100000;
Table created.
SQL> alter table taxcode add primary key ( tid) ;
Table altered.
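One prerequisite before building the SALES table below: its TXN_ID column will default to txnseq.nextval, so that sequence needs to exist first. A bare-bones definition is all the demo needs:

--
-- The sequence feeding SALES.TXN_ID
--
create sequence txnseq;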
--
-- And my sales table that refers back to them
--
SQL> create table sales
2 ( txn_id int default txnseq.nextval not null ,
3 pid int not null references product(pid),
4 cid int not null references customer(cid),
5 sid int not null references store(sid),
6 tid int not null references taxcode(tid),
7 discount varchar2(1) not null,
8 amount number(10,2) not null,
9 qty number(10) not null,
10 constraint chk1 check ( discount in ('Y','N')),
11 constraint chk2 check ( amount > 0),
12 constraint chk3 check ( qty > 0)
13 );
Table created.
SQL>
SQL> alter table sales add primary key ( txn_id )
2 using index global partition by hash ( txn_id ) partitions 8;
Table altered.
(A side note on the hash-partitioned primary key index: because TXN_ID comes from a sequence, spreading the index across eight partitions reduces leaf block contention when many sessions are inserting at once.)
By my count, I’ve got
- 8 constraints to check for null-ness
- 3 check value constraints
- 1 primary key constraint
- 4 foreign key constraints
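(If you want to double-check that tally, a quick dictionary query does the trick; this sketch assumes you are connected as the table owner:)

select constraint_type, count(*)
from   user_constraints
where  table_name = 'SALES'
group  by constraint_type;

-- 'C' covers the NOT NULL and check constraints, 'P' the primary key,
-- 'R' the foreign keys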
Wow! This thing is constrained to kingdom come! I really should not have bothered proceeding any further, because surely that many constraints make this database totally unusable for anything. (Yes, I know sarcasm is a sign of poor character…but well, I never claimed to have good character.)
Anyway, I’ll persist with what must surely be a hopeless cause. I’ll create a little routine that picks products, stores, customers and tax codes at random, and uses those values to hammer transactions at my SALES table.
SQL> create table hammer_log ( job int, ela interval day to second );
Table created.
SQL>
SQL> create or replace
2 procedure hammer_time is
3 p sys.odcinumberlist := sys.odcinumberlist();
4 c sys.odcinumberlist := sys.odcinumberlist();
5 s sys.odcinumberlist := sys.odcinumberlist();
6 t sys.odcinumberlist := sys.odcinumberlist();
7
8 timer timestamp;
9 begin
10 select trunc(dbms_random.value(1,100000))
11 bulk collect into p
12 from dual connect by level <= 32000;
13
14 select trunc(dbms_random.value(1,100000))
15 bulk collect into c
16 from dual connect by level <= 32000;
17
18 select trunc(dbms_random.value(1,100000))
19 bulk collect into s
20 from dual connect by level <= 32000;
21
22 select trunc(dbms_random.value(1,100000))
23 bulk collect into t
24 from dual connect by level <= 32000;
25
26 timer := localtimestamp;
27 for j in 1 .. 3 loop
28
29 for i in 1 .. 32000
30 loop
31 insert into sales ( pid,cid,sid,tid,discount,amount,qty)
32 values (p(i), c(i), s(i), t(i), 'Y',i,i);
33 commit;
34 end loop;
35 end loop;
36
37 insert into hammer_log values ( sys_context('USERENV','BG_JOB_ID'), localtimestamp-timer);
38 commit;
39 end;
40 /
Procedure created.
Then I’ll use DBMS_JOB to simulate a connection pool throwing transactions at my table. We’ll record the elapsed time for 96,000 transactions per session in our HAMMER_LOG table and see how this poor suffering bucket of constraints performs.
SQL> declare
2 j int;
3 begin
4 for i in 1 .. 12
5 loop
6 dbms_job.submit(j,'hammer_time;');
7 end loop;
8 commit;
9 end;
10 /
PL/SQL procedure successfully completed.
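(If you prefer DBMS_SCHEDULER over the old DBMS_JOB interface, a roughly equivalent sketch is below. One caveat: the BG_JOB_ID context that HAMMER_TIME logs may come back null under the scheduler, so you might log sys_context('USERENV','SID') instead.)

begin
  for i in 1 .. 12 loop
    dbms_scheduler.create_job(
      job_name   => 'HAMMER_'||i,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin hammer_time; end;',
      enabled    => true );   -- enabled jobs start straight away
  end loop;
end;
/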
SQL> select * from hammer_log;
JOB ELA
---------- -----------------------
80999 +00 00:00:19.013000
81005 +00 00:00:19.005000
81000 +00 00:00:19.461000
80998 +00 00:00:19.036000
81003 +00 00:00:19.248000
81004 +00 00:00:18.375000
80996 +00 00:00:19.306000
81002 +00 00:00:19.286000
80997 +00 00:00:18.825000
80995 +00 00:00:18.987000
81006 +00 00:00:19.018000
81001 +00 00:00:19.180000
12 rows selected.
SQL> select count(*)*96000 / extract(second from max(ela))
2 from hammer_log;
COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
59195.3137
Hmmm….not 500 “qps”. No, not 5000 “qps” either.
In fact, with all that constraint “baggage”, I’m still lobbing nearly 60,000 transactions per second at this table!
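(A small caveat on that throughput query: EXTRACT(SECOND FROM …) only returns the seconds component of the interval, so if your jobs take more than a minute you would need to fold the minutes in as well. Something along these lines, remembering the numerator is 12 jobs at 96,000 rows each and the divisor is the slowest job’s elapsed time:)

select count(*)*96000 /
       ( extract(minute from max(ela))*60 + extract(second from max(ela)) ) tps
from   hammer_log;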
I know what you’re thinking …. “He’s using a 48-core Exadata machine”
Well, I didn’t mention that …. because I’m not!
I’m using a 2016 Dell XPS laptop. According to ebay, it’s worth about $600. (That’s not my exact laptop, but it’s the first one I saw that looked close.)
But this isn’t about price…this is about putting to rest fictional claims on Twitter about databases.
“Hold on!” comes your rebuttal… “What about indexes?”
Contrary to popular belief, just because you have foreign keys defined on columns in a table does not mean that you must index those columns. Sometimes you might need to (typically when you delete from the parent tables or update their keys), but those are a fairly niche set of circumstances. But I don’t want my readers thinking I’m pulling a fast one here, so let’s take our SALES table and index those foreign keys up the wazoo, and then redo the benchmark.
SQL> create index sales_ix1 on sales ( pid);
Index created.
SQL> create index sales_ix2 on sales ( sid);
Index created.
SQL> create index sales_ix3 on sales ( cid);
Index created.
SQL> create index sales_ix4 on sales ( tid);
Index created.
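(And to confirm nothing got missed, a simplified dictionary check along these lines lists any foreign key on SALES whose column is not the leading column of some index; it assumes single-column foreign keys, which is all we have here:)

select c.constraint_name, cc.column_name
from   user_constraints c
       join user_cons_columns cc
         on cc.constraint_name = c.constraint_name
where  c.table_name = 'SALES'
and    c.constraint_type = 'R'
and    not exists (
         select null
         from   user_ind_columns ic
         where  ic.table_name      = c.table_name
         and    ic.column_name     = cc.column_name
         and    ic.column_position = 1 );

-- no rows returned means every foreign key column now leads an index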
[re-run the 12 jobs]
SQL> select count(*)*96000 / extract(second from max(ela))
2 from hammer_log;
COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
36080.5342
1 row selected.
What a bummer…Only 36,000 transactions per second now on my 4 year old laptop.
So yes, I will fully concede that
- if you’ve chosen not to educate yourself on foreign key indexing, AND
- you bought your database server off ebay from an Aussie trying to offload a four-year-old laptop, AND
- you have constraints everywhere, AND
- you need more than 36,000 transactions per second…
then yes, you might need to look at options. But probably the first option to explore is not to jump onto Twitter and claim that databases don’t scale.
Case closed.
Hi,
nice one, btw same code (with indexes) on free tier AD:
COUNT(*)*96000/EXTRACT(SECONDFROMMAX(ELA))
------------------------------------------
32852.6935
Regards.
G
Just last week, from an SOA developer: “I don’t want check constraints, as they make my SOA program complex; records would fail and the error handling gets complicated.” Where do they find these people?
Just tried the same code on ATP on the Free Tier and I’m getting 19,238 both before and after the indexes are added.
Not bad for something that is free, no hardware, no license, etc.
[BTW missing create sequence statement before the create table]
and I needed to adjust the query that extracts the transactions per second, as my jobs were taking just over a minute
Ya, well that guy likely thinks that NoSQL is the bestest thing evar…
Omg, I typed “oracle transactions per second” and landed on this page – Connor, you are a celebrity now. 😀
But I am still wondering how good AD is at tx processing and scaling – for example, is it possible to push towards millions of tx/sec with enough OCPU scaling?
(Replying to other comments: AD Always Free is advertised as having only 1/8 of an OCPU.)
For millions per second, you’re probably looking at a more customisable solution, eg https://www.oracle.com/technetwork/database/in-memory/overview/twp-bp-for-iot-with-12c-042017-3679918.html