Do a quick Google search and you’ll find plenty of blog posts about why GUIDs are superior to integers for a unique identifier, and of course, an equal number of posts about why integers are superior to GUIDs. In the Oracle world, most people have been using sequence numbers since they were pretty much the only option available to us in earlier versions. But developers coming from other platforms often prefer GUIDs simply due to their familiarity with them.
I’m pretty much ambivalent when it comes to which one to use. A good example is the AskTOM database, which had exclusively sequence-based primary keys at inception, but as the database has evolved and developers of different backgrounds and ages have worked on it, there is now a mix of strategies. The older tables have sequence-based primary keys, and many of the newer tables have GUIDs as primary keys. Don’t get me wrong – I’m not advocating that you should have a mix – for a given database schema I’d recommend picking one regime and sticking with it. But my point is that even with the mix of approaches in the AskTOM schema, I’ve never encountered any problems or issues with either.
However, there is one use case where I would strongly recommend some caution with GUIDs, and that is systems that load data in bulk (e.g. data warehouses).
GUIDs are not cheap to generate. A quick look at the history of the structure and generation of unique IDs at https://en.wikipedia.org/wiki/Universally_unique_identifier shows that they all come down to a common component – the need to generate a good random number, and “good” can be a struggle for computers because you need algorithms that ensure sufficient randomness and distribution of the generated values. That takes CPU cycles, and whilst you will never notice it when using a GUID for the 100 customers in your customer table, it definitely will be noticeable if you attach a GUID to every one of your 10 million daily sales transactions, or telephone records, or similar.
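For context, in Oracle a GUID comes from the built-in SYS_GUID() function, which returns a globally unique 16-byte RAW value on every call:

```sql
-- Each call to SYS_GUID() generates a fresh, globally unique 16-byte RAW value,
-- typically displayed as 32 hex characters
select sys_guid() from dual;
```

It is that per-call generation cost that we are about to measure at scale.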
Here’s a simple example where I’ll load 100 million rows into a table. First I’ll try the conventional (when it comes to Oracle) approach of using a sequence number to uniquely identify each row.
SQL> create sequence seq cache 50000;
Sequence created.
SQL>
SQL> create table t1
2 ( pk int,
3 data1 int,
4 data2 varchar2(10),
5 data3 int,
6 data4 varchar2(10)
7 )
8 tablespace demo
9 /
Table created.
SQL>
SQL> set timing on
SQL> insert /*+ APPEND */ into t1
2 select seq.nextval, int_val, char_val, int_val, char_val
3 from
4 ( select level int_val, to_char(level) char_val from dual connect by level <= 10000 ),
5 ( select 1 from dual connect by level <= 10000 );
100000000 rows created.
Elapsed: 00:03:31.42
SQL>
SQL> commit;
Commit complete.
Elapsed: 00:00:00.01
SQL>
3 minutes 30 seconds for 100 million rows is pretty good performance for a laptop, although obviously the table structure here is very simple.
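One note on the sequence test: the CACHE 50000 clause on the sequence is not incidental. Each time a sequence exhausts its cache, Oracle must update the sequence metadata in the data dictionary, so a small cache adds overhead during a bulk load. A sketch of the two extremes (the sequence names here are just illustrative):

```sql
-- Oracle's default cache size is only 20, which forces a data dictionary
-- update every 20 calls to NEXTVAL - noticeable during a bulk load
create sequence seq_small cache 20;

-- A large cache amortises that cost across 50,000 calls
create sequence seq_large cache 50000;
```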
And now I’ll repeat the exercise with the same table structure, but using a RAW column to hold the output of a call to SYS_GUID() (which returns a 16-byte RAW value).
SQL>
SQL>
SQL> create table t2
2 ( pk raw(20),
3 data1 int,
4 data2 varchar2(10),
5 data3 int,
6 data4 varchar2(10)
7 )
8 tablespace demo
9 /
Table created.
SQL>
SQL> set timing on
SQL> insert /*+ APPEND */ into t2
2 select sys_guid(), int_val, char_val, int_val, char_val
3 from
4 ( select level int_val, to_char(level) char_val from dual connect by level <= 10000 ),
5 ( select 1 from dual connect by level <= 10000 );
100000000 rows created.
Elapsed: 00:30:56.78
SQL> commit;
Commit complete.
Elapsed: 00:00:00.03
That’s right – we’ve blown out to over 30 minutes. As you can see, there can be a large cost when the row volumes (and hence the number of calls to generate a GUID) get large. We can even take the INSERT out of the equation and simply run a raw stress test to see how quickly we can generate GUIDs from the SQL engine, using the following test harness.
SQL> create table t ( sz int, dur interval day to second );
Table created.
SQL>
SQL>
SQL> declare
2 ts_start timestamp;
3 ts_end timestamp;
4 iter int;
5 dummy raw(32);
6 begin
7 for i in 1 .. 8 loop
8 iter := power(10,i);
9
10 ts_start := systimestamp;
11 if iter <= 10000 then
12 select max(x) into dummy from
13 (
14 select sys_guid() x from
15 ( select 1 from dual connect by level <= iter )
16 );
17 else
18 select max(x) into dummy from
19 (
20 select sys_guid() x from
21 ( select 1 from dual connect by level <= iter/10000 ),
22 ( select 1 from dual connect by level <= 10000 )
23 );
24 end if;
25
26 ts_end := systimestamp;
27 insert into t values (iter, ts_end - ts_start );
28 commit;
29
30 end loop;
31 end;
32 /
PL/SQL procedure successfully completed.
SQL>
SQL> select * from t order by 1;
SZ DUR
---------- ----------------------------------------------------------
10 +00 00:00:00.000000
100 +00 00:00:00.000000
1000 +00 00:00:00.015000
10000 +00 00:00:00.172000
100000 +00 00:00:01.607000
1000000 +00 00:00:16.083000
10000000 +00 00:02:49.713000
100000000 +00 00:26:46.570000
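Putting the largest run above on a per-call basis (a rough back-of-the-envelope calculation only):

```sql
-- 100 million GUIDs in 26m 46.57s works out to roughly 16 microseconds
-- per SYS_GUID() call on this (2018-era) hardware
select round(((26*60 + 46.57) / 100000000) * 1e6, 1) as microsec_per_call
from   dual;
```

Sixteen microseconds sounds tiny, but multiplied by hundreds of millions of rows it becomes the half-hour we saw in the load test.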
I’m not trying to scare you off GUIDs – but like any functionality or feature you use to build applications, make sure you test it against the business requirements you need to satisfy, and make an informed decision on how best to use (or not use) it.
Addendum: Sep 2025
Roman Lemeshko reached out on LinkedIn with some new numbers, so I thought I’d update this post with fresh 2025 results from the anonymous block above.
Looks like Linux has had a spruce-up – my guess is that we’re now calling an OS-native function.
SQL> select * from t;
SZ DUR
---------- --------------------
10 +00 00:00:00.003195
100 +00 00:00:00.000120
1000 +00 00:00:00.000367
10000 +00 00:00:00.002238
100000 +00 00:00:00.011463
1000000 +00 00:00:00.073318
10000000 +00 00:00:00.701534
100000000 +00 00:00:06.823555
But perhaps not so for Windows. The result below is from my current desktop, which is way faster than the machine I had in 2018.
SQL> select * from t;
SZ DUR
---------- --------------------
10 +00 00:00:00.001000
100 +00 00:00:00.002000
1000 +00 00:00:00.022000
10000 +00 00:00:00.214000
100000 +00 00:00:02.145000
1000000 +00 00:00:21.445000
10000000 +00 00:03:37.106000
(I gave up at that point)