hash keys … almost unique

Posted by

In the Oracle documentation, the specification for ORA_HASH is:

The optional max_bucket argument determines the maximum bucket value returned by the hash function. You can specify any value between 0 and 4294967295. The default is 4294967295.

So when I had a set of attributes come in, and I was looking for a simple way of mapping them to a simple unique key, ORA_HASH seemed the natural choice. Until this happened…

SQL> create table T (
  2    id varchar2(8),
  3    code varchar2(5),
  4    year varchar2(4)
  5  );

Table created.

SQL> insert into T values ('240301','21806','2004');

1 row created.

SQL> insert into T values ('276083','31808','2010');

1 row created.

SQL> select id, code, year, ora_hash(id||code||year) hash
  2  from t;

ID       CODE  YEAR       HASH
-------- ----- ---- ----------
240301   21806 2004 1018600323
276083   31808 2010 1018600323

My table has only 90,000 rows, so (assuming a roughly equal distribution of hash keys) I had a 0.002% chance of hitting that

Oracle karma 🙂

5 comments

  1. Depending if you want values 2 – 12 – 2004 to have the hash as 21 – 2 – 2004, you probably want separators between the elements being hashed. Don’t think it reduces the chance of an ‘unlucky’ collision, but if you are in a position when you are *only* storing the hash and get this wrong, you’ll be kicking yourself.

  2. Your math is awful. Please look into birthday attack at wikipedia carefully.

    In case the hash function is absolutely equal distributed, your chances to get a collision in 90’000 rows are 61% (!)
    select 1-exp(-power(90000,2)/(2*4294967295)) collision_probability from dual
    => 0.61

    The expected number of rows to get the first collision is just 77200 (select 1.1774*sqrt(4294967295) rows_to_first_collision from dual).

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.