Take care with regular expressions

In an Office Hours session a couple of months back, I covered an important change that comes to regular expressions once you upgrade to 12c Release 2. You can see the video covering the issue here:

but for the TL;DR brigade reading this post: Regular expressions are not deterministic when you take NLS settings into account and thus cannot be used in constraints and/or function-based indexes.

This is just a post to quickly revisit the topic for anyone thinking of upgrading from an earlier release to 12c Release 2. An AskTOM question came in asking what would happen to such constraints during the upgrade process.

The sad answer is … nothing. For example, if you successfully (and by strict definition, incorrectly) created a constraint with a regular expression in 11g, then after the upgrade, that constraint will still be present in your new 12c Release 2 system. It will continue to work as it did in 11g, and even if you disable/enable it, or put it through a validate command to exercise the data, it will work as it did before.

To be honest, I do not like this, because it can become what I call a “sleeper” problem. If, 6 months after you upgrade, you happen to drop and recreate that constraint you’ll be most distressed to find that it cannot be added, and you will have most probably long forgotten that it was caused by an event that occurred 6 months ago, namely the upgrade. And perhaps worse, you now have an index or constraint that could potentially be corrupted by innocent tinkering with session NLS settings.

So before you upgrade, definitely take a cursory glance through your constraint definitions and take remedial action if needed.

7 Comments on “Take care with regular expressions

  1. It isn’t just regular expressions that throw up some funnies with NLS settings.
    Having function based indexes with a TO_CHAR over a date can throw a wobbler too.

    create table date_format_test (id number not null, date_val date not null);

    drop index date_format_test_ix1 ;
    create index date_format_test_ix1 on date_format_test (to_char(date_val,’DD-Mon-YYYY’));

    truncate table date_format_test;

    alter session set nls_date_format=’DD-Mon-YYYY’;
    alter session set nls_calendar=’Gregorian’;

    alter session set nls_language = ‘English’;
    insert into date_format_test (id, date_val) values (1, date ‘2018-12-01’);

    alter session set nls_language = ‘French’;
    insert into date_format_test (id, date_val) values (2, date ‘2018-12-01’);

    alter session set nls_language = ‘German’;
    insert into date_format_test (id, date_val) values (3, date ‘2018-12-01′);

    select /*+ INDEX s*/ to_char(date_val,’DD-Mon-YYYY’) from date_format_test s;

    TO_CHAR(DATE_VAL,’DD-MON-YYYY’)
    01-Dec-2018
    01-Dez-2018
    01-Déc. -2018

    Even purely numeric date masks can be non-deterministic if you want to try the Persian nls_calendar setting.

    • True, but in those cases you could/can ensure the determinism with the use of the NLS arguments that the TO_… functions take. There is no such recourse with regex functions currently.

      • Technically, the TO_CHAR can’t be made deterministic because you can only supply the language as an NLS parameter and not the calendar (which can impact the YYYY result) or the territory (which can impact the single digit D format due to disagreement on when the week starts).

        If Oracle go hardline on not allowing non-deterministic expressions, then TO_CHAR (and conversely TO_DATE) would need to be disallowed. I’ll weigh up using only user-defined functions rather than rely on what Oracle may do with built-ins.

        Anyway, thanks for the timely heads up. Got our 12.1 to 12.2 migration next week and was digging through our DBA_TAB_COLS yesterday.

  2. Hi Chris,

    Interesting post and with the explanation provided it actually makes good sense to not allow REGEXP functions in check constraints. Having a few of these in our system, I went on to try it out in our databases. I was a bit surprised that I was allowed to make this table (and check constraint) on 12.2 and 18:

    create table tab1(col1 varchar2(10) check(regexp_like(col1, ‘^[a-zA-Z]+$’)));

    In your video it is said that REGEXP_LIKE is affected as well. If that is the case, why is the above allowed. What am I missing here?

    • Yeah…that’s a bit nasty…

      SQL> create table t ( x varchar2(100),
      2 constraint CHK check ( regexp_like(x,'[^A-Z]’)));

      Table created.

      SQL>
      SQL> alter session set nls_language = ‘AMERICAN’;

      Session altered.

      SQL> insert into t values (‘AAAÖXXX’);

      1 row created.

      SQL> alter session set nls_language = ‘GERMAN’;

      Session wurde geändert.

      SQL> insert into t values (‘AAAÖXXX’);
      insert into t values (‘AAAÖXXX’)
      *
      FEHLER in Zeile 1:
      ORA-02290: CHECK-Constraint (MCDONAC.CHK) verletzt

      • Is there any reason why REGEXP_LIKE is allowed in creation of a check constraint? If I understand your video correct, it should not be allowed to use REGEXP_LIKE.

        Is it an oversight by development, or should we expect that it will be disallowed in future releases?

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: