Before you case-insensitive your whole database!

Posted by

(Familiar with case-insenstive collation already? Skip ahead to the “Side Effects”)

A history of failed applications

Before Oracle 12, a common cause of frustration was how to easily handle the case of character data in your applications. It is day 1 of your application being live, and a simple query on your database suggests that all is going well so far


SQL> select * 
  2  from   customers 
  3  where  cust_name = 'ADAMS';

COUNTRY      CREATED   CUST_NAME
------------ --------- ---------------------
AUS          08-NOV-16 ADAMS

Then a few minutes, the phone starts to ring, and people are complaining that their records are not being displayed, or the screens are crashing with duplicate errors, and your day sounds like it is about to get a whole lot worse. You make a small adjustment to your query and the cause is revealed:


SQL> select * 
  2  from   customers 
  3  where  upper(cust_name) = 'ADAMS';

COUNTRY      CREATED   CUST_NAME
------------ --------- ------------
AUS          07-NOV-16 Adams
AUS          08-NOV-16 ADAMS
AUS          09-NOV-16 adams

Maybe having more than one ADAMS is OK, maybe its not, but the most common scenario is that some of your development team were expecting all the data to be mixed case, and some others in the team were expecting it all to be upper case; the classic “left hand not knowing what the right hand is doing” problem. For the same reason, you’ll probably quickly find that half of the application is running fine, and the other half is not, because of your indexing strategy


SQL> select column_name
  2  from   user_ind_columns
  3  where  index_name = 'CUST_IX';

COLUMN_NAME
------------------------------
CUST_NAME

SQL> select * from customers
  2  where upper(cust_name) = 'ADAMS';

-------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes |
-------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |   152 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |     1 |   152 |
-------------------------------------------------------

Rather than refactor great chunks of the application, you’ll probably take the quick fix and double up on indexes on the database. Ugh!


SQL> create index cust_ix 
  2     on customers ( cust_name );

Index created.

SQL> create index cust_ix2
  2    on customers ( upper(cust_name) );

Index created.

Now your query performance is better, but the one thing that always comes back to bite us when we opt for quick fixes to the problems, is not exploring the potential side effects. For example, in our hypothetical application, you’re probably going to get another out of hours phone call from the DBA after you add this index when you discover


SQL> alter table customers shrink space;

*
ERROR at line 1:
ORA-10631: SHRINK clause should not be specified for this object

Yup…no shrink command on a table with a function based index.

In Oracle 12, we introduced better handling of case-insensitivity with the BINARY_CI collation. I recreate my CUSTOMERS table with the new collation clause on the CUSTOMER_NAME column


SQL> CREATE TABLE CUSTOMERS
  2  (
  3    COUNTRY   VARCHAR2(128),
  4    CREATED   DATE,
  5    CUST_NAME VARCHAR2(150) COLLATE BINARY_CI
  6  );

Table created.

And now with a single index, I can get case-insensitive results even through my predicate only nominated one case.


SQL> create index cust_ix
  2    on customers ( cust_name);

Index created.

SQL> set autotrace traceonly explain
SQL> select * from customers
  2  where cust_name = 'ADAMS';

-----------------------------------------------------------------
| Id  | Operation                           | Name      | Rows  |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT                    |           |     1 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS |     1 |
|*  2 |   INDEX RANGE SCAN                  | CUST_IX   |     1 |
-----------------------------------------------------------------

SQL> select * from customers
  2  where cust_name = 'ADAMS';

COUNTRY      CREATED   CUST_NAME
------------ --------- ----------------
AUS          07-NOV-16 Adams
AUS          08-NOV-16 ADAMS
AUS          09-NOV-16 adams

Side Effects

Once developers have seen how easy it is to make columns case-insensitive in the new collation functionality, those who have come from platforms where case-insensitivity is the default (I’m looking squarely at you Windows file systems! Smile) are easily tempted to rush out and throw a BINARY_CI as the default at schema level and thus “magically” make your entire application case-insensitive.

Here is why you might want to have a re-think on that approach. In particular, just like the SHRINK SPACE side-effect of function-based indexes, the collation clauses also have side effects that you should be aware of before you head down that path.

For this example, I’ll create a table of employees, but I have drunk the collation Kool-aid and my names are going to be case-insensitive.


SQL> create table CI_EMP (
  2    emp_id      int,
  3    first_name  varchar2(30) collate binary_ci,
  4    last_name   varchar2(30) collate binary_ci,
  5    salary      int,
  6    manager_id  int);

Table created.

I’ll insert some sample data from the standard HR sample schema, and a few queries shows that my case-insensitive queries are working as expected


SQL> insert into CI_EMP
  2  select employee_id, first_name, last_name, salary, manager_id
  3  from   hr.employees;

107 rows created.

SQL>
SQL> select * from CI_EMP where rownum <= 10;

    EMP_ID FIRST_NAME                     LAST_NAME                          SALARY MANAGER_ID
---------- ------------------------------ ------------------------------ ---------- ----------
       100 Steven                         King                                24000
       101 Neena                          Kochhar                             17000        100
       102 Lex                            De Haan                             17000        100
       103 Alexander                      Hunold                               9000        102
       104 Bruce                          Ernst                                6000        103
       105 David                          Austin                               4800        103
       106 Valli                          Pataballa                            4800        103
       107 Diana                          Lorentz                              4200        103
       108 Nancy                          Greenberg                           12008        101
       109 Daniel                         Faviet                               9000        108

10 rows selected.

SQL> select * from CI_EMP where last_name = 'Lorentz';

    EMP_ID FIRST_NAME                     LAST_NAME                          SALARY MANAGER_ID
---------- ------------------------------ ------------------------------ ---------- ----------
       107 Diana                          Lorentz                              4200        103

1 row selected.

SQL> select * from CI_EMP where last_name = 'LORENTZ';

    EMP_ID FIRST_NAME                     LAST_NAME                          SALARY MANAGER_ID
---------- ------------------------------ ------------------------------ ---------- ----------
       107 Diana                          Lorentz                              4200        103

1 row selected.

Now a new application requirement comes in – we’d like a new column called FULL_NAME which is the simple concatenation of the FIRST_NAME and LAST_NAME.

“No problem!” I say. I’ll simply throw together a quick trigger to handle that


SQL> alter table CI_EMP add full_name varchar2(80);

Table altered.

SQL>
SQL> create or replace
  2  trigger CI_EMP_FULL_NAME
  3  before insert on CI_EMP
  4  for each row
  5  begin
  6    :new.full_name := :new.first_name||' '||:new.last_name;
  7  end;
  8  /

Warning: Trigger created with compilation errors.

SQL> sho err
Errors for TRIGGER CI_EMP_FULL_NAME:

LINE/COL ERROR
-------- -----------------------------------------------------------------
2/21     PLS-00049: bad bind variable 'NEW.FIRST_NAME'
2/43     PLS-00049: bad bind variable 'NEW.LAST_NAME'

It can often be the case that our first cut at trigger code results in a compilation error due to our keyboard skills and the like, but the SHOW ERRORS command reveals that there is more to this error. Triggers (and PL/SQL code in general) are not going to play nicely with collations that deviate from the norm.  This is thoroughly covered in the official docs.

image

Workarounds

For this simple requirement of an expression on existing columns, we could do away with the need of the trigger altogether and just use a virtual column:


SQL> alter table CI_EMP drop column full_name;

Table altered.

SQL>
SQL> alter table CI_EMP add full_name varchar2(80)
  2    generated always as ( first_name||' '||last_name );

Table altered.

But in the general case where a trigger contains extensive logic this may not be appropriate. To handle these situations, the key is to remember that PL/SQL is not going to handle the values, and thus you need to be using SQL to perform any logic.  For example, I can use a compound trigger to move the :NEW logic to a SQL statement in the AFTER STATEMENT section


SQL> alter table CI_EMP drop column full_name;

Table altered.

SQL>
SQL> alter table CI_EMP add full_name varchar2(80) ;

Table altered.

SQL>
SQL> drop trigger CI_EMP_FULL_NAME;

Trigger dropped.

SQL>
SQL> create or replace
  2  trigger CI_EMP_FULL_NAME
  3  for insert on CI_EMP
  4  compound trigger
  5
  6    pk sys.odcinumberlist := sys.odcinumberlist();
  7
  8    before each row is
  9    begin
 10      pk.extend;
 11      pk(pk.count) := :new.emp_id;
 12    end before each row;
 13
 14
 15    after statement is
 16    begin
 17      update ci_emp
 18      set    full_name = first_name||' '||last_name
 19      where  emp_id in ( select column_value from table(pk));
 20    end after statement;
 21
 22  end;
 23  /

Trigger created.

SQL>
SQL> insert into ci_emp (emp_id,first_name,last_name,salary,manager_id)
  2  values (1000,'Connor','McDonald',0,100);

1 row created.

SQL>
SQL> select * from ci_emp
  2  where first_name = 'Connor';

    EMP_ID FIRST_NAME                     LAST_NAME                          SALARY MANAGER_ID FULL_NAME
---------- ------------------------------ ------------------------------ ---------- ---------- ----------------------------------------
      1000 Connor                         McDonald                                0        100 Connor McDonald

1 row selected.

but such solutions might not always be possible, in particular, if you need to reference :OLD values.

Key Takeaway

Heading down the path of case-insensitivity might require some changes in how you design your applications in order to bypass the PL/SQL layer for certain components. Note this is a PL/SQL restriction, not a TRIGGER restriction. Many of the cool things we take for granted in PL/SQL may not be available when you start modifying collations for columns.


SQL> declare
  2    myrow ci_emp%rowtype;
  3  begin
  4    select *
  5    into   myrow
  6    from   CI_EMP
  7    where  last_name = 'Lorentz';
  8  end;
  9  /
  myrow ci_emp%rowtype;
        *
ERROR at line 2:
ORA-06550: line 2, column 9:
PLS-00762: %TYPE or %ROWTYPE must have the USING_NLS_COMP collation
ORA-06550: line 0, column 0:
PL/SQL: Compilation unit analysis terminated

As always…test thoroughly.

One comment

  1. Excellent article as always. It’s kind of like when someone tells you to use compress on all your tables, then later you have to add a column to that table…then you learn how to uncompress tables.

    Before getting to your re-written trigger I thought you were going to suggest a virtual column for the full name.

Got some thoughts? Leave a comment

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.