DDL for constraints – subtle things

The DBMS_METADATA package is very cool. I remember the days of hand-crafting DDL statements based on queries to the data dictionary, and many a DBA will be familiar with running “imp show=y” or “imp indexfile=…” and then laboriously extracting the required DDL from the import log file.  DBMS_METADATA removed all of those annoyances and gave us a simple API to get the true and complete DDL for a database object.
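
For example, pulling the complete DDL for a table is a single query (a trivial sketch – EMP here is just a stand-in for any table in your schema):

select dbms_metadata.get_ddl('TABLE','EMP') from dual;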

But when extracting DDL from the database using the DBMS_METADATA package, you need to be aware of some subtleties especially if you plan on executing that DDL in the database.

Consider this example – I have a few tables and I want to extract the referential integrity constraints for one of them. Being a good cautious developer, I’ll just output the DDL first before attempting to execute any of the statements:


SQL> create table tab1(id number, name varchar2(100),
  2                      constraint pk_tab1_id primary key(id));

Table created.

SQL> create table tab2(id number, name varchar2(100),
  2                      constraint pk_tab2_id primary key(id));

Table created.

SQL> create table tab3(id number, name varchar2(100), int_id number,
  2                      constraint pk_tab3_id primary key(id),
  3                      constraint fk_tab1_id foreign key(int_id) references tab1(id),
  4                      constraint fk_tab2_id foreign key(int_id) references tab2(id));

Table created.

SQL>
SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4          dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line(i.ddl);
 10            --execute immediate i.ddl;
 11    end loop;
 12  end;
 13  /

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

That all looks fine – I have my two ALTER statements ready to go.  So now I’ll comment the ‘execute immediate’ call back in, and all should be well.


SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4             dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line(i.ddl);
 10            execute immediate i.ddl;
 11    end loop;
 12  end;
 13  /

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE
begin
*
ERROR at line 1:
ORA-01735: invalid ALTER TABLE option
ORA-06512: at line 10
ORA-06512: at line 10

That seems an odd result. Since in this simple example I’m just running the commands straight back into the same database, I might have expected a “constraint already exists” or “object name exists” style of error, but this is different. This error is telling us that the statement is invalid – which obviously should not be the case if it came straight out of DBMS_METADATA. But a simple amendment to my anonymous block will reveal the answer: I will output a separator line each time I cycle through the cursor loop.



SQL> set serverout on
SQL> begin
  2    for i in (
  3      select t.table_name,
  4             dbms_metadata.get_dependent_ddl('REF_CONSTRAINT', table_name) ddl
  5      from   user_tables t
  6      where  table_name = 'TAB3'
  7      )
  8    loop
  9            dbms_output.put_line('========================================');
 10            dbms_output.put_line(i.ddl);
 11            --execute immediate i.ddl;
 12    end loop;
 13  end;
 14  /
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE

ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

And therein lies the issue. I got two ALTER commands back on screen, but in reality they both came from a single row fetched from the cursor. The two ALTER commands were separated by a carriage return, so if I try to execute that single row, the statement is invalid because it is an attempt to run two ALTER commands in a single statement. The output looked like two rows from the cursor, but it was not.

That is a problem if I want to store that DDL in a table or a file, because I can’t use it as it currently stands, and I don’t want to have to write scripts that parse the DDL to add semicolons or split it into multiple commands – one of the motivations for DBMS_METADATA in the first place was to avoid all that irritation.
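
(As an aside, if the goal is simply to spool runnable scripts to a file, DBMS_METADATA can be asked to append statement terminators via its documented SQLTERMINATOR transform parameter – a sketch below – but that still leaves both ALTER commands inside a single returned value, so it does not solve the problem here.)

-- Sketch: ask DBMS_METADATA to append a terminator to each generated statement
begin
  dbms_metadata.set_transform_param(
       dbms_metadata.session_transform, 'SQLTERMINATOR', true );
end;
/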

There is an easy fix for this. Rather than getting the dependent DDL for the table, we can get the “direct” DDL for the constraints themselves.  That way, the cursor returns one row per constraint, and hence one DDL statement per row as well.



SQL> set serverout on
SQL> begin
  2    for i in (
  3         select t.table_name,
  4                dbms_metadata.get_ddl('REF_CONSTRAINT', constraint_name) ddl
  5         from user_constraints t
  6         where table_name = 'TAB3'
  7         and constraint_type = 'R'
  8      )
  9    loop
 10            dbms_output.put_line('========================================');
 11            dbms_output.put_line(i.ddl);
 12            --execute immediate i.ddl;
 13    end loop;
 14  end;
 15  /
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB1_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB1" ("ID") ENABLE
========================================

  ALTER TABLE "MCDONAC"."TAB3" ADD CONSTRAINT "FK_TAB2_ID" FOREIGN KEY ("INT_ID") REFERENCES "MCDONAC"."TAB2" ("ID") ENABLE

PL/SQL procedure successfully completed.

The simplest things…can be risky

Java and Oracle expert Lukas Eder tweeted yesterday about a potential optimization that could be done when reviewing database SQL code.

[Tweet from Lukas Eder suggesting that two consecutive UPDATE statements against the same table be consolidated into a single UPDATE.]

This looks to be a logical thing to do.  Why scan the table T twice to perform the updates, when the same job could be done in a single pass?  The benefits seem obvious:

  • less I/O work
  • less time the data spends locked
  • less risk of an error between the two operations

So don’t get me wrong – the consolidation is going to be a good thing in the majority of cases.

And therein lies the rub – the “majority” of cases is not the same as “all” cases, and that is why I don’t think a tool should ever perform this change automatically. I’d be cool with a tool making a recommendation, but let’s see why you cannot just assume that the consolidation is correct.

Here’s our table with a single row, and a single business rule implemented with a check constraint:



SQL> create table t ( a int, b int );

Table created.

SQL> alter table t add constraint chk check ( a < b ) ;

Table altered.

SQL> insert into t values (1,2);

1 row created.

SQL> commit;

Commit complete.

Now I’ll implement the application in the original “unoptimized” way:


SQL> update t set a = a + 1;
update t set a = a + 1
*
ERROR at line 1:
ORA-02290: check constraint (MCDONAC.CHK) violated


SQL> update t set b = b*2;

1 row updated.

You can see that the first update failed – it violated the check constraint. Admittedly, it is not definitively clear from the business requirements whether this should be the case, because I haven’t elaborated on whether these two updates should be two transactions or a single transaction. But correctness is not really the point I’m trying to make here; the point is that if I now choose to consolidate the updates, I end up with different application behaviour.

I’ll roll back the change above, and repeat the experiment using the consolidated update:


SQL> roll;
Rollback complete.
SQL> update t set a = a + 1, b = b*2;

1 row updated.

This time the update completes successfully. If a tool had made this change automatically, I would get different behaviour in my application. That might be a good thing…it might not be. I could eliminate the difference by making the constraint deferrable and deferring it, but now we’re departing even further from the existing implementation of the application code, which means more scrutiny and more regression testing.
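
For illustration, here is a minimal sketch of that deferrable approach – the constraint is then validated at commit time rather than per statement, so the two-statement and consolidated versions behave the same within a transaction:

-- Sketch: recreate CHK as deferrable, deferring validation until COMMIT
alter table t drop constraint chk;
alter table t add constraint chk check ( a < b ) deferrable initially deferred;

update t set a = a + 1;   -- no immediate ORA-02290 now
update t set b = b*2;
commit;                   -- the constraint is checked here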

So by all means, explore opportunities to improve the performance of your SQL by re-arranging it, consolidating it, and aiming to get more done with less work. But be careful that you do not unknowingly change the way your application works when you do so.

 

NULLs vs NOT NULLs and Performance

When it comes to giving the cost based optimizer the best possible chance of making the “right” decisions, many DBAs are diligent in keeping statistics up to date, using histograms where appropriate, and creating more indexes (or removing surplus ones).

However, one often neglected area is that the null-ness of columns also impacts optimizer decisions. NULL and NOT NULL do more than just act as constraints; they also add to (or detract from) the value of indexes on those columns. The key fact is that rows in which every indexed column is null do not appear in a B-tree index at all, so the optimizer can only use an index to retrieve every row of a table if it knows the indexed column contains no nulls. Here’s an example of how the null-ness of a column impacts optimizer decisions. I have a table T which is a copy of DBA_OBJECTS, indexed on OBJECT_ID.


SQL> create table t as
  2  select * from dba_objects
  3  where object_id is not null;

Table created.

SQL>
SQL> create index IX on T ( object_id );

Index created.

Now I want to perform a standard pagination-style query, namely, get the first 5 rows in order of OBJECT_ID:


SQL>
SQL> set autotrace traceonly
SQL> select *
  2  from
  3    ( select * from t
  4      order by object_id
  5    )
  6  where rownum <= 5;

5 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3299198703

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     5 |  2405 |       |  2755   (1)| 00:00:01 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 78750 |    36M|       |  2755   (1)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY|      | 78750 |     9M|    14M|  2755   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL   | T    | 78750 |     9M|       |   428   (1)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)
   3 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
         52  recursive calls
        130  db block gets
       1591  consistent gets
          0  physical reads
      25420  redo size
       2735  bytes sent via SQL*Net to client
        607  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

I scanned the entire table and burned around 1500 consistent gets. Now I’ll let the database know what I already know, namely that the OBJECT_ID column contains no nulls. Then I’ll repeat the same query.


SQL>
SQL> alter table T modify object_id not null;

Table altered.

SQL>
SQL> select *
  2  from
  3    ( select * from t
  4      order by object_id
  5    )
  6  where rownum <= 5;

5 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3114946973

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     5 |  2405 |     3   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |      |       |       |            |          |
|   2 |   VIEW                        |      |     5 |  2405 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T    | 78750 |     9M|     3   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | IX   |     5 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
         85  recursive calls
        132  db block gets
         72  consistent gets
          1  physical reads
      27192  redo size
       2735  bytes sent via SQL*Net to client
        607  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          5  rows processed

Woo hoo! Around 20 times more efficient (72 consistent gets versus 1591). The extra information we provided allowed the optimizer to consider more execution paths – in this case an INDEX FULL SCAN that can stop after the first five rows.

That’s all well and good. But how do we know which columns might be missing an appropriate NOT NULL constraint?

Well, the following routine might help.

The PL/SQL procedure below deduces a “number of rows that are null” count for all indexed columns of all tables in the schema (passed as parameter P_SCHEMA), although you can pass a table name to restrict the procedure to just that table. For any column that contains no nulls, the appropriate “alter table … modify ( column_name not null )” command is output. (Make sure you set serveroutput on.)

Notes:

  • The client for whom I wrote this script had no NOT NULL constraints on any table, so the procedure only takes a rough stab at ALL_CONSTRAINTS for existing constraints.
  • It tries to keep things sensible – avoiding complex data types, tables that are auto-generated or used for queues, and so on – but there’s no guarantee it won’t either miss a table, or pick up one it should not.
  • This is a brute force approach – it works by scanning every table in the schema, so use your discretion as to when would be a suitable time to run this routine.  It will, however, only scan each table once to determine the null count for all candidate columns (see the sketch after these notes).
  • As with any diagnosis script, you should not apply its recommendations without some careful thought first.
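
To make the mechanics concrete, for a table with (say) two candidate indexed columns C1 and C2, the procedure builds and executes an anonymous block along these lines (a sketch of the generated code – the table and column names are placeholders):

declare
  v1 number; v2 number;
begin
  -- one scan of the table produces a null count for every candidate column
  select sum(decode(C1,null,1,0)),
         sum(decode(C2,null,1,0))
  into   v1, v2
  from   SCOTT.MY_TABLE;
  if v1 = 0 then dbms_output.put_line('alter table SCOTT.MY_TABLE modify (C1 not null);'); end if;
  if v2 = 0 then dbms_output.put_line('alter table SCOTT.MY_TABLE modify (C2 not null);'); end if;
end;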

12.2 version


create or replace 
procedure check_indexed_columns_for_null(
                 p_schema varchar2, 
                 p_table_name varchar2 default null) is
  cursor x is
   select 
     table_name,
     column_name,
     count(*) over ( partition by table_name ) as colcount
     from
     (
       select 
         table_name,
         column_name,
         min(existing_constraint)
       from 
       (
       select  
           a.table_name, 
           a.column_name, 
           ( select  count(*)
             from    all_constraints x,
                     all_cons_columns cc
             where   x.owner = c.owner
             and     x.table_name = c.table_name
             and     cc.owner      = x.owner
             and     cc.constraint_name = x.constraint_name
             and     
            (
               ( x.constraint_type = 'C'  and replace(search_condition_vc,'"') = a.column_name||' IS NOT NULL' )
                   or 
               ( x.constraint_type = 'P'  and cc.column_name = a.column_name )
             )
            ) existing_constraint
       from 
         all_ind_columns a,
         all_tables c,
         all_tab_columns ccol
       where a.index_owner = p_schema
       and a.table_name = nvl(upper(p_table_name),a.table_name)
       and c.table_name = a.table_name
       and c.owner      = a.table_owner
       and c.owner      = ccol.owner
       and c.table_name = ccol.table_name
       and a.column_name = ccol.column_name
       and c.secondary = 'N'
       and c.temporary = 'N'
       and c.nested    = 'NO'
       and c.external  = 'NO'
       and ccol.data_type_owner is null
       and ccol.data_type not in ('LONG','LONG RAW','CLOB','UROWID','UNDEFINED','NCLOB','BLOB','BFILE','ROWID')
       and (c.owner,c.table_name) not in ( select owner, queue_table from all_queue_tables where owner = p_schema )
       )
       group by 
         table_name,
         column_name
       having min(existing_constraint) = 0
     );

  str0 varchar2(32767); 
  str1 varchar2(32767); 
  str2 varchar2(32767); 
  str3 varchar2(32767);

  prev varchar2(100) := '*';
  cnt number;
  trailer varchar2(5);

procedure do_sql(thesql varchar2) is
begin
  -- dbms_output.put_line(thesql);
  execute immediate thesql;
end;

begin
for i in x loop
  if prev != i.table_name then
    str0 := 'declare ';
    str1 := 'begin select '; str2 := ' into ';
    str3 := ' '; cnt := 1;
  end if;
  if cnt = i.colcount then 
    trailer := ' ';
  else
    trailer := ','||chr(10);
  end if;
  str0 := str0 || 'v'||ltrim(cnt)||' number;';
  str1 := str1 || 'sum(decode('||i.column_name||',null,1,0))'||trailer;
  str2 := str2 || 'v'||ltrim(cnt)||trailer;
  str3 := str3 || 'if v'||ltrim(cnt)||' = 0 then '||
    'dbms_output.put_line(''alter table '||p_schema||'.'||i.table_name||
    ' modify ('||i.column_name||' not null);''); end if;'||chr(10);
  if cnt = i.colcount then
    str2 := str2 ||' from '||p_schema||'.'||i.table_name||';';
    str3 := str3 ||' end;';
    do_sql(str0||' '||str1||' '||str2||' '||str3);
  end if;
  prev := i.table_name;
  cnt := cnt + 1;
end loop;
end;
/
sho err

set serverout on
exec check_indexed_columns_for_null('SCOTT')

11.2 version

(Needs a direct CREATE TABLE grant to the owner of the procedure)


create or replace
procedure check_indexed_columns_for_null(
                 p_schema varchar2,
                 p_table_name varchar2 default null) is
  cursor c_template is
    select table_name, column_name, 0 colcount
    from   all_tab_columns;

  type rowlist is table of c_template%rowtype;  
  r    rowlist;
  rc   sys_refcursor;

  str0 varchar2(32767);
  str1 varchar2(32767);
  str2 varchar2(32767);
  str3 varchar2(32767);

  prev varchar2(100) := '*';
  cnt number;
  trailer varchar2(5);

procedure do_sql(thesql varchar2) is
begin
  dbms_output.put_line(thesql);
  execute immediate thesql;
end;

begin

  begin
    execute immediate 'drop table tmp$cons purge';
  exception when others then null;
  end;

  execute immediate 'create table tmp$cons as  select owner, table_name, constraint_name, constraint_type, to_lob(search_condition) search_condition_vc '||
                    'from all_constraints';

  open rc for
    q'{select
      table_name,
      column_name,
      count(*) over ( partition by table_name ) as colcount
      from
      (
        select
          table_name,
          column_name,
          min(existing_constraint)
        from
        (
        select
            a.table_name,
            a.column_name,
            ( select  count(*)
              from    ( select owner, table_name, constraint_name, constraint_type, cast(search_condition_vc as varchar2(4000)) search_condition_vc 
                        from tmp$cons ) x,
                      all_cons_columns cc
              where   x.owner = c.owner
              and     x.table_name = c.table_name
              and     cc.owner      = x.owner
              and     cc.constraint_name = x.constraint_name
              and
             (
                ( x.constraint_type = 'C'  and replace(search_condition_vc,'"') = a.column_name||' IS NOT NULL' )
                    or
                ( x.constraint_type = 'P'  and cc.column_name = a.column_name )
              )
             ) existing_constraint
        from
          all_ind_columns a,
          all_tables c,
          all_tab_columns ccol
        where a.index_owner = :p_schema
        and a.index_owner = :p_schema
        and a.table_name = nvl(upper(:p_table_name),a.table_name)
        and c.table_name = a.table_name
        and c.owner      = a.table_owner
        and c.owner      = ccol.owner
        and c.table_name = ccol.table_name
        and a.column_name = ccol.column_name
        and c.secondary = 'N'
        and c.temporary = 'N'
        and c.nested    = 'NO'
        and (c.owner,c.table_name) not in ( select owner, table_name from all_external_tables where owner = :p_schema )
        and ccol.data_type_owner is null
        and ccol.data_type not in ('LONG','LONG RAW','CLOB','UROWID','UNDEFINED','NCLOB','BLOB','BFILE','ROWID')
        and (c.owner,c.table_name) not in ( select owner, queue_table from all_queue_tables where owner = :p_schema )
        )
        group by
          table_name,
          column_name
        having min(existing_constraint) = 0
      )
      }' using p_schema,p_schema,p_table_name,p_schema,p_schema;

  fetch rc bulk collect into r;
  close rc;
  
  for i in 1 .. r.count loop
    if prev != r(i).table_name then
      str0 := 'declare ';
      str1 := 'begin select '; str2 := ' into ';
      str3 := ' '; cnt := 1;
    end if;
    if cnt = r(i).colcount then
      trailer := ' ';
    else
      trailer := ','||chr(10);
    end if;
    str0 := str0 || 'v'||ltrim(cnt)||' number;';
    str1 := str1 || 'sum(decode('||r(i).column_name||',null,1,0))'||trailer;
    str2 := str2 || 'v'||ltrim(cnt)||trailer;
    str3 := str3 || 'if v'||ltrim(cnt)||' = 0 then '||
      'dbms_output.put_line(''alter table '||p_schema||'.'||r(i).table_name||
      ' modify ('||r(i).column_name||' not null);''); end if;'||chr(10);
    if cnt = r(i).colcount then
      str2 := str2 ||' from '||p_schema||'.'||r(i).table_name||';';
      str3 := str3 ||' end;';
      do_sql(str0||' '||str1||' '||str2||' '||str3);
    end if;
    prev := r(i).table_name;
    cnt := cnt + 1;
  end loop;
end;
/

 

If you want to see the underlying table scan queries that are being run, simply comment the “dbms_output.put_line” in the DO_SQL subroutine back in (it is commented out in the 12.2 version; the 11.2 version prints the statements already).

Enjoy!

Please keep your foreign keys

I came across an interesting blog post the other day about whether databases should be (declaratively) enforcing the foreign key relationships between tables.  The blog post discussed the issue of foreign keys being temporarily disabled to perform data loading, and then encountering the problem of what to do when those foreign keys cannot be re-enabled due to bad data.  Perhaps they should just be omitted altogether?  I don’t want to put words in the author’s mouth, because he stressed he was not taking sides in the “should we” or “shouldn’t we” debate on declarative foreign keys, but the concluding part of the blog was:

[Image of the blog post’s concluding remarks, suggesting that the foreign keys might as well be left out.]

I find that to be a bit like saying:

“Well, seat belts are compulsory but people are still getting injured in car accidents, so you may as well just not bother with seat belts at all”

So here’s some information from the other perspective – why constraints are so important for your database in terms of both data correctness and performance.

Firstly, here is an AskTOM magazine article that demonstrates the benefits of declarative constraints both to the optimizer and to the availability of technologies such as materialized view rewrite.
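
To give a flavour of those optimizer benefits, here is a minimal sketch of my own (not from the article) showing join elimination – with a trusted foreign key in place, the optimizer knows every CHD_T row has exactly one PAR_T match, so it can skip the join entirely when no PAR_T columns are referenced:

-- Sketch: a declared foreign key (plus NOT NULL) enables join elimination
create table par_t ( p int primary key );
create table chd_t ( c int, p int not null references par_t );

select c.c                -- no PAR_T columns in the select list
from   chd_t c, par_t p
where  c.p = p.p;         -- with the constraints above, the join to PAR_T
                          -- can be eliminated from the execution plan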

Also, to respond to the particular topic in the blog post about handling data loads, here is a video from the Real World Performance group about how you can validate massive sets of data for referential integrity, and then still apply the constraints efficiently to give the optimizer all those benefits for subsequent query processing.

But if all of that seems too complicated to read and digest, perhaps an easier way of understanding the importance of constraints is to watch my simple video about Australian wildlife.

Duplicate constraints are impossible, right?

Here’s a very simple example of a table called PARENT being a (surprise, surprise) parent in a referential integrity relationship to a (drum roll for my choice of name) CHILD table:


SQL> create table parent ( p int, constraint PAR_PK primary key (p) );

Table created.

SQL> create table child ( c int,
  2        p int
  3      );

Table created.

SQL> alter table child add constraint fk1 foreign key ( p ) references parent ( p );

Table altered.

That is all as we would expect, and similarly, if I inadvertently try to add the same foreign key constraint again, I’ll get an error:


SQL> alter table child add constraint fk2 foreign key ( p ) references parent ( p );
alter table child add constraint fk2 foreign key ( p ) references parent ( p )
                                     *
ERROR at line 1:
ORA-02275: such a referential constraint already exists in the table

So you might be wondering, how on earth did I get myself into the following calamity on my database:


SQL> select c.constraint_name, cc.column_name, c.r_constraint_name
  2  from   user_constraints c,
  3         user_cons_columns cc
  4  where  c.table_name = 'CHILD'
  5  and    c.constraint_type = 'R'
  6  and    c.constraint_name = cc.constraint_name;

CONSTRAINT_NAME                COLUMN_NAME                    R_CONSTRAINT_NAME
------------------------------ ------------------------------ --------------------
FK1                            P                              PAR_PK
FK2                            P                              PAR_PK

Yes – that’s TWO foreign key constraints implementing the identical check.  How did I bypass the duplicate check we saw above?

It’s just a little idiosyncrasy that existed in every version of Oracle from the inception of foreign key constraints up to 11g: if you specify the constraints as part of the table creation DDL, the check for duplicates is skipped.


SQL> create table child ( c int,
  2    p int,
  3    constraint fk1 foreign key ( p ) references parent ( p ) ,
  4    constraint fk2 foreign key ( p ) references parent ( p )
  5  );

Table created.

It is worth looking for and rectifying this condition on your databases, because it can lead to confusing errors. For example, when you Data Pump such a schema, the import will (correctly) fail on the second constraint, but you’ll be scratching your head when you go look at the imported result and see that the constraint is there!
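
If you want to hunt for existing duplicates, a query along these lines is one way to do it (a sketch – it groups foreign keys by table, referenced key and column list, and reports any combination that appears more than once):

select table_name, r_constraint_name, cols, count(*) as dups
from (
  select c.constraint_name, c.table_name, c.r_constraint_name,
         listagg(cc.column_name,',') within group (order by cc.position) cols
  from   user_constraints c,
         user_cons_columns cc
  where  c.constraint_type = 'R'
  and    c.constraint_name = cc.constraint_name
  group by c.constraint_name, c.table_name, c.r_constraint_name
)
group by table_name, r_constraint_name, cols
having count(*) > 1;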

This is fixed from 12c onwards:


SQL> create table child ( c int,
  2    p int,
  3    constraint fk1 foreign key ( p ) references parent ( p ) ,
  4    constraint fk2 foreign key ( p ) references parent ( p )
  5  );
  constraint fk2 foreign key ( p ) references parent ( p )
                 *
ERROR at line 4:
ORA-02274: duplicate referential constraint specifications

Tightening up your data model

Having NOT NULL constraints declared on columns that are genuinely not nullable is not only good practice, but can also yield performance gains. Here’s a little routine that lists those columns that probably need a NOT NULL constraint. It looks at all columns and:

  • checks to see if a constraint with definition “COLUMN_NAME” IS NOT NULL is present
  • counts the null values

If the latter is zero and no such constraint was found, we print out the ‘alter table …’ DDL for those columns that may need the constraint defined. It works across a nominated schema and/or a single table.


SQL> create or replace
  2  procedure check_uncons_columns_for_null(
  3                   p_schema varchar2,
  4                   p_table_name varchar2 default null) is
  5    cursor c_list is
  6     select t.table_name, t.column_name,
  7            max(t.column_name) over (partition by t.table_name  ) as lastcol
  8     from (
  9       select a.table_name, a.column_name
 10       from dba_tab_columns a
 11       where a.owner = p_schema
 12       and a.table_name = nvl(upper(p_table_name),a.table_name)
 13       ) t,
 14       (
 15       select a.table_name, b.column_name, a.search_condition
 16       from dba_cons_columns b, dba_constraints a
 17       where a.owner = p_schema
 18       and   a.constraint_type = 'C'
 19       and   a.table_name = nvl(upper(p_table_name),a.table_name)
 20       and   a.table_name = b.table_name
 21       and   a.owner = b.owner
 22       and   a.constraint_name = b.constraint_name
 23       ) c
 24    where t.table_name = c.table_name(+)
 25    and   t.column_name = c.column_name(+)
 26    order by 1,2;
 27
 28    str0 varchar2(32767);
 29    str1 varchar2(32767);
 30    str2 varchar2(32767);
 31    str3 varchar2(32767);
 32
 33    search_cond varchar2(32767);
 34
 35    prev varchar2(100) := '*';
 36    cnt number;
 37    trailer varchar2(5) := ','||chr(10);
 38
 39  procedure do_sql(thesql varchar2) is
 40    tcursor integer;
 41    dummy integer;
 42  begin
 43    tcursor := dbms_sql.open_cursor;
 44    dbms_sql.parse(tcursor,thesql,2);
 45    dummy := dbms_sql.execute(tcursor);
 46    dbms_sql.close_cursor(tcursor);
 47  end;
 48
 49  begin
 50  for i in c_list loop
 51
 52    if prev != i.table_name then
 53      str0 := 'declare ';
 54      str1 := 'begin select ';
 55      str2 := ' into ';
 56      str3 := ' '; cnt := 1;
 57    end if;
 58
 59    --
 60    -- approximation only
 61    --
 62    if search_cond is null or search_cond != '"'||i.column_name||'" IS NOT NULL' then
 63      str0 := str0 || 'v'||ltrim(cnt)||' number;';
 64      str1 := str1 || 'sum(decode('||i.column_name||',null,1,0))'||trailer;
 65      str2 := str2 || 'v'||ltrim(cnt)||trailer;
 66      str3 := str3 || 'if v'||ltrim(cnt)||' = 0 then '||
 67        'dbms_output.put_line(''alter table '||p_schema||'.'||i.table_name||
 68        ' modify ('||i.column_name||' not null);''); end if;'||chr(10);
 69    end if;
 70    if i.column_name = i.lastcol then
 71      str1 := rtrim(str1,trailer);
 72      str2 := rtrim(str2,trailer) ||' from '||p_schema||'.'||i.table_name||';';
 73      str3 := rtrim(str3,trailer) ||' end;';
 74      do_sql(str0||' '||str1||' '||str2||' '||str3);
 75    end if;
 76    prev := i.table_name;
 77    cnt := cnt + 1;
 78
 79
 80  end loop;
 81  end;
 82  /

Procedure created.

SQL> set serverout on
SQL> exec check_uncons_columns_for_null(user)
alter table SCOTT.CHD modify (A not null);
alter table SCOTT.CHD modify (DATA not null);
alter table SCOTT.CHD modify (F not null);
alter table SCOTT.CHD modify (H not null);
alter table SCOTT.CHD modify (W1 not null);
alter table SCOTT.CHD modify (W2 not null);
alter table SCOTT.PAR modify (A not null);
alter table SCOTT.PAR modify (DATA not null);
alter table SCOTT.PAR modify (F not null);
alter table SCOTT.PAR modify (W not null);

PL/SQL procedure successfully completed.

SQL>
SQL>

A justification for referential integrity constraints

Yes, I know what you’re thinking.

“Here we go again.  Another database dude about to get stuck into us telling us about the importance of foreign keys for data integrity”

and yes, I could easily do that.  But you’ve seen that before (and by the way, it’s of course still totally valid – you should have all those integrity constraints defined!)

But here’s another reason, albeit a little tongue in cheek.

First I create a simple user with a couple of tables.


SQL> drop user demo1 cascade;

User dropped.

SQL>
SQL> grant create session, create table to demo1 identified by demo1;

Grant succeeded.

SQL>
SQL> alter user demo1 quota unlimited on users;

User altered.

SQL>
SQL> create table demo1.t1 ( x int, y int, constraint pk primary key ( x ) ) ;

Table created.

SQL>
SQL> create table demo1.t2 ( x int, y int) ;

Table created.

SQL>
SQL> alter table demo1.t2 add constraint pk2 primary key ( x );

Table altered.

SQL>
SQL> insert into demo1.t1 values (1,1);

1 row created.

SQL> insert into demo1.t1 values (2,2);

1 row created.

SQL> commit;

Commit complete.


And now, I take a datapump export of that schema


expdp compression=all dumpfile=DATA_PUMP_DIR:demo1.dmp logfile=DATA_PUMP_DIR:demo1.log schemas=demo1

Export: Release 12.1.0.2.0 - Production on Mon Nov 23 08:51:44 2015

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Username: / as sysdba

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Starting "SYS"."SYS_EXPORT_SCHEMA_01":  sys/******** AS SYSDBA compression=all dumpfile=DATA_PUMP_DIR:demo1.dmp logfile=DATA_PUMP_DIR:demo1.
log schemas=demo1
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 64 KB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/TABLESPACE_QUOTA
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/STATISTICS/MARKER
. . exported "DEMO1"."T1"                                4.804 KB       2 rows
. . exported "DEMO1"."T2"                                    0 KB       0 rows
Master table "SYS"."SYS_EXPORT_SCHEMA_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_SCHEMA_01 is:
  C:\ORACLE\ADMIN\NP12\DPDUMP\DEMO1.DMP
Job "SYS"."SYS_EXPORT_SCHEMA_01" successfully completed at Mon Nov 23 08:52:10 2015 elapsed 0 00:00:23

Now I want to clone that schema to another, but in order to do it as efficiently as possible, I’m going to exclude indexes and constraints (they will be added later).


impdp directory=DATA_PUMP_DIR exclude=index exclude=constraint exclude=ref_constraint exclude=index_statistics exclude=table_statistics content=all dumpfile=demo1.dmp logfile=demo1.implog remap_schema=demo1:demo2 
Import: Release 12.1.0.2.0 - Production on Mon Nov 23 09:05:36 2015

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Username: /as sysdba

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
ORA-39002: invalid operation
ORA-39168: Object path REF_CONSTRAINT was not found.

Well…this is a disaster!  Because I had no referential integrity constraints, my import failed to find any…and hence could not exclude them!  This is actually documented as expected behaviour (“Datapump Import Fails With ORA-39168”, Doc ID 430702.1) and is done as a safety mechanism – if the Data Pump file were corrupt, we wouldn’t want to get halfway through importing it only to find bits and pieces missing, so the checks are done up front and the import abandoned.
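
(If you are stuck with a schema like this one, the workaround is straightforward – just drop the redundant exclusion from the command line, along the lines of the sketch below, rather than abandoning the import.)

impdp directory=DATA_PUMP_DIR exclude=index exclude=constraint exclude=index_statistics exclude=table_statistics content=all dumpfile=demo1.dmp logfile=demo1.implog remap_schema=demo1:demo2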

So the moral of the story is … make sure you’ve got referential integrity constraints!