NULL’s vs NOT NULL’s and Performance

When it comes to giving the cost based optimiser the best possible chance to make the “right” decisions, many DBA’s are diligent in keeping statistics up to date, using histograms where appropriate, creating more indexes (or removing surplus indexes).

However one often neglected area is that the the null-ness of columns also impacts the optimiser decisions. NULL and NOT NULL do more than just act as constraints, they also add (or detract) to the value of indexes on those columns. Here’s an example of how the null-ness of a column impacts optimizer decisions. I have a table T which is a copy of DBA_OBJECTS, indexed on OBJECT_ID.


SQL> create table t as
  2  select * from dba_objects
  3  where object_id is not null;

Table created.

SQL>
SQL> create index IX on T ( object_id );

Index created.

Now I want to perform a standard pagination style query, namely, get the first 5 rows in order of OBJECT_ID


SQL>
SQL> set autotrace traceonly
SQL> select *
  2  from
  3    ( select * from t
  4      order by object_id
  5    )
  6  where rownum <= 5;

5 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3299198703

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     5 |  2405 |       |  2755   (1)| 00:00:01 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 78750 |    36M|       |  2755   (1)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY|      | 78750 |     9M|    14M|  2755   (1)| 00:00:01 |
|   4 |     TABLE ACCESS FULL   | T    | 78750 |     9M|       |   428   (1)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)
   3 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
         52  recursive calls
        130  db block gets
       1591  consistent gets
          0  physical reads
      25420  redo size
       2735  bytes sent via SQL*Net to client
        607  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

I scanned the entire table and burned around 1500 consistent gets. Now I’ll let the database know what I already know, that is, that the OBJECT_ID column is contains no nulls. Then I’ll repeat the same query.


SQL>
SQL> alter table T modify object_id not null;

Table altered.

SQL>
SQL> select *
  2  from
  3    ( select * from t
  4      order by object_id
  5    )
  6  where rownum <= 5;

5 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3114946973

--------------------------------------------------------------------------------------
| Id  | Operation                     | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |      |     5 |  2405 |     3   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |      |       |       |            |          |
|   2 |   VIEW                        |      |     5 |  2405 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| T    | 78750 |     9M|     3   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN           | IX   |     5 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
         85  recursive calls
        132  db block gets
         72  consistent gets
          1  physical reads
      27192  redo size
       2735  bytes sent via SQL*Net to client
        607  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          5  rows processed

Woo hoo! 20 times more efficient. The extra information we provided to the database allowed for more execution paths to be considered by the optimizer.

That’s all well and good. By how do we know which columns might be missing an appropriate NOT NULL constraint ?

Well, the following routine might help Smile

The PL/SQL procedure below deduces a ‘number of rows that are null’ count for all columns that are indexed for all tables in the schema (passed as parameter P_SCHEMA), although you can pass a table name to restrict the procedure to just that table. For any column that contains no nulls, the appropriate ‘alter table … ( column_name not null)’ command will be outputted. (Make sure you set serveroutput on).

Notes:

  • The client for whom I wrote this script, had no NOT NULL constraints on any table so the procedure only takes a rough stab at ALL_CONSTRAINTS for existing constraints.
  • It tries to keep things sensible – avoiding complex data types, tables that are auto-generated or used for queues etc, but there’s no guarantee it won’t either miss a table, or pick up it should not.
  • This is a brute force approach – it works by scanning every table in the schema, so use your discretion as to when would be a suitable time for running this routine.  But it will only scan each table once to determine the null count for all candidate columns.
  • (As with any diagnosis script), you should not apply it’s recommendations without some careful thought first.

12.2 version


create or replace 
procedure check_indexed_columns_for_null(
                 p_schema varchar2, 
                 p_table_name varchar2 default null) is
  cursor x is
   select 
     table_name,
     column_name,
     count(*) over ( partition by table_name ) as colcount
     from
     (
       select 
         table_name,
         column_name,
         min(existing_constraint)
       from 
       (
       select  
           a.table_name, 
           a.column_name, 
           ( select  count(*)
             from    all_constraints x,
                     all_cons_columns cc
             where   x.owner = c.owner
             and     x.table_name = c.table_name
             and     cc.owner      = x.owner
             and     cc.constraint_name = x.constraint_name
             and     
            (
               ( x.constraint_type = 'C'  and replace(search_condition_vc,'"') = a.column_name||' IS NOT NULL' )
                   or 
               ( x.constraint_type = 'P'  and cc.column_name = a.column_name )
             )
            ) existing_constraint
       from 
         all_ind_columns a,
         all_tables c,
         all_tab_columns ccol
       where a.index_owner = p_schema
       and a.index_owner = p_schema
       and a.table_name = nvl(upper(p_table_name),a.table_name)
       and c.table_name = a.table_name
       and c.owner      = a.table_owner
       and c.owner      = ccol.owner
       and c.table_name = ccol.table_name
       and a.column_name = ccol.column_name
       and c.secondary = 'N'
       and c.temporary = 'N'
       and c.nested    = 'NO'
       and c.external  = 'NO'
       and ccol.data_type_owner is null
       and ccol.data_type not in ('LONG','LONG RAW','CLOB','UROWID','UNDEFINED','NCLOB','BLOB','BFILE','ROWID')
       and (c.owner,c.table_name) not in ( select owner, queue_table from all_queue_tables where owner = p_schema )
       )
       group by 
         table_name,
         column_name
       having min(existing_constraint) = 0
     );

  str0 varchar2(32767); 
  str1 varchar2(32767); 
  str2 varchar2(32767); 
  str3 varchar2(32767);

  prev varchar2(100) := '*';
  cnt number;
  trailer varchar2(5);

procedure do_sql(thesql varchar2) is
  tcursor integer;
  dummy integer;
begin
  -- dbms_output.put_line(thesql);
  execute immediate thesql;
end;

begin
for i in x loop
  if prev != i.table_name then
    str0 := 'declare ';
    str1 := 'begin select '; str2 := ' into ';
    str3 := ' '; cnt := 1;
  end if;
  if cnt = i.colcount then 
    trailer := ' ';
  else
    trailer := ','||chr(10);
  end if;
  str0 := str0 || 'v'||ltrim(cnt)||' number;';
  str1 := str1 || 'sum(decode('||i.column_name||',null,1,0))'||trailer;
  str2 := str2 || 'v'||ltrim(cnt)||trailer;
  str3 := str3 || 'if v'||ltrim(cnt)||' = 0 then '||
    'dbms_output.put_line(''alter table '||p_schema||'.'||i.table_name||
    ' modify ('||i.column_name||' not null);''); end if;'||chr(10);
  if cnt = i.colcount then
    str2 := str2 ||' from '||p_schema||'.'||i.table_name||';';
    str3 := str3 ||' end;';
    do_sql(str0||' '||str1||' '||str2||' '||str3);
  end if;
  prev := i.table_name;
  cnt := cnt + 1;
end loop;
end;
/
sho err

set serverout on
exec check_indexed_columns_for_null('SCOTT')

11.2 version

(Needs a direct CREATE TABLE grant to the owner of the procedure)


create or replace
procedure check_indexed_columns_for_null(
                 p_schema varchar2,
                 p_table_name varchar2 default null) is
  cursor c_template is
    select table_name, column_name, 0 colcount
    from   all_tab_columns;

  type rowlist is table of c_template%rowtype;  
  r    rowlist;
  rc   sys_refcursor;

  str0 varchar2(32767);
  str1 varchar2(32767);
  str2 varchar2(32767);
  str3 varchar2(32767);

  prev varchar2(100) := '*';
  cnt number;
  trailer varchar2(5);

procedure do_sql(thesql varchar2) is
  tcursor integer;
  dummy integer;
begin
  dbms_output.put_line(thesql);
  execute immediate thesql;
end;

begin

  begin
    execute immediate 'drop table tmp$cons purge';
  exception when others then null;
  end;

  execute immediate 'create table tmp$cons as  select owner, table_name, constraint_name, constraint_type, to_lob(search_condition) search_condition_vc '||
                    'from all_constraints';

  open rc for
    q'{select
      table_name,
      column_name,
      count(*) over ( partition by table_name ) as colcount
      from
      (
        select
          table_name,
          column_name,
          min(existing_constraint)
        from
        (
        select
            a.table_name,
            a.column_name,
            ( select  count(*)
              from    ( select owner, table_name, constraint_name, constraint_type, cast(search_condition_vc as varchar2(4000)) search_condition_vc 
                        from tmp$cons ) x,
                      all_cons_columns cc
              where   x.owner = c.owner
              and     x.table_name = c.table_name
              and     cc.owner      = x.owner
              and     cc.constraint_name = x.constraint_name
              and
             (
                ( x.constraint_type = 'C'  and replace(search_condition_vc,'"') = a.column_name||' IS NOT NULL' )
                    or
                ( x.constraint_type = 'P'  and cc.column_name = a.column_name )
              )
             ) existing_constraint
        from
          all_ind_columns a,
          all_tables c,
          all_tab_columns ccol
        where a.index_owner = :p_schema
        and a.index_owner = :p_schema
        and a.table_name = nvl(upper(:p_table_name),a.table_name)
        and c.table_name = a.table_name
        and c.owner      = a.table_owner
        and c.owner      = ccol.owner
        and c.table_name = ccol.table_name
        and a.column_name = ccol.column_name
        and c.secondary = 'N'
        and c.temporary = 'N'
        and c.nested    = 'NO'
        and (c.owner,c.table_name) not in ( select owner, table_name from all_external_tables where owner = :p_schema )
        and ccol.data_type_owner is null
        and ccol.data_type not in ('LONG','LONG RAW','CLOB','UROWID','UNDEFINED','NCLOB','BLOB','BFILE','ROWID')
        and (c.owner,c.table_name) not in ( select owner, queue_table from all_queue_tables where owner = :p_schema )
        )
        group by
          table_name,
          column_name
        having min(existing_constraint) = 0
      )
      }' using p_schema,p_schema,p_table_name,p_schema,p_schema;

  fetch rc bulk collect into r;
  close rc;
  
  for i in 1 .. r.count loop
    if prev != r(i).table_name then
      str0 := 'declare ';
      str1 := 'begin select '; str2 := ' into ';
      str3 := ' '; cnt := 1;
    end if;
    if cnt = r(i).colcount then
      trailer := ' ';
    else
      trailer := ','||chr(10);
    end if;
    str0 := str0 || 'v'||ltrim(cnt)||' number;';
    str1 := str1 || 'sum(decode('||r(i).column_name||',null,1,0))'||trailer;
    str2 := str2 || 'v'||ltrim(cnt)||trailer;
    str3 := str3 || 'if v'||ltrim(cnt)||' = 0 then '||
      'dbms_output.put_line(''alter table '||p_schema||'.'||r(i).table_name||
      ' modify ('||r(i).column_name||' not null);''); end if;'||chr(10);
    if cnt = r(i).colcount then
      str2 := str2 ||' from '||p_schema||'.'||r(i).table_name||';';
      str3 := str3 ||' end;';
      do_sql(str0||' '||str1||' '||str2||' '||str3);
    end if;
    prev := r(i).table_name;
    cnt := cnt + 1;
  end loop;
end;
/

 

If you want to see the underlying table scan queries that are being run, simply comment back in the “dbms_output.put_line” in the DO_SQL subroutine.

Enjoy!

Dealing with IP addresses

Kris Rice made mention on Twitter about the free (but limited lifetime) IP address to Country mappings that you can get from Maxmind.

image

If you ever need to deal with IP addresses, you might find the following utility package I wrote a while back useful.  It lets you convert from the string representation of an IP address to its numeric equivalent, and vice-versa.  It handles IPv4 and IPv6 with one caveat being that that I didn’t bother with the collapsed zeros for IPv6 so I could keep the performance snappy.  Free for your use without warranty or responsibility Smile


SQL> create or replace
  2  package ip_util is
  3
  4  function ip_num_from_str(p_ip_str varchar2) return number deterministic;
  5  function ip_str_from_num(p_ipnum number) return varchar2 deterministic;
  6
  7  end;
  8  /

Package created.

SQL>
SQL>
SQL> create or replace
  2  package body ip_util is
  3
  4    --
  5    -- constants need to be fixed, not expressions if you want to avoid ora-4068
  6    --
  7    l_ip41 constant number(12)  := 256;        -- power(256,1);
  8    l_ip42 constant number(12)  := 65536;      -- power(256,2);
  9    l_ip43 constant number(12)  := 16777216;   -- power(256,3);
 10    l_ip44 constant number(12)  := 4294967296; -- power(256,4);
 11
 12    l_ip61 constant number(38)  := 65536;                              --power(65536,1);
 13    l_ip62 constant number(38)  := 4294967296;                         --power(65536,2);
 14    l_ip63 constant number(38)  := 281474976710656;                    --power(65536,3);
 15    l_ip64 constant number(38)  := 18446744073709551616;               --power(65536,4);
 16    l_ip65 constant number(38)  := 1208925819614629174706176;          --power(65536,5);
 17    l_ip66 constant number(38)  := 79228162514264337593543950336;      --power(65536,6);
 18    l_ip67 constant number(38)  := 5192296858534827628530496329220096; --power(65536,7);
 19
 20
 21  function ip_num_from_str(p_ip_str varchar2) return number deterministic is
 22    l_ip_num     number;
 23    l_dot1       pls_integer;
 24    l_dot2       pls_integer;
 25    l_dot3       pls_integer;
 26    l_dot4       pls_integer;
 27
 28    l_colon      pls_integer;
 29    l_colon_cnt  pls_integer;
 30    l_hex        varchar2(32);
 31    l_ip_str     varchar2(64);
 32  begin
 33    if p_ip_str like '%.%' then
 34      l_dot1   := instr(p_ip_str,'.');
 35      l_dot2   := instr(p_ip_str,'.',l_dot1+1);
 36      l_dot3   := instr(p_ip_str,'.',l_dot2+1);
 37      l_dot4   := instr(p_ip_str,'.',l_dot3+1);
 38      if l_dot4 > 0 then
 39         raise_application_error(-20000,'Cannot be resolved to an IP4 address');
 40      end if;
 41
 42      l_ip_num :=  l_ip43*to_number(substr(p_ip_str,1,l_dot1-1)) +
 43                   l_ip42*to_number(substr(p_ip_str,l_dot1+1,l_dot2-l_dot1-1)) +
 44                   l_ip41*to_number(substr(p_ip_str,l_dot2+1,l_dot3-l_dot2-1)) +
 45                   to_number(substr(p_ip_str,l_dot3+1));
 46
 47    elsif p_ip_str like '%:%' then
 48      --
 49      -- Note: The abbreviation of "Consecutive sections of zeroes are replaced with a double colon (::)" is not implemented.
 50      --
 51      l_colon_cnt := length(p_ip_str)-length(replace(p_ip_str,':'));
 52      if l_colon_cnt != 7 then
 53         raise_application_error(-20000,'Cannot be resolved to an IP6 address');
 54      end if;
 55
 56      l_ip_str := p_ip_str||':';
 57      loop
 58        l_colon := instr(l_ip_str,':');
 59        l_hex := l_hex || lpad(substr(l_ip_str,1,l_colon-1),4,'0');
 60        l_ip_str := substr(l_ip_str,l_colon+1);
 61        exit when l_ip_str is null;
 62      end loop;
 63      l_ip_num := to_number(l_hex,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx');
 64    end if;
 65
 66    return l_ip_num;
 67  end;
 68
 69
 70  function ip_str_from_num(p_ipnum number) return varchar2 deterministic is
 71  begin
 72    if p_ipnum < l_ip44 then
 73      return  mod(trunc(p_ipnum/l_ip43),l_ip41) ||'.'||
 74              mod(trunc(p_ipnum/l_ip42),l_ip41) ||'.'||
 75              mod(trunc(p_ipnum/l_ip41),l_ip41) ||'.'||
 76              mod(p_ipnum,l_ip41);
 77    else
 78      --
 79      -- Note: The abbreviation of "Consecutive sections of zeroes are replaced with a double colon (::)" is not implemented.
 80      --
 81      return  to_char(mod(trunc(p_ipnum/l_ip67),l_ip61),'fmxxxx') ||':'||
 82              to_char(mod(trunc(p_ipnum/l_ip66),l_ip61),'fmxxxx') ||':'||
 83              to_char(mod(trunc(p_ipnum/l_ip65),l_ip61),'fmxxxx') ||':'||
 84              to_char(mod(trunc(p_ipnum/l_ip64),l_ip61),'fmxxxx') ||':'||
 85              to_char(mod(trunc(p_ipnum/l_ip63),l_ip61),'fmxxxx') ||':'||
 86              to_char(mod(trunc(p_ipnum/l_ip62),l_ip61),'fmxxxx') ||':'||
 87              to_char(mod(trunc(p_ipnum/l_ip61),l_ip61),'fmxxxx') ||':'||
 88              to_char(mod(p_ipnum,l_ip61),'fmxxxx');
 89    end if;
 90  end;
 91
 92  end;
 93  /

Package body created.

SQL> select ip_util.ip_num_from_str('192.168.1.2') from dual;

IP_UTIL.IP_NUM_FROM_STR('192.168.1.2')
--------------------------------------
                            3232235778

SQL> select ip_util.ip_str_from_num(3232235778) from dual;

IP_UTIL.IP_STR_FROM_NUM(3232235778)
-----------------------------------------------------------------------------------------------------------------
192.168.1.2

SQL> select ip_util.ip_num_from_str('2001:db8:0:0:0:ff00:42:8329') ip from dual;

                                          IP
--------------------------------------------
      42540766411282592856904265327123268393

SQL> select ip_util.ip_str_from_num(42540766411282592856904265327123268393) from dual;

IP_UTIL.IP_STR_FROM_NUM(42540766411282592856904265327123268393)
-----------------------------------------------------------------------------------------------------------------
2001:db8:0:0:0:ff00:42:8329


“Oh…another language is too hard”

We had a request on AskTOM a few days ago asking for an implementation of the XIRR function in PL/SQL.

I didn’t really know much about what that function was, or what it did, but a quick web search yielded plenty of examples in Excel, where it is a pre-delivered function, as described here:

“Returns the internal rate of return for a schedule of cash flows that is not necessarily periodic.”

That explains what the function does, but doesn’t explain how the function result is derived.  So I pushed back to the author of the question saying that if they could give us an algorithm for the function, then we could take a look at implementing the algorithm in PL/SQL.  After some research they came back with the following diagram taken from: http://maestro.ipcc.ca/files/rate_of_return/XIRR_ROR.pdf

image

The formula with some cool mathematical symbols on the page looks like an algorithm but it is not – the real nuts and bolts is in bullet point “3”.  There is no “formula” as such for XIRR – we need an iterative approach, namely, try an initial set of conditions that yields an answer “close” to the desired result, but if not, iterate continuously using previous results to slowly “narrow in” on the result until we meet some nominated tolerance value.  This was starting to sound complicated Smile

Luckily, our author also posted a link to a solution that has been written in C##.  So I posed myself the question – “How hard is it to port from C##…” (a language I have never used professionally as a developer) “… run in the database as PL/SQL?”.  After all, often we hear that database-centric code cannot be tackled in an organization because the task of understanding PL/SQL as a non-PL/SQL developer is simply “too onerous”.

Here is the original C## code.


using System;
using System.Collections.Generic;
using System.Linq;


namespace Xirr
{
    public class Program
    {
        private const Double DaysPerYear = 365.0;
        private const int MaxIterations = 100;
        private const double DefaultTolerance = 1E-6;
        private const double DefaultGuess = 0.1;

    private static readonly Func, Double> NewthonsMethod =
        cf => NewtonsMethodImplementation(cf, Xnpv, XnpvPrime);

    private static readonly Func, Double> BisectionMethod =
        cf => BisectionMethodImplementation(cf, Xnpv);

    public static void Main(string[] args)
    {
        RunScenario(new[]
            {
                // this scenario fails with Newton's but succeeds with slower Bisection
                new CashItem(new DateTime(2012, 6, 1), 0.01),
                new CashItem(new DateTime(2012, 7, 23), 3042626.18),
                new CashItem(new DateTime(2012, 11, 7), -491356.62),
                new CashItem(new DateTime(2012, 11, 30), 631579.92),
                new CashItem(new DateTime(2012, 12, 1), 19769.5),
                new CashItem(new DateTime(2013, 1, 16), 1551771.47),
                new CashItem(new DateTime(2013, 2, 8), -304595),
                new CashItem(new DateTime(2013, 3, 26), 3880609.64),
                new CashItem(new DateTime(2013, 3, 31), -4331949.61)
            });
        RunScenario(new[]
            {
                new CashItem(new DateTime(2001, 5, 1), 10000),
                new CashItem(new DateTime(2002, 3, 1), 2000),
                new CashItem(new DateTime(2002, 5, 1), -5500),
                new CashItem(new DateTime(2002, 9, 1), 3000),
                new CashItem(new DateTime(2003, 2, 1), 3500),
                new CashItem(new DateTime(2003, 5, 1), -15000)
            });
    }

    private static void RunScenario(IEnumerable cashFlow)
    {
        try
        {
            try
            {
                var result = CalcXirr(cashFlow, NewthonsMethod);
                Console.WriteLine("XIRR [Newton's] value is {0}", result);
            }
            catch (InvalidOperationException)
            {
                // Failed: try another algorithm
                var result = CalcXirr(cashFlow, BisectionMethod);
                Console.WriteLine("XIRR [Bisection] (Newton's failed) value is {0}", result);
            }
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message);
        }
        catch (InvalidOperationException exception)
        {
            Console.WriteLine(exception.Message);
        }
    }

    private static double CalcXirr(IEnumerable cashFlow, Func, double> method)
    {
        if (cashFlow.Count(cf => cf.Amount > 0) == 0)
            throw new ArgumentException("Add at least one positive item");

        if (cashFlow.Count(c => c.Amount < 0) == 0)
            throw new ArgumentException("Add at least one negative item");

        var result = method(cashFlow);

        if (Double.IsInfinity(result))
            throw new InvalidOperationException("Could not calculate: Infinity");

        if (Double.IsNaN(result))
            throw new InvalidOperationException("Could not calculate: Not a number");

        return result;
    }

    private static Double NewtonsMethodImplementation(IEnumerable cashFlow,
                                                      Func, Double, Double> f,
                                                      Func, Double, Double> df,
                                                      Double guess = DefaultGuess,
                                                      Double tolerance = DefaultTolerance,
                                                      int maxIterations = MaxIterations)
    {
        var x0 = guess;
        var i = 0;
        Double error;
        do
        {
            var dfx0 = df(cashFlow, x0);
            if (Math.Abs(dfx0 - 0) < Double.Epsilon)
                throw new InvalidOperationException("Could not calculate: No solution found. df(x) = 0");

            var fx0 = f(cashFlow, x0);
            var x1 = x0 - fx0/dfx0;
            error = Math.Abs(x1 - x0);

            x0 = x1;
        } while (error > tolerance && ++i < maxIterations);
        if (i == maxIterations)
            throw new InvalidOperationException("Could not calculate: No solution found. Max iterations reached.");

        return x0;
    }

    internal static Double BisectionMethodImplementation(IEnumerable cashFlow,
                                                         Func, Double, Double> f,
                                                         Double tolerance = DefaultTolerance,
                                                         int maxIterations = MaxIterations)
    {
        // From "Applied Numerical Analysis" by Gerald
        var brackets = Brackets.Find(Xnpv, cashFlow);
        if (Math.Abs(brackets.First - brackets.Second) < Double.Epsilon)
            throw new ArgumentException("Could not calculate: bracket failed");

        Double f3;
        Double result;
        var x1 = brackets.First;
        var x2 = brackets.Second;

        var i = 0;
        do
        {
            var f1 = f(cashFlow, x1);
            var f2 = f(cashFlow, x2);

            if (Math.Abs(f1) < Double.Epsilon && Math.Abs(f2) < Double.Epsilon)
                throw new InvalidOperationException("Could not calculate: No solution found");

            if (f1*f2 > 0)
                throw new ArgumentException("Could not calculate: bracket failed for x1, x2");

            result = (x1 + x2)/2;
            f3 = f(cashFlow, result);

            if (f3*f1 < 0)
                x2 = result;
            else
                x1 = result;
        } while (Math.Abs(x1 - x2)/2 > tolerance && Math.Abs(f3) > Double.Epsilon && ++i < maxIterations);

        if (i == maxIterations)
            throw new InvalidOperationException("Could not calculate: No solution found");

        return result;
    }

    private static Double Xnpv(IEnumerable cashFlow, Double rate)
    {
        if (rate <= -1)
            rate = -1 + 1E-10; // Very funky ... Better check what an IRR <= -100% means

        var startDate = cashFlow.OrderBy(i => i.Date).First().Date;
        return
            (from item in cashFlow
             let days = -(item.Date - startDate).Days
             select item.Amount*Math.Pow(1 + rate, days/DaysPerYear)).Sum();
    }

    private static Double XnpvPrime(IEnumerable cashFlow, Double rate)
    {
        var startDate = cashFlow.OrderBy(i => i.Date).First().Date;
        return (from item in cashFlow
                let daysRatio = -(item.Date - startDate).Days/DaysPerYear
                select item.Amount*daysRatio*Math.Pow(1.0 + rate, daysRatio - 1)).Sum();
    }

    public struct Brackets
    {
        public readonly Double First;
        public readonly Double Second;

        public Brackets(Double first, Double second)
        {
            First = first;
            Second = second;
        }

        internal static Brackets Find(Func, Double, Double> f,
                                      IEnumerable cashFlow,
                                      Double guess = DefaultGuess,
                                      int maxIterations = MaxIterations)
        {
            const Double bracketStep = 0.5;
            var leftBracket = guess - bracketStep;
            var rightBracket = guess + bracketStep;
            var i = 0;
            while (f(cashFlow, leftBracket)*f(cashFlow, rightBracket) > 0 && i++ < maxIterations)
            {
                leftBracket -= bracketStep;
                rightBracket += bracketStep;
            }

            return i >= maxIterations
                       ? new Brackets(0, 0)
                       : new Brackets(leftBracket, rightBracket);
        }
    }

    public struct CashItem
    {
        public DateTime Date;
        public Double Amount;

        public CashItem(DateTime date, Double amount)
        {
            Date = date;
            Amount = amount;
        }
    }
}

And after …maybe…30 minutes, I got a PL/SQL version up and running.


SQL> set serverout on
SQL> declare
  2
  3    cDaysPerYear      number := 365.0;
  4    cMaxIterations    number := 100;
  5    cDefaultTolerance number := 1E-6;
  6    cDefaultGuess     number := 0.1;
  7    cDouble_Epsilon   number := 0.00000000001;
  8
  9    type cash_obj is record ( dte date, amount number );
 10    type cash_list is table of cash_obj index by pls_integer;
 11    cash_item cash_list;
 12
 13    type bracket_rec is record ( frst number, second number );
 14
 15    function Xnpv(cashFlow cash_list, prate number) return number is
 16      startdate date := cashFlow(1).dte;
 17      dys number;
 18      tot number := 0;
 19      rate number := prate;
 20    begin
 21        if (rate <= -1) then rate := -1 + 1E-10; end if;
 22
 23        for item in 1 .. cashFlow.count loop
 24             dys := -(cashFlow(item).Dte - startDate);
 25             tot := tot + cashFlow(item).Amount*Power(1 + rate, dys/cDaysPerYear);
 26        end loop;
 27        return tot;
 28    end;
 29
 30    function XnpvPrime(cashFlow cash_list, prate number) return number is
 31      startdate date := cashFlow(1).dte;
 32      daysRatio number;
 33      tot number := 0;
 34      rate number := prate;
 35    begin
 36        for item in 1 .. cashFlow.count loop
 37             daysRatio := -(cashFlow(item).Dte - startDate)/cDaysPerYear;
 38             tot := tot + cashFlow(item).Amount*daysRatio*Power(1 + rate, daysRatio - 1);
 39        end loop;
 40        return tot;
 41    end;
 42
 43    function BracketsFind(cashFlow cash_list,guess number := cDefaultGuess) return bracket_rec is
 44      x bracket_rec;
 45      bracketStep number := 0.5;
 46      leftBracket number := guess - bracketStep;
 47      rightBracket number := guess + bracketStep;
 48      i int := 0;
 49    begin
 50      while Xnpv(cashFlow, leftBracket)*Xnpv(cashFlow, rightBracket) > 0
 51      loop
 52          leftBracket := leftBracket - bracketStep;
 53          rightBracket:= rightBracket + bracketStep;
 54          i := i + 1;
 55      end loop;
 56
 57      if i >= cmaxIterations then
 58         x.frst := 0;
 59         x.second := 0;
 60         return x;
 61      else
 62         x.frst := leftBracket;
 63         x.second := rightBracket;
 64         return x;
 65      end if;
 66    end;
 67
 68    function BisectionMethodImplementation(cashFlow cash_list,tolerance number := cDefaultTolerance, maxIterations number := cMaxIterations) return number is
 69
 70       brackets bracket_rec;
 71       f3 number;
 72       result number;
 73       x1 number;
 74       x2 number;
 75       i int := 0;
 76       f1 number;
 77       f2 number;
 78    begin
 79       brackets := BracketsFind(cashFlow);
 80       if Abs(brackets.Frst - brackets.Second) < cDouble_Epsilon then
 81          raise_application_error(-20000,'Could not calculate: bracket failed');
 82       end if;
 83       x1 := brackets.Frst;
 84       x2 := brackets.Second;
 85
 86       loop
 87
 88          f1 := Xnpv(cashFlow, x1);
 89          f2 := Xnpv(cashFlow, x2);
 90
 91          if Abs(f1) < cDouble_Epsilon and Abs(f2) < cDouble_Epsilon then
 92              raise_application_error(-20000,'Could not calculate: No solution found');
 93          end if;
 94
 95          if (f1*f2 > 0) then
 96              raise_application_error(-20000,'Could not calculate: bracket failed for x1, x2');
 97          end if;
 98
 99          result := (x1 + x2)/2;
100          f3 := Xnpv(cashFlow, result);
101
102          if (f3*f1 < 0) then
103              x2 := result;
104          else
105              x1 := result;
106          end if;
107          i := i + 1;
108          exit when not
109            (Abs(x1 - x2)/2 > tolerance and Abs(f3) > cDouble_Epsilon and i < cmaxIterations);
110       end loop;
111
112       if (i = cmaxIterations) then
113          raise_application_error(-20000,'Could not calculate: No solution found');
114       end if;
115
116       return result;
117    end;
118
119    function NewtonsMethodImplementation(cashFlow cash_list,tolerance number := cDefaultTolerance, maxIterations number := cMaxIterations,guess number := cDefaultGuess) return number is
120        x0 number := guess;
121        i int := 0;
122        error number;
123        fx0 number;
124        x1 number;
125        dfx0 number;
126    begin
127        loop
128            dfx0 := XnpvPrime(cashFlow, x0);
129            if (Abs(dfx0 - 0) < cDouble_Epsilon) then
130                raise_application_error(-20000,'Could not calculate: No solution found. df(x) = 0');
131            end if;
132
133            fx0 := Xnpv(cashFlow, x0);
134            x1 := x0 - fx0/dfx0;
135            error := Abs(x1 - x0);
136
137            x0 := x1;
138          i := i + 1;
139          exit when not (error > tolerance and  i < cmaxIterations );
140        end loop;
141
142        if (i = maxIterations) then
143            raise_application_error(-20000,'Could not calculate: No solution found. Max iterations reached.');
144        end if;
145        return x0;
146      end;
147
148      function CalcXirr(cashFlow cash_list) return number is
149        ok boolean := false;
150      begin
151        for i in 1 .. cashFlow.count loop
152           ok := ok or cashFlow(i).amount > 0;
153        end loop;
154        if not ok then raise_application_error(-20000,'Add at least one positive item');   end if;
155
156        ok := false;
157        for i in 1 .. cashFlow.count loop
158           ok := ok or cashFlow(i).amount < 0;
159        end loop;
160        if not ok then raise_application_error(-20000,'Add at least one negative item');   end if;
161
162  --      return BisectionMethodImplementation(cashFlow);
163        return NewtonsMethodImplementation(cashFlow);
164
165      end;
166
167  begin
168
169    cash_item(1).dte := date '2008-01-01';
170    cash_item(2).dte := date '2008-03-01';
171    cash_item(3).dte := date '2008-10-30';
172    cash_item(4).dte := date '2009-02-15';
173    cash_item(5).dte := date '2009-04-01';
174
175    cash_item(1).Amount := -10000;
176    cash_item(2).Amount := 2750;
177    cash_item(3).Amount := 4250;
178    cash_item(4).Amount := 3250;
179    cash_item(5).Amount := 2750;
180
181    dbms_output.put_line(CalcXirr(cash_item));
182  end;
183  /
.3733625335188315103083924482521883278399

PL/SQL procedure successfully completed.

I validated the results against various data examples I found on the web.  The example above maps to the example in the Excel link previous mentioned.

 

image

Let me assure you, I am in no way claiming that this was some sort of incredible feat on my part.  It was simply taking each function in turn, making some assumptions about the variables and the coding constructs and iteratively converting the code over, and then testing it for validity within an anonymous block.  And similarly, the PL/SQL code is not a shining example of quality coding standards – quite the opposite.  It deliberately is trying to stay close to the naming standards and conventions that were in the C## source.  Besides a couple of web searches (for example, to examine what Double.Epilson represented) the process was relatively straightforward.

You may be rolling your eyes and thinking to yourself: “Here we go….he’s going to tell us that C# is junk and that we should refactor our C# applications into the database”, but that is definitely not the case.  My point is – if you have some data-intensive code in the application tier that will get benefits from locating that code closer to the database, you might surprise yourself by discovering how easy it is to port the logic from one coding language to another, in this case, PL/SQL.  And for your Oracle databases, there is no simply better data-centric language than PL/SQL.

Instrumentation … not just for debugging

Yeah I know.  You’re probably thinking “Here’s another blog post from someone telling us how important it is to instrument our code, so we can get better debugging, better performance, blah blah blah”.

If that is your sentiment, then I’m thrilled because it means the instrumentation message has got through to the extent that it is now considered just a statement of the obvious.  As many will know, my predecessor was a huge advocate of extensive instrumentation in your application code.  And there is no shortage of examples out there of other Oracle professionals who similarly wax lyrical on the incredible importance of well instrumented code.  For the record, I’m in that camp as well, and we now even have contributions to the Oracle community for everyone to have a robust instrumentation framework in their code.  So there is plenty of information out there, and it is pretty safe to say that there are no more excuses left for un-instrumented or poorly instrumented code.

So why the blog post ?  Well…I’ve got a slightly different assertion to make when it comes to instrumentation, namely, instrumentation can help your user interface.  You will be able to provide a better experience to the users of your application by having your code instrumented.  Now you might be thinking that I’m just banging the same drum here, that is, by instrumenting I’ll be fixing bugs faster, or picking up performance issues more easily, or collecting metrics to help improve application quality.  These are of course true benefits of instrumentation, but we all know that right ?

No…my motivation for this blog post stems from an activity on one of my laptop’s last week.

I went to upgrade it to Windows 10.  I fired off the process and after a short while, the following window appeared

image

and the familiar “something is happening” feedback swirling dots…

image

I have several laptops at home, so there was no need to sit and watch this one, so I left it and worked on another tackling some AskTom questions.  The laptop I upgrading just sits on the shelf, and occasionally plays Netflix for my children.  Hence, naturally enough… I totally forgot about it!

That was Saturday.  Now on Monday… 48 hours later, I’ve just been over to and waved the mouse to restore the display.  And this is what is on the screen:

image

My oh my …. there must be a lot of things to get ready Smile

The obvious question now is – has anything actually happened ?  Has the program hung ?  Or is it downloading a huge Windows 10 OS bundle ?  As a consumer of this “application” (the upgrade process) I have no idea.  There is nothing that tells me what state the application is in.  Now let me give the Microsoft developers the benefit of the doubt here, and make the assumption that the upgrade is extensively instrumented, and that if I looked in a logging file somewhere on the laptop hard drive, there would be a capture of the current state which would tell me whether any true progress is being made.  And that is typically how most us of implement instrumentation – a means to capture information for diagnosis.

But to do so is to miss an opportunity to directly assist the user.  Even if the content of the instrumentation would appear to be gibberish to a typical Windows consumer, just the presence of that detail on the user interface  becomes an improvement to the user experience.  If my upgrade dialog box had looked something like this:

image

then even if I have no idea what “getting platform details” means, the fact that I was on Step 17 at 9:00:12 helps me as a consumer, because I have immediate feedback as to whether progress is being made or not.  If I see that message at 9:05am, then I’m likely to let it sit a little longer to see if a “Step 18” ever eventuates.  But if I’m seeing that message at 5pm, then I can be reasonably sure that all is not well.

And no, I don’t think a swirling icon is sufficient.  We all used to criticize the “Progress Bars” on older versions of Windows and the like, when it would either exceed 100% or never get there, or jump back and forth.  So I can understand the motivation for the swirly icons we see in all modern devices and operating systems, because they avoid the task of giving any true prediction of completion.  And don’t get me wrong, I’m not saying that a return to progress bars is the solution, because that would be understating the difficultly of being able to know how long a task will take.  But simply taking advantage of the instrumentation code that I know you already have Smile, and presenting that in some way to the consumers of your application can be a better user experience for them.

Keep that in mind.

Haversine PL/SQL

I didn’t see a PL/SQL version on https://rosettacode.org for the Haversine formula so here’s a version for anyone that needs it.



SQL> create or replace
  2  function p2p_distance(
  3              p_latitude1 number,
  4              p_longitude1 number,
  5              p_latitude2 number,
  6              p_longitude2 number) return number deterministic is
  7    earth_radius  number := 6371;
  8    pi_approx     number := 3.1415927/180; 
  9    lat_delta     number := (p_latitude2-p_latitude1)*pi_approx;
 10    lon_delta     number := (p_longitude2-p_longitude1)*pi_approx;
 11    arc           number := sin(lat_delta/2) * sin(lat_delta/2) +
 12                                 sin(lon_delta/2) * sin(lon_delta/2) * cos(p_latitude1*pi_approx) * cos(p_latitude2*pi_approx);
 13  begin
 14    return earth_radius * 2 * atan2(sqrt(arc), sqrt(1-arc));
 15  end;
 16  /

Function created.

SQL>
SQL> select p2p_distance(36.12, -86.67, 33.94, -118.4) from dual;

P2P_DISTANCE(36.12,-86.67,33.94,-118.4)
---------------------------------------
                             2886.40705

Simple demo of message propagation

If you’re using AQ, then it’s simple to setup simple enqueue and dequeue facilities on your local database to provide all sorts of asynchronous style processing in your applications.  As long as you’re applications are designed and built to handle it, the “fire and forget” model to keep user applications responsive, and all of the “heavy lifting” done in the background is a very attractive one.

You can also use AQ to achieve the same concept across multiple databases, and the database will take care of propagating the messages from one database to the other.  Here’s a simple demo of that.

Database 1




SQL> connect / as sysdba
Connected.

--
-- A user to hold all of our AQ stuff
--
SQL> create user aqtest identified by aqtest;

User created.

SQL> grant connect, resource, aq_administrator_role to aqtest;

Grant succeeded.

SQL> grant execute on dbms_aq to aqtest;

Grant succeeded.

SQL> grant execute on dbms_aqadm to aqtest;

Grant succeeded.

SQL> alter user aqtest quota unlimited on users;

User altered.

SQL> grant create database link to aqtest;

Grant succeeded.

SQL> begin
  2    dbms_aqadm.grant_system_privilege('ENQUEUE_ANY','AQTEST',FALSE);
  3    dbms_aqadm.grant_system_privilege('DEQUEUE_ANY','AQTEST',FALSE);
  4  end;
  5  /

PL/SQL procedure successfully completed.

--
-- Now we connect as AQTEST and construct our message payload types and our local queues
--

SQL> connect aqtest/aqtest
Connected.

SQL> create type aqtest.message_typ as object(subject varchar2(30), text varchar2(80));
  2  /

Type created.

SQL> begin
  2    dbms_aqadm.create_queue_table(queue_table => 'aqtest.messages_qtab',
  3                                  queue_payload_type =>  'aqtest.Message_typ',
  4                                  multiple_consumers => TRUE);
  5    dbms_aqadm.create_queue(queue_name => 'MSG_QUEUE',
  6                            queue_table => 'aqtest.messages_qtab');
  7    dbms_aqadm.start_queue(queue_name => 'MSG_QUEUE');
  8  end;
  9  /

PL/SQL procedure successfully completed.

--
-- And here is a basic enqueue routine.  If a remote address is specify, then we will propagate
-- the message to that address.  Otherwise the message will stay in the local queue.
--
SQL> create or replace procedure enqueue_msg(p_msg in varchar2,
  2                                          p_remote_address in varchar2 default null)
  3  as
  4    l_enqueue_options    dbms_aq.enqueue_options_t;
  5    l_message_properties dbms_aq.message_properties_t;
  6    l_message_handle     raw(16);
  7    l_message            aqtest.message_typ;
  8    l_recipients         dbms_aq.aq$_recipient_list_t;
  9  BEGIN
 10    l_recipients(1) := SYS.aq$_agent('RECIPIENT', p_remote_address, null);
 11    l_message_properties.recipient_list := l_recipients;
 12
 13    l_message := message_typ('NORMAL MESSAGE',  p_msg );
 14    dbms_aq.enqueue(queue_name => 'msg_queue',
 15                    enqueue_options => l_enqueue_options,
 16                    message_properties => l_message_properties,
 17                    payload => l_message,
 18                    msgid => l_message_handle);
 19  end;
 20  /

Procedure created.

Database 2, we do the exact same setup



SQL> connect / as sysdba
Connected.

--
-- A user to hold all of our AQ stuff
--
SQL> create user aqtest identified by aqtest;

User created.

SQL> grant connect, resource, aq_administrator_role to aqtest;

Grant succeeded.

SQL> grant execute on dbms_aq to aqtest;

Grant succeeded.

SQL> grant execute on dbms_aqadm to aqtest;

Grant succeeded.

SQL> alter user aqtest quota unlimited on users;

User altered.

SQL> grant create database link to aqtest;

Grant succeeded.

SQL> begin
  2    dbms_aqadm.grant_system_privilege('ENQUEUE_ANY','AQTEST',FALSE);
  3    dbms_aqadm.grant_system_privilege('DEQUEUE_ANY','AQTEST',FALSE);
  4  end;
  5  /

PL/SQL procedure successfully completed.

--
-- Now we connect as AQTEST and construct our message payload types and our local queues
--

SQL> connect aqtest/aqtest
Connected.

SQL> create type aqtest.message_typ as object(subject varchar2(30), text varchar2(80));
  2  /

Type created.

SQL> begin
  2    dbms_aqadm.create_queue_table(queue_table => 'aqtest.messages_qtab',
  3                                  queue_payload_type =>  'aqtest.Message_typ',
  4                                  multiple_consumers => TRUE);
  5    dbms_aqadm.create_queue(queue_name => 'MSG_QUEUE',
  6                            queue_table => 'aqtest.messages_qtab');
  7    dbms_aqadm.start_queue(queue_name => 'MSG_QUEUE');
  8  end;
  9  /

PL/SQL procedure successfully completed.

Back to Database 1


--
-- We need a database link to the remote database, plus a quick query to test that its working
--

SQL> create database link remote_db connect to aqtest identified by aqtest using 'db11';

Database link created.

SQL> select * from tab@remote_db;

no rows selected


--
-- We get our propagation schedule running, and we're ready to go.   
--

SQL> begin
  2    dbms_aqadm.schedule_propagation(queue_name  => 'MSG_QUEUE',
  3                                        destination => 'remote_db',
  4                                        start_time  => sysdate,
  5                                        latency     => 0);
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select * from user_queue_schedules
  2  @pr
==============================
QNAME                         : MSG_QUEUE
DESTINATION                   : REMOTE_DB
START_DATE                    :
START_TIME                    : 14:16:01
PROPAGATION_WINDOW            :
NEXT_TIME                     :
LATENCY                       : 0
SCHEDULE_DISABLED             : N
PROCESS_NAME                  : J000
SESSION_ID                    : 400, 38936
INSTANCE                      : 1
LAST_RUN_DATE                 : 28-NOV-16 02.16.01.283000 PM +08:00
LAST_RUN_TIME                 : 14:16:01
CURRENT_START_DATE            : 28-NOV-16 02.16.01.283000 PM +08:00
CURRENT_START_TIME            : 14:16:01
NEXT_RUN_DATE                 : 28-NOV-16 02.16.01.280000 PM +08:00
NEXT_RUN_TIME                 : 14:16:01
TOTAL_TIME                    : 0
TOTAL_NUMBER                  : 0
TOTAL_BYTES                   : 0
MAX_NUMBER                    : 0
MAX_BYTES                     : 0
AVG_NUMBER                    : 0
AVG_SIZE                      : 0
AVG_TIME                      : 0
FAILURES                      : 0
LAST_ERROR_DATE               :
LAST_ERROR_TIME               :
LAST_ERROR_MSG                :
MESSAGE_DELIVERY_MODE         : PERSISTENT
ELAPSED_DEQUEUE_TIME          :
ELAPSED_PICKLE_TIME           :
JOB_NAME                      : AQ_JOB$_6438

PL/SQL procedure successfully completed.

--
-- a message that will not be propagaged, because remote recipient is not specified
--
SQL> begin
  2    enqueue_msg('This message will stay local');
  3    commit;
  4  end;
  5  /

PL/SQL procedure successfully completed.

--
-- a message that WILL be propagated, because remote recipient is specified
--
SQL> begin
  2    enqueue_msg('This message will be propagated.',
  3                 'aqtest.msg_queue_other@remote_db');
  4    commit;
  5  end;
  6  /

PL/SQL procedure successfully completed.

--
-- So if everything is working correctly, we have 2 messages on the local queue, and 1 message on the remote queue
--

SQL>
SQL> select t1.cnt,
  2         t2.cnt
  3  from (select count(*) cnt from messages_qtab) t1,
  4       (select count(*) cnt from messages_qtab_other@remote_db) t2
  5  /

       CNT        CNT
---------- ----------
         2          1

1 row selected.

SQL>
SQL>

And there you go.  Messages between databases using the in-built propagation mechanisms.

BULK COLLECT into nested table

I had an observation come to me last week about PL/SQL and populating nested tables.

“The BULK COLLECT into statement cannot be used repeatedly to append results into a table.
Instead, it silently truncates the target table each time. “

This is true.  However, if you need to use BULK COLLECT to append results into a single nested table data structure, it is trivial to come up with a workaround using the MULTISET syntax.


SQL> drop table t1 purge;

Table dropped.

SQL> create table t1 as select * from dba_objects;

Table created.

SQL>
SQL> drop table t2 purge;

Table dropped.

SQL> create table t2 as select * from dba_objects;

Table created.

SQL>
SQL> set serverout on
SQL> declare
  2    type ntt is table of t1%rowtype;
  3    r1 ntt := ntt();
  4    r2 ntt := ntt();
  5  begin
  6    select * bulk collect into r1 from t1 ;
  7    select * bulk collect into r2 from t2 ;
  8    dbms_output.put_line('T1 count= '||r1.count);
  9    dbms_output.put_line('T2 count= '||r2.count);
 10
 11    r1 := r1 MULTISET UNION ALL r2;
 12
 13    dbms_output.put_line('TOTAL = '||r1.count);
 14  end;
 15  /
T1 count= 99250
T2 count= 99250
TOTAL = 198500

PL/SQL procedure successfully completed.

Read more about the multiset syntax here