Blazingly fast documentation search with APEX

Posted by

The Oracle documentation set is my bread and butter for day to day work. With AskTom questions, StackOverflow question, Einstein (our internal Q&A site), whenever I’m trying to help someone, the first thing I’ll typically do is see if there is some existing information in the documentation set that might be of use. Doing is this generally a “win-win” because threes outcomes are possible:

  • The information is found with a trivial search. Result: Simply point the person to the link.
  • The information is there but hard to find. Result: Log feedback with the docs team on potential better categorisation.
  • The information is not there. Result: Log feedback with the docs team about missing content.

All three ultimately will yield a better experience over time for all customers.

However, I live in Perth a place from which all internet traffic must generally travel up hill Smile

image

so I keep a local copy of the Oracle documentation on my PC to make navigating through it very snappy. The issue with a local copy of the documentation is that you no longer have the Search facility available.

Those in the “older demographic” may remember pre-internet versions of the Oracle documentation used to ship with a small Java program to do local search, but Java programs running in the browser are a thing of the past.

But one of the (many) cool things with Oracle Text is that you can index a set of files, so I decide to build my own search index on top of the local documentation copy.

First we need a complete list of the HTML files.


X:\oracle\doc\21>dir /s /b *.html > x:\temp\filelist.dat

Now I’ll allow the database to see that list but putting a simple external table over the top of my filelist.dat file.


SQL> create or replace directory temp as 'x:\temp';

Directory created.

SQL> create table filelist (
  2    fname varchar2(512),
  3    dirpath varchar2(512) generated always as ( cast(rtrim(substr(fname,1,instr(fname,'\',-1)),'\') as varchar2(512)))
  4  )
  5  organization external
  6  (
  7    type oracle_loader
  8    default directory temp
  9    access parameters
 10    (
 11      records delimited by newline
 12      nobadfile
 13      nologfile
 14      fields ldrtrim
 15      missing field values are null
 16      reject rows with all null fields
 17      (
 18        fname (1:512)  char(512)
 19      )
 20    )
 21    location ('filelist.dat')
 22  );

Table created.

To be be able to reference each of those files I’m going to have an Oracle directory object that maps to each distinct subdirectory in my documentation tree. I could just name those directories with a simple sequential name like ‘DIR1,DIR2,DIR3’ etc but my preference is something that reveals the true destination, so I’ll create a small function that will give a name that is “close” to the true folder I’m referencing.


SQL> create or replace
  2  function path_to_dirname(p_path varchar2) return varchar2 is
  3    l_prefix_path varchar2(100) := 'X:\oracle\doc\21';
  4    l_prefix_dir  varchar2(100) := 'XDOC21';
  5    l_prefix_path_len int := length(l_prefix_path);
  6    l_prefix_dir_len  int := length(l_prefix_dir);
  7    l_max_len     int := 30;
  8    l_dir_name    varchar2(128);
  9    l_delim       varchar2(1)   := '\';   -- change for Windows/Unix
 10  begin
 11    if p_path not like l_prefix_path||'%' then
 12      raise_application_error(-20000,'Not rooted at '||l_prefix_path);
 13    end if;
 14
 15    if p_path = l_prefix_path then
 16      l_dir_name := l_prefix_dir;
 17    else
 18      l_dir_name := regexp_replace(replace(replace(substr(p_path,l_prefix_path_len+1),l_delim,'_'),'-','_'),'[^[:alnum:] _]');
 19      l_dir_name := l_prefix_dir||'_'||regexp_replace(trim('_' from upper(regexp_replace(l_dir_name,'  *','_'))),'__*','_');
 20      if length(l_dir_name) > l_max_len then
 21        select substr(l_dir_name,1,l_max_len-9)||'_'||substr(standard_hash(l_dir_name,'MD5'),1,8)
 22        into   l_dir_name
 23        from   dual;
 24      end if;
 25    end if;
 26    return l_dir_name;
 27  end;
 28  /

Function created.

Now I can look at some of the sample directory names I intend to create. The trailing hash value is designed to ensure uniqueness whilst keeping the directory names under 30 characters.



SQL> select dirpath, path_to_dirname(dirpath) from filelist
  2  where dirpath != 'X:\oracle\doc\21'  -- the root
  3  and length(dirpath) > 50
  4  and rownum < 10;

DIRPATH
----------------------------------------------------------------------------------------------------------------------------------
PATH_TO_DIRNAME(DIRPATH)
----------------------------------------------------------------------------------------------------------------------------------
X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

X:\oracle\doc\21\acfsg\sp_common\book-template\ohc-common\menu-content
XDOC21_ACFSG_SP_COMMO_DDAE7E66

Now I can create directory objects for each of the distinct folders under my documentation tree.



SQL> begin
  2    for i in (
  3      with all_dirs as
  4      ( select /*+ materialize */ distinct dirpath
  5        from filelist
  6      )
  7      select dirpath, path_to_dirname(dirpath) ora_dir
  8      from all_dirs
  9    )
 10    loop
 11      execute immediate 'create or replace directory '||i.ora_dir||' as '''||i.dirpath||'''';
 12    end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

All up, this came to approximately 600 directories.



SQL> select count(*) from dba_directories;

  COUNT(*)
----------
       607

SQL> select directory_name
  2  from   dba_directories
  3  where  directory_name like 'XDOC%';

DIRECTORY_NAME
--------------------------------------------------------------------------------------------------------------------------------
XDOC21
XDOC21_ACFSG
XDOC21_ACFSG_IMG_TEXT
XDOC21_ACFSG_SP_COMMO_6EC8D9A5
XDOC21_ACFSG_SP_COMMO_DDAE7E66
XDOC21_ADDCI
XDOC21_ADDCI_IMG_TEXT
XDOC21_ADDCI_SP_COMMO_CD64DFA8
XDOC21_ADDCI_SP_COMMO_F1F64659
XDOC21_ADFNS
XDOC21_ADFNS_IMG_TEXT
XDOC21_ADFNS_SP_COMMO_31E34091
XDOC21_ADFNS_SP_COMMO_B0EEC11B
XDOC21_ADJSN
XDOC21_ADJSN_IMG_TEXT
XDOC21_ADJSN_SP_COMMO_45CD085B
XDOC21_ADJSN_SP_COMMO_EA0944E5
XDOC21_ADLOB
XDOC21_ADLOB_IMG_TEXT
XDOC21_ADLOB_SP_COMMO_9EC5D0CD
XDOC21_ADLOB_SP_COMMO_C1E530A8
...
...

Now that I have a listing of every file in the documentation set, and a database directory object for every OS subdirectory where those files resides, I can create a table holding that complete set of information, using a BFILE to reference the documentation file.



SQL> create table ora_doc_files
  2  ( id int generated as identity,
  3    fname varchar2(512),
  4    htm bfile );

Table created.

SQL>
SQL> insert into ora_doc_files (fname,htm )
  2  select fname,bfilename( path_to_dirname(dirpath),ltrim(substr(fname,instr(fname,'\',-1)),'\'))
  3  from filelist;

16154 rows created.

SQL>
SQL> commit;

Commit complete.

My database can now reference any of those files, so I can create a Text index on the table. I’ll add some preferences, the main one for me being that underscore should not be a word separator (so for example, “DBMS_OUTPUT” is not treated as “DBMS” and “OUTPUT”). You of course could customise theses preferences however you please to suit your search usage patterns.



SQL> begin
  2    ctxsys.ctx_ddl.create_preference('ora_doc_files_ix_fil','null_filter');
  3    ctxsys.ctx_ddl.create_section_group('ora_doc_files_ix_sgp','null_section_group');
  4    ctxsys.ctx_ddl.create_preference('ora_doc_files_ix_lex','basic_lexer');
  5    ctxsys.ctx_ddl.set_attribute('ora_doc_files_ix_lex','printjoins','_');
  6    ctxsys.ctx_ddl.set_attribute('ora_doc_files_ix_lex','base_letter','yes');
  7    ctxsys.ctx_ddl.set_attribute('ora_doc_files_ix_lex','mixed_case','no');
  8    ctxsys.ctx_ddl.create_preference('ora_doc_files_ix_wdl','basic_wordlist');
  9    ctxsys.ctx_ddl.set_attribute('ora_doc_files_ix_wdl','stemmer','english');
 10    ctxsys.ctx_ddl.set_attribute('ora_doc_files_ix_wdl','fuzzy_match','generic');
 11    ctxsys.ctx_ddl.create_stoplist('ora_doc_files_ix_spl','basic_stoplist');
 12  end;
 13  /

PL/SQL procedure successfully completed.

And then I create the Text index and my documentation set is now searchable!



SQL> create index ora_doc_files_ix on ora_doc_files(htm) indextype is ctxsys.context
  2    parameters('
  3      filter          ora_doc_files_ix_fil
  4      section group   ora_doc_files_ix_sgp
  5      lexer           ora_doc_files_ix_lex
  6      wordlist        ora_doc_files_ix_wdl
  7      stoplist        ora_doc_files_ix_spl
  8    ')
  9  /

Index created.

Let’s give it a try with some simple search terms



SQL> select fname
  2  from ora_doc_files
  3  where contains( htm, 'DBMS_OUTPUT' ) > 0;

FNAME
-----------------------------------------------------------------------
X:\oracle\doc\21\addci\example-for-power-demand-cartridge.html
X:\oracle\doc\21\addci\example-for-PSBTREE-extensible-indexing.html
X:\oracle\doc\21\addci\implementing-data-cartridges-in-PL-SQL.html
X:\oracle\doc\21\addci\working-with-multimedia-data-types.html

SQL> select fname
  2  from ora_doc_files
  3  where contains( htm, 'NVL2',1 ) > 0
  4  order by score(1) desc;

FNAME
---------------------------------------------------------------------------
X:\oracle\doc\21\sqlrf\NVL2.html
X:\oracle\doc\21\oladm\olap-dml-functions-l-z.html
X:\oracle\doc\21\olapi\oracle\olapi\syntax\FunctionDescriptorCatalog.html
X:\oracle\doc\21\sqlqr\SQL-Functions.html
X:\oracle\doc\21\olaxs\row-functions.html

That is all looking good, but its not particularly useful for me that every time I want to search the documentation, I need to fire up a SQL window and run a query. I’d like something that aligns more closely to the online doc search experience.

No problems there – APEX can do that very nicely. A simple Classic Report and a redirect of the file location so that my ORDS instance can serve up the file and we’re good to go.

image

 

doc_search

And there you have it! Blazing fast documentation search on your own machine, and of course, totally customisable to allow the full power of Oracle Text in whatever best suits your search needs.

3 comments

  1. I think I’m lost how can You search contents of file having bflie column in table instead of loading the file contents to lob column.
    Is bfile type only pointer to file physical location in directory?
    Having said that without coffee, standard disclaimer :).
    Regards
    Greg

    1. Because we create the index on the BFILE column, the database “knows” that our intention is to dig *into* the LOB that is referenced by it.
      Other options include the URL_DATASTORE option for columns which contain a web link.

Leave a Reply to Connor McDonald Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.