Here’s a quick tip for anyone tinkering with vectors and vector indexes in 23ai. Let’s say you have just created a vector index on your table as per below


SQL> create vector index DOC_CHUNKS_VEC_IX on DOC_CHUNKS(piece_embed) 
  2  organization INMEMORY NEIGHBOR GRAPH
  3  distance COSINE
  4  with target accuracy 95;

Index created.  

and now you would like to grab the DDL for that index. You fire up your trusty DBMS_METADATA script and …. hmmmmm


SQL> select dbms_metadata.get_ddl('INDEX','DOC_CHUNKS_VEC_IX') from dual;

DBMS_METADATA.GET_DDL('INDEX','DOC_CHUNKS_VEC_IX')
--------------------------------------------------------------------------------
CREATE INDEX "CONNOR"."DOC_CHUNKS_VEC_IX" ON "MCDONAC"."DOC_CHUNKS" ("PIECE_EMBED")

This is because the object you have created is not an index, it is a vector index. You need to adjust the object type that you pass in.


SQL> select dbms_metadata.get_ddl('VECTOR_INDEX','DOC_CHUNKS_VEC_IX') from dual;

DBMS_METADATA.GET_DDL('VECTOR_INDEX','DOC_CHUNKS_VEC_IX')
--------------------------------------------------------------------------------
CREATE VECTOR INDEX "CONNOR"."DOC_CHUNKS_VEC_IX" ON "CONNOR"."DOC_CHUNKS" ("PIECE_EMBED")
ORGANIZATION INMEMORY NEIGHBOR GRAPH
WITH DISTANCE COSINE WITH TARGET ACCURACY 95
PARAMETERS (TYPE HNSW,  NEIGHBORS 32, EFCONSTRUCTION 300) PARALLEL 1

Officially this is not a bug in DBMS_METADATA, but it is likely that we’ll make a tweak to the API to return an error if you pass in INDEX for a vector index just so there is no chance of confusion.

Been playing with vectors? Let me know in the comments how you’re going.

4 responses to “Quick Tip – Vector Indexes in 23ai”

  1. isdba39bfa2f000 Avatar
    isdba39bfa2f000

    Hi Connor,

    We plan to create a vector index on a 17Tb document LOB when we upgrade to 23ai. Is there a simple way to estimate the index size before we carry out the work?

    Regards

    David

    1. Tough question because presumably – first you’ll chunk the the LOB, so you might have (say) 1 row per 500 characters, so that’s a copy of the table (but now in chunks). Each row will then have vector, which is loosely 1k per row. So you pay that price to ‘vectorise’ your data, without an indexes.

      The vector index is like a text index, ie, it creates secondary tables as well as indexes, which look to be about 20-30 bytes per chunked row

  2. isdba39bfa2f000 Avatar
    isdba39bfa2f000

    Thanks Connor

  3. […] partitioning: You can create indexes in parallel by adding a PARALLEL n clause like other indexes​connor-mcdonald.com (useful for large data indexing). Global (table) partitioning of the index is also supported; an […]

Leave a reply to isdba39bfa2f000 Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Trending