Recently I was doing a simple external table load using a CSV file, but was getting an interesting error. My file looked simple enough
"ID","EMAIL","TIMESTAMP","SUBJECT","STATUS","STATUS_TS"
"2012348048","john@anon.com","05/02/2000","Subject 1","5","09/04/2007"
"2412348048","mike@anon.com","05/02/2000","Subject 1","5","09/16/2002"
"348543169051","sue@anon.com","03/10/2001","Subject 1","5","03/24/2008"
"348396029762","mary@anon.com","03/10/2001","Subject 1","5","03/10/2001"
"1212348047","sam@anon.com","05/02/2000","Subject 1","5","05/02/2000"
"1612348048","vincent@anon.com","05/02/2000","Subject 1,"5","06/02/2006"
...
...
So it should have been a fairly straightforward external table definition to access it
SQL> create table T
2 ( ID int
3 ,EMAIL varchar2(255)
4 ,TIMESTAMP date
5 ,SUBJECT varchar2(512)
6 ,STATUS int
7 ,STATUS_TS date
8 )
9 organization external
10 ( default directory TMP
11 access parameters
12 ( records delimited by newline
13 fields terminated by ',' optionally enclosed by '"'
14 ( ID, EMAIL, "TIMESTAMP" DATE 'MM/DD/YYYY', SUBJECT, STATUS, STATUS_TS DATE 'MM/DD/YYYY'
15 )
16 )
17 location ('my_report.csv')
18 )
19 ;
Table created.
At which point, things went downhill rapidly 🙂 When I queried the external table, I got the following error
SQL> select * from T;
select * from T;
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04020: found record longer than buffer size supported,
524288, in c:\temp\my_report.csv (offset=1048576)
The clue here is that a record was seen to be longer than allowed. Since I know all my “records” are quite short, it must be the definition of what ends a record that is causing a problem. problem. In my case, the file had come from a system with a different characterset, and hence the concept of “newline” differs between characterset/target platform. Once I included that information into the external table definition, all was fine.
SQL> create table T
2 ( ID int
3 ,EMAIL varchar2(255)
4 ,TIMESTAMP date
5 ,SUBJECT varchar2(512)
6 ,STATUS int
7 ,STATUS_TS date
8 )
9 organization external
10 ( default directory TMP
11 access parameters
12 ( records delimited by '\n' characterset AL32UTF8
13 fields terminated by ',' optionally enclosed by '"'
14 ( ID, EMAIL, "TIMESTAMP" DATE 'MM/DD/YYYY', SUBJECT, STATUS, STATUS_TS DATE 'MM/DD/YYYY'
15 )
16 )
17 location ('my_report.csv')
18 )
19 ;
Table created.
SQL> select * from T where rownum < 10;
...
16,308 rows selected.
The power of the Oracle’s external table feature is just awesome. No one in my organization believed that I could take files from an IBM Mainframe (EBCDIC), binary sftp them to an HP-UX server (ASCII) and load them via an external table definition using the ORACEL_LOADER access driver and the proper characterset like they were plain text files. This was in 11R2.