I am loading data from a .csv file into an Oracle table through SQL*Loader. One of the fields has a newline (CRLF) in its data, so I am getting the below error:
You can use replace(replace(column_name, chr(10)), chr(13)) to remove newline characters during loading, or regexp_replace(column_name, '\s+') to remove all whitespace (note that this also strips ordinary spaces and tabs, not just line breaks).
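The difference between the two expressions is easy to miss, so here is a small sketch of it. It is written in Python rather than Oracle SQL, so it is only an analogy: the helper names `strip_line_breaks` and `strip_all_whitespace` are hypothetical, and Python's `\s` is assumed to behave like Oracle's for this sample string.

```python
import re


def strip_line_breaks(value):
    # Analogue of replace(replace(col, chr(10)), chr(13)): only LF and CR go away.
    return value.replace("\n", "").replace("\r", "")


def strip_all_whitespace(value):
    # Analogue of regexp_replace(col, '\s+'): every run of whitespace goes away,
    # including the ordinary spaces between words.
    return re.sub(r"\s+", "", value)


sample = "Test with\r\nnew line"
print(strip_line_breaks(sample))
print(strip_all_whitespace(sample))
```

If the spaces between words matter, the nested replace is the safer of the two.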
I found the best way to load .csv files with fields containing newlines and commas. Run this macro over the .csv file, then load it using SQL*Loader:
Sub remove()
    Dim row As Long
    Dim oneCell As Excel.Range
    Dim oxcel As Excel.Application
    Dim wbk As Excel.Workbook
    Set oxcel = New Excel.Application
    oxcel.DisplayAlerts = False
    ' Open read-write (third argument False) so the cleaned file can be saved
    Set wbk = oxcel.Workbooks.Open("filename.csv", 0, False)
    row = 0
    With wbk.ActiveSheet
        Do
            row = row + 1
        ' Assume the first column is the PK, so the first empty PK marks the end of the data
        Loop Until IsEmpty(.Cells(row, 1).Value)
        ' Column 24 is the field that contains the newlines and commas
        For Each oneCell In .Range(.Cells(1, 24), .Cells(row - 1, 24))
            ' Turn LF into CR, then replace every CR and every comma with "-"
            oneCell.Value = Replace(Replace(Replace(CStr(oneCell.Value), vbLf, vbCr), vbCr, "-"), ",", "-")
        Next oneCell
    End With
    wbk.Close SaveChanges:=True
    oxcel.Quit
End Sub
It runs perfectly for me.
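If Excel is not available, the same cleanup can be sketched in Python with the standard `csv` module. This is an alternative to the macro, not a translation of it: the function name `clean_csv_text` and its parameters are hypothetical, it works on text already read from the file, and it assumes a comma-separated file with double-quoted fields.

```python
import csv
import io


def clean_csv_text(text, column_index, replacement="-"):
    """Replace every CR, LF and comma inside one column of comma-separated text."""
    # newline="" keeps embedded line breaks untouched so csv can parse them itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if column_index < len(row):
            value = row[column_index]
            # Same substitutions as the macro: line breaks and commas become "-".
            for ch in ("\r", "\n", ","):
                value = value.replace(ch, replacement)
            row[column_index] = value
        writer.writerow(row)
    return out.getvalue()


raw = 'id,comment\n1,"line one\nline two, more"\n'
print(clean_csv_text(raw, 1))
```

Note that `column_index` is zero-based here, so the macro's column 24 would be index 23.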
load data
characterset UTF8
infile 'C:\Users\lab.csv'
truncate
into table test_labinal
fields terminated by ";" optionally enclosed by '"'
TRAILING NULLCOLS
(
STATEMENT_STATUS ,
MANDATORY_TASK ,
COMMENTS CHAR(9999) "SubStr(REPLACE(REPLACE(:Comments,CHR(13)),CHR(10)), 0, 1000)"
)
Note: CHR(13) is the ASCII character for "carriage return" and CHR(10) is the ASCII character for "new line" (line feed). Calling the Oracle SQL REPLACE function without a replacement value removes any "carriage return" and/or "new line" character embedded in your data, which is probably the case here because the comment field is the last field in your CSV file.
If your last field is always present (though trailing nullcols suggests it isn't) and you have some control over the formatting, you can use the CONTINUEIF directive to treat the second line as part of the same logical record.
If the comments field is always present and enclosed in double-quotes then you can do:
...
truncate
continueif last != x'22'
into table ...
Which would handle data records like:
S;Y;"Test 1"
F;N;"Test 2"
P;Y;"Test with
new line"
P;N;""
Or if you always have a delimiter after the comments field, whether it is populated or not:
...
truncate
continueif last != ';'
into table ...
Which would handle:
S;Y;Test 1;
F;N;"Test 2";
P;Y;Test with
new line;
P;N;;
Both ways will load the data as:
S M COMMENTS
- - ------------------------------
S Y Test 1
F N Test 2
P Y Test withnew line
P N
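To make the CONTINUEIF behaviour concrete, here is a rough Python simulation of how physical lines are glued into logical records. It is only a sketch of the idea and not SQL*Loader itself; the function name `assemble_logical_records` is hypothetical.

```python
def assemble_logical_records(physical_lines, last_char):
    """Join physical lines until one ends with last_char (the CONTINUEIF LAST != idea)."""
    records = []
    pending = ""
    for line in physical_lines:
        pending += line  # lines are glued together with no line break in between
        # Look at the last non-blank character of the current physical line.
        if line.rstrip(" ").endswith(last_char):
            records.append(pending)
            pending = ""
    if pending:
        records.append(pending)  # unterminated tail, kept as-is
    return records


quoted = ['S;Y;"Test 1"', 'F;N;"Test 2"', 'P;Y;"Test with', 'new line"', 'P;N;""']
for record in assemble_logical_records(quoted, '"'):
    print(record)
```

Because the lines are simply concatenated, the third record comes out as `P;Y;"Test withnew line"`, which matches the loaded value shown above.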
But this loses the new line from the data. To keep it, you need the terminating field delimiter to be present, and instead of CONTINUEIF you can change the record separator using the stream record format:
...
infile 'C:\Users\lab.csv' "str ';\n'"
truncate
into table ...
The "str ';\n'" defines the terminator as the combination of the field terminator and a new line character. Your split comment only has that combination on the final line. With the same data file as the previous version, this gives:
S M COMMENTS
- - ------------------------------
S Y Test 1
F N Test 2
P Y Test with
new line
P N
4 rows selected.
Since you're on Windows you might have to include \r in the format as well, e.g. "str ';\r\n'", but I'm not able to check that.
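The effect of the record terminator can be illustrated with a plain string split in Python. This is only a model of the idea, not of SQL*Loader's actual parsing, and `split_stream_records` is a hypothetical helper name.

```python
def split_stream_records(data, terminator):
    """Split file content into records on an explicit terminator string."""
    records = data.split(terminator)
    if records and records[-1] == "":
        records.pop()  # nothing follows the final terminator
    return records


unix_data = 'S;Y;Test 1;\nF;N;"Test 2";\nP;Y;Test with\nnew line;\nP;N;;\n'
windows_data = unix_data.replace("\n", "\r\n")

# The bare line break inside the comment is not a terminator, so it survives.
print(split_stream_records(unix_data, ";\n"))

# With CRLF line endings, ';\n' never occurs, so nothing is split at all.
print(len(split_stream_records(windows_data, ";\n")))
print(len(split_stream_records(windows_data, ";\r\n")))
```

In this model a file with CRLF line endings only splits into four records when the terminator includes \r, which is the reason for the Windows caveat.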