I am using SSIS 2012. In my Flat File Connection Manager I have a delimited file where the row delimiter is set to CRLF, but when it processes the file, I have a text c
I have no SSIS experience, but as an ETL developer I have faced this many times. My suggestions might not solve the problem, but hopefully they point you in the right direction.
I had a similar issue to this. I had a CSV file with LF as the terminator. However, the client also had CRLF in two of the columns and this was causing the "delimiter for column is not found" error.
It took me a few days of googling solutions and trial and error, but I got it working.
In the end, I needed two script components.
In the first Script Component, I had an output column named Output0 of type string with a length of 4000. In the script (see below) I used ReadToEnd to load the whole file, replaced each embedded CRLF (the code substitutes a space), and then split the text into rows using LF as the terminator.
using System;
using System.IO;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Path of the source file, resolved from the "Collateral" connection manager.
    private string collateralFile;
    public override void AcquireConnections(object Transaction)
    {
        IDTSConnectionManager100 connMgr = this.Connections.Collateral;
        collateralFile = (string)connMgr.AcquireConnection(null);
    }
    public override void CreateNewOutputRows()
    {
        string collatFile;
        // Read the whole file; the using block closes the reader even if reading fails.
        using (StreamReader textReader = new StreamReader(collateralFile))
        {
            collatFile = textReader.ReadToEnd();
        }
        // Replace the CRLFs embedded in the columns, then split on the real LF row terminator.
        collatFile = collatFile.Replace("\r\n", " ");
        string[] lines = collatFile.Split(new char[] { '\n' });
        foreach (string line in lines)
        {
            // Skip empty rows, such as the one after the final LF.
            if (!String.IsNullOrEmpty(line))
            {
                Output0Buffer.AddRow();
                Output0Buffer.Output0 = line;
            }
        }
    }
}
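The same idea can be tried outside SSIS before wiring up the script component. Here is a minimal Python sketch, assuming the file fits in memory, that real rows end in a bare LF, and that every CRLF is a break embedded in a field. The function name `split_lf_rows` and the sample data are made up for illustration.

```python
def split_lf_rows(text, replacement=" "):
    """Replace embedded CRLF pairs, then split on the real LF row terminator."""
    cleaned = text.replace("\r\n", replacement)
    # Drop empty rows, mirroring the String.IsNullOrEmpty check in the C# script.
    return [row for row in cleaned.split("\n") if row]


# Hypothetical sample: the first row has a CRLF inside its second column.
sample = "a,first\r\nsecond,c\nd,e,f\n"
rows = split_lf_rows(sample)
print(rows)
```

When reading from disk, open the file with `newline=""` so Python does not translate the line endings before the replacement runs.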
I tried splitting it again into columns, but it returned null values, so in the second script component I created my columns and loaded the data into them in the script.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    String[] columns = Row.Output0.Split(',');
    Row.Description = columns[0];
    Row.LegalDescription = columns[1];
    Row.Address1ParsedLine1 = columns[2];
    Row.Address1ParsedLine2 = columns[4];
    Row.Address1ParsedCityname = columns[5];
    Row.Address1ParsedStatecode = columns[6];
    Row.Address1ParsedPostalcode = columns[7];
}
In your Flat File Connection Manager there is a property (I forget its name) where you can set the row delimiter ({CR}{LF}, {LF}, {CR}, etc.). Please try adjusting this property; I think it will work.
Before answering: I don't think the column contains only LF, because if the row delimiter is CRLF, a bare LF will not be treated as a delimiter. So it is probably CRLF, but I will give a solution that covers both cases (CRLF or LF).
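To settle which case applies, the terminators in the file can simply be counted. Here is a small Python sketch, assuming a single-byte or UTF-8 encoded file (in UTF-16 the terminators are two bytes each, so this count would need adapting). The function name `count_line_endings` and the sample bytes are hypothetical.

```python
import re


def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs and bare CRs in raw file bytes."""
    return {
        "CRLF": len(re.findall(rb"\r\n", data)),
        "LF": len(re.findall(rb"(?<!\r)\n", data)),  # LF not preceded by CR
        "CR": len(re.findall(rb"\r(?!\n)", data)),   # CR not followed by LF
    }


# Hypothetical sample: two CRLF-terminated rows, one bare LF inside a field.
sample = b"1;John;2001-01-01;;1\r\n2;Moh;2002-01-01;Very cool\nGenius;2\r\n"
counts = count_line_endings(sample)
print(counts)
```

For a real file, pass the result of reading it in binary mode (`open(path, "rb").read()`), so nothing translates the line endings first.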
You can fix this situation with the following steps:
In the Flat File Connection Manager, define only one column (of type DT_STR and length 4000) so that each row is read as a single column.
I will consider a flat file with the following content:
ID;name;DOB;Notes;ClassID{CRLF}
1;John;2001-01-01;;1{CRLF}
2;Moh;2002-01-01;Very cool{LF}
Genius;2{CRLF}
3;Ali;2000-01-01;Calm;2{CRLF}
In the Data Flow Task I will add a Flat File Source, two Script Components, and an OLE DB Destination.
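In the first Script Component I will mark Column0 as input, add one output column named NewRow (the five output columns ID, Name, DOB, Notes, ClassID belong to the second Script Component, where the code fills them), and set the output's SynchronousInputID property to None.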
In the first Script Component I will write a script that stores each line in a memory variable and assigns it to an output row once the row is complete and another row arrives.
Dim strLine As String = String.Empty
Dim strDelimiter As String = ";"
' Reset the buffered (partial) row.
Public Sub EmptyMemoryVariables()
    strLine = String.Empty
End Sub
' Emit the buffered row to the asynchronous output.
Public Sub AssignMemoryVariablesToOutput()
    With Output0Buffer
        .AddRow()
        .NewRow = strLine
    End With
End Sub
Public Function AreVariablesEmpty() As Boolean
    Return strLine = ""
End Function
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim strColumns As String() = Row.Column0.Split(CChar(strDelimiter))
    If strColumns.Length = 5 Then
        ' Complete row: flush anything still buffered, then emit this row.
        If Not AreVariablesEmpty() Then
            AssignMemoryVariablesToOutput()
            EmptyMemoryVariables()
        End If
        strLine = Row.Column0
        AssignMemoryVariablesToOutput()
        EmptyMemoryVariables()
    Else
        ' Fragment: flush the buffer if it is already complete, then append.
        ' Note: a fragment left in the buffer after the last input row is not emitted.
        If strLine.Split(CChar(strDelimiter)).Length = 5 Then
            AssignMemoryVariablesToOutput()
            EmptyMemoryVariables()
        End If
        strLine &= Row.Column0
    End If
End Sub
In the second Script Component I will split each row into columns.
Dim strDelimiter As String = ";"
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim strColumns As String() = Row.NewRow.Split(CChar(strDelimiter))
    Row.ID = strColumns(0)
    Row.NAME = strColumns(1)
    Row.DOB = strColumns(2)
    Row.NOTES = strColumns(3)
    Row.CLASSID = strColumns(4)
End Sub
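The row-merging logic of the first script is easier to check outside SSIS. Below is a rough Python sketch of the same approach, run on the sample file from this answer. The function name `merge_broken_rows` is made up, and the final flush of a leftover fragment is an addition of this sketch, not something the VB script does.

```python
def merge_broken_rows(lines, delimiter=";", expected_columns=5):
    """Glue row fragments together until a row has the expected column count."""
    merged = []
    pending = ""
    for line in lines:
        if len(line.split(delimiter)) == expected_columns:
            # Complete row: flush anything buffered, then emit the row itself.
            if pending:
                merged.append(pending)
                pending = ""
            merged.append(line)
        else:
            # Fragment: flush the buffer if it is already complete, then append.
            if len(pending.split(delimiter)) == expected_columns:
                merged.append(pending)
                pending = ""
            pending += line
    if pending:
        # Assumption of this sketch: emit a trailing buffered row as well.
        merged.append(pending)
    return merged


# The sample file after the Flat File Source has split it into single-column rows.
lines = [
    "ID;name;DOB;Notes;ClassID",
    "1;John;2001-01-01;;1",
    "2;Moh;2002-01-01;Very cool",
    "Genius;2",
    "3;Ali;2000-01-01;Calm;2",
]
merged = merge_broken_rows(lines)
for row in merged:
    print(row)
```

As in the VB script, fragments are joined with no separator, so the Notes value of row 2 comes out as "Very coolGenius". If the line break should survive as a space, join the fragments with one instead.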
Important note: the code provided is not optimal. It may need more validation, or it could be simpler and better, but I am trying to show a way of thinking about how to solve this issue.
Thank you for all the suggestions. It turned out that the vendor had changed the encoding of the file from ASCII to Unicode. Changing the package to read the correct encoding did the trick.
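For anyone hitting the same thing, a quick check of the first bytes of the file can reveal an encoding change before any time is spent on delimiters. Here is a minimal Python sketch, assuming the vendor's file starts with a byte order mark (a Unicode file without a BOM will not be detected this way). The function name `sniff_bom` is hypothetical.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess the encoding family from a leading byte order mark, if any."""
    # UTF-32 is checked before UTF-16: the UTF-32 LE BOM starts with the UTF-16 LE BOM.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    return "unknown (no BOM)"


# Python's "utf-16" codec writes a BOM, so this sample looks like a Unicode export.
unicode_sample = "ID;name".encode("utf-16")
ascii_sample = b"ID;name"
print(sniff_bom(unicode_sample), "|", sniff_bom(ascii_sample))
```

For a real file, `sniff_bom(open(path, "rb").read(4))` is enough, since no BOM is longer than four bytes. If it reports utf-16, the Flat File Connection Manager needs to be set to read the file as Unicode rather than with an ANSI code page.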