I have a VBA application that runs every day. It checks a folder where CSVs are downloaded automatically, and adds their contents to a database. When parsing them, I reali
I realize this is an old post, but I just bumped into it looking for a solution to the same problem the OP had, so the thread is still relevant.
To import data from a CSV, I add a query to a worksheet:
wksTarget.QueryTables.Add(Connection:=strConn, Destination:=wksTarget.Range("A1"))
then set the appropriate QueryTable properties (e.g. Name, FieldNames, RefreshOnFileOpen, etc.).
QueryTables can handle various delimiters via TextFileCommaDelimiter, TextFileSemicolonDelimiter and others. There are also a number of other properties (TextFilePlatform, TextFileTrailingMinusNumbers, TextFileColumnDataTypes, TextFileDecimalSeparator, TextFileStartRow, TextFileThousandsSeparator) that handle source file idiosyncrasies.
Relevant to the OP, a QueryTable also has a property designed to handle commas that are within double quotes: TextFileTextQualifier = xlTextQualifierDoubleQuote.
I find QueryTables much simpler than writing code to import the file and split/parse strings or use regular expressions.
All together, a sample code snippet would look something like this:
    strConn = "TEXT;" & "C:\Desktop\SourceFile.CSV"
    ' One entry per column: 1 = xlGeneralFormat, 5 = xlYMDFormat
    varDataTypes = Array(5, 1, 1, 1, 1, 1, 5, 5)
    With wksTarget.QueryTables.Add(Connection:=strConn, _
            Destination:=wksTarget.Range("A1"))
        .Name = "ImportCSV"
        .FieldNames = True
        .RefreshOnFileOpen = False
        .SaveData = True
        .TextFilePlatform = xlMSDOS
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileCommaDelimiter = True
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileColumnDataTypes = varDataTypes
        .Refresh BackgroundQuery:=False
    End With
I prefer to delete the QueryTable once the data is imported (wksTarget.QueryTables("ImportCSV").Delete), but I suppose it could be created just once and then simply refreshed if the source and destination for the data don't change.
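As a rough sketch of the "import, then delete" approach: holding the new QueryTable in an object variable avoids looking it up by name afterwards. The procedure name ImportCsvAndDropQuery and its parameters are made up for illustration, and only a minimal set of properties is set.

```vba
' Hypothetical helper: import a CSV through a QueryTable, then remove the
' QueryTable object. The imported cell values stay on the worksheet.
Sub ImportCsvAndDropQuery(ByVal csvPath As String, ByVal wksTarget As Worksheet)
    Dim qt As QueryTable

    Set qt = wksTarget.QueryTables.Add(Connection:="TEXT;" & csvPath, _
                                       Destination:=wksTarget.Range("A1"))
    With qt
        .TextFileParseType = xlDelimited
        .TextFileCommaDelimiter = True
        ' Treat commas inside double quotes as part of the field
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .Refresh BackgroundQuery:=False
        .Delete   ' Deletes the query definition only, not the data
    End With
End Sub
```

It could be called as ImportCsvAndDropQuery "C:\Desktop\SourceFile.CSV", ActiveSheet.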
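If the source CSV has every field in double quotes, then Split(strLine, """,""") may work well, provided you also strip the leading quote from the first field and the trailing quote from the last.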
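A minimal sketch of that idea, assuming every field is quoted, the separator between fields is exactly "," (quote, comma, quote, no spaces) and no field contains that three-character sequence itself. The function name SplitFullyQuotedLine is hypothetical.

```vba
' Hypothetical helper: split a line in which EVERY field is wrapped in
' double quotes. Assumes fields are separated by exactly "," with no spaces.
Function SplitFullyQuotedLine(ByVal strLine As String) As Variant
    ' Drop the opening quote of the first field and the closing quote of the last
    If Len(strLine) >= 2 Then
        If Left$(strLine, 1) = Chr(34) And Right$(strLine, 1) = Chr(34) Then
            strLine = Mid$(strLine, 2, Len(strLine) - 2)
        End If
    End If
    ' Split on quote-comma-quote, so commas inside a field are left alone
    SplitFullyQuotedLine = Split(strLine, Chr(34) & "," & Chr(34))
End Function
```

For the line "a,b","c" this returns two elements, a,b and c.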
I know this is an old post, but thought this may help others. This was adapted from http://n3wt0n.com/blog/comma-separated-values-and-quoted-commas-in-vbscript/; it works really well and is set up as a function that you can pass your input line to.
    Function SplitCSVLineToArray(Line, RemoveQuotes) 'Pass it a line and whether or not to remove the quotes
        Dim ReplacementString, LineLength, InQuotes, NewLine
        Dim CurrentCharacter, LineArray, x

        ReplacementString = "#!#!#" 'Random string that we should never see in our file
        LineLength = Len(Line)
        InQuotes = False
        NewLine = ""
        'Pass 1: swap commas inside quotes for the placeholder
        For x = 1 To LineLength
            CurrentCharacter = Mid(Line, x, 1)
            If CurrentCharacter = Chr(34) Then
                InQuotes = Not InQuotes 'Toggle on every double quote
            ElseIf InQuotes And CurrentCharacter = "," Then
                CurrentCharacter = ReplacementString
            End If
            NewLine = NewLine & CurrentCharacter
        Next
        'Pass 2: split on the remaining commas, then restore the protected ones
        LineArray = Split(NewLine, ",")
        For x = 0 To UBound(LineArray)
            LineArray(x) = Replace(LineArray(x), ReplacementString, ",")
            If RemoveQuotes = True Then
                LineArray(x) = Replace(LineArray(x), Chr(34), "")
            End If
        Next
        SplitCSVLineToArray = LineArray
    End Function
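A short usage sketch, assuming SplitCSVLineToArray above sits in the same module. The sample line and the procedure name DemoSplitCSVLineToArray are invented for illustration.

```vba
' Hypothetical demo: split one sample line and list the fields
' in the Immediate window.
Sub DemoSplitCSVLineToArray()
    Dim fields As Variant
    Dim i As Long

    ' The second field contains a comma inside double quotes
    fields = SplitCSVLineToArray("1,""Smith, John"",NY", True)

    For i = LBound(fields) To UBound(fields)
        Debug.Print i & ": " & fields(i)
    Next i
    ' Prints:
    ' 0: 1
    ' 1: Smith, John
    ' 2: NY
End Sub
```

With RemoveQuotes set to False, the second field keeps its surrounding quotes.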
I made another variant of the solution for parsing CSV files with quoted text strings that may contain delimiters, such as commas inside the double quotes. This method doesn't require regular expressions or any other add-ons, and it also handles multiple commas between the quotes. Here is a subroutine for testing:
    Sub SubstituteBetweenQuotesSub()
        'In-string character replacement function by Maryan Hutsul 1/29/2019
        Dim LineItems As String
        Dim quote As Long, quoteTwo As Long
        Dim oddEven As Long
        Dim i As Long, counter As Long
        Dim byteArray() As Byte
        'LineItems are lines of text read from CSV file, or any other text string
        LineItems = ",,,2019NoApocalypse.ditamap,jesus.christ@sky.com,Approver,""JC, ,Son"",Reviewer,god.allmighty@sky.com,""God, All-Mighty,"",2019-01-29T08:47:29.290-05:00"
        quote = 1
        oddEven = 0
        Do Until quote = 0
            quote = InStr(quote, LineItems, Chr(34))
            quoteTwo = InStr(quote + 1, LineItems, Chr(34))
            oddEven = oddEven + 1
            If oddEven Mod 2 = 1 And quote <> 0 Then
                'Odd-numbered quote opens a field: count the commas up to the closing quote
                counter = 0
                byteArray = StrConv(LineItems, vbFromUnicode)
                For i = quote To quoteTwo
                    If byteArray(i - 1) = 44 Then '44 represents comma, can also do Chr(44)
                        counter = counter + 1
                    End If
                Next i
                'Replace only those commas with ";"
                LineItems = Left(LineItems, quote - 1) & Replace(LineItems, ",", ";", quote, counter)
                quote = quote + 1
            ElseIf quote <> 0 Then
                quote = quote + 1
            End If
        Loop
        Debug.Print LineItems 'Show the result in the Immediate window
    End Sub
Here is the function, to which you can pass lines from .csv, .txt or any other text files:
    Function SubstituteBetweenQuotes(ByVal LineItems As String) As String
        'In-string character replacement function by Maryan Hutsul 1/29/2019
        'LineItems are lines of text read from CSV file, or any other text string
        'ByVal keeps the caller's variable unchanged
        Dim quote As Long, quoteTwo As Long
        Dim oddEven As Long
        Dim i As Long, counter As Long
        Dim byteArray() As Byte
        quote = 1
        oddEven = 0
        Do Until quote = 0
            quote = InStr(quote, LineItems, Chr(34))
            quoteTwo = InStr(quote + 1, LineItems, Chr(34))
            oddEven = oddEven + 1
            If oddEven Mod 2 = 1 And quote <> 0 Then
                'Odd-numbered quote opens a field: count the commas up to the closing quote
                counter = 0
                byteArray = StrConv(LineItems, vbFromUnicode)
                For i = quote To quoteTwo
                    If byteArray(i - 1) = 44 Then '44 represents "," comma, can also do Chr(44)
                        counter = counter + 1
                    End If
                Next i
                'Replace only those commas with ";"
                LineItems = Left(LineItems, quote - 1) & Replace(LineItems, ",", ";", quote, counter)
                quote = quote + 1
            ElseIf quote <> 0 Then
                quote = quote + 1
            End If
        Loop
        SubstituteBetweenQuotes = LineItems
    End Function
And below is the code for reading a CSV file using that function:
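    Dim fullFilePath As String
    Dim LineFromFile As String
    Dim LineItems() As String
    Dim fileNum As Integer
    Dim row_number As Long
    Dim i As Long
    'fullFilePath - full path to your input CSV file
    fileNum = FreeFile 'Next free file number, instead of hard-coding #1
    Open fullFilePath For Input As #fileNum
    row_number = 0
    'EOF - End Of File
    Do Until EOF(fileNum)
        Line Input #fileNum, LineFromFile
        LineItems = Split(SubstituteBetweenQuotes(LineFromFile), ",")
        'Write each field into the next column, relative to the active cell
        For i = LBound(LineItems) To UBound(LineItems)
            ActiveCell.Offset(row_number, i).Value = LineItems(i)
        Next i
        row_number = row_number + 1
    Loop
    Close #fileNum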
All delimiters and the replacement character may be modified for your needs. I hope this is useful, as I had quite a journey solving some problems with CSV imports.
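Note that the reading loop writes each field as it comes out of Split, so quoted fields still carry their double quotes and show ";" where the commas were. One possible cleanup step is sketched below; the name RestoreField is hypothetical, and the sketch assumes ";" never occurs in the original data.

```vba
' Hypothetical helper: undo the ";" substitution and drop the surrounding
' double quotes of a single field. Assumes ";" is not used in the source data.
Function RestoreField(ByVal field As String) As String
    ' Put the original commas back
    field = Replace(field, ";", ",")
    ' Strip one pair of enclosing quotes, if present
    If Len(field) >= 2 Then
        If Left$(field, 1) = Chr(34) And Right$(field, 1) = Chr(34) Then
            field = Mid$(field, 2, Len(field) - 2)
        End If
    End If
    RestoreField = field
End Function
```

In the loop, the cell assignment would then read ActiveCell.Offset(row_number, i).Value = RestoreField(LineItems(i)).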
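Try this! The function below uses late binding (CreateObject), so it runs as is; ticking "Microsoft VBScript Regular Expressions 5.5" under Tools > References is only needed if you declare regex As RegExp instead.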
    Function Splitter(line As String, n As Integer)
        Dim s() As String
        Dim regex As Object
        Set regex = CreateObject("vbscript.regexp")
        regex.IgnoreCase = True
        regex.Global = True
        regex.Pattern = ",(?=([^\""]*\""[^\""]*\"")*[^\""]*$)"
        s = split(regex.Replace(line, "|/||\|"), "|/||\|")
        Splitter = s(n - 1)
    End Function
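A quick usage sketch, assuming Splitter above is in the same module. The pattern matches only commas that are followed by an even number of double quotes, i.e. commas outside quoted fields, and n is the 1-based field number. The sample line and the name DemoSplitter are invented for illustration.

```vba
' Hypothetical demo: fetch single fields from a sample line by position.
Sub DemoSplitter()
    Dim sampleLine As String
    sampleLine = "1,""Smith, John"",NY"

    Debug.Print Splitter(sampleLine, 1)   ' 1
    Debug.Print Splitter(sampleLine, 2)   ' "Smith, John" (quotes are kept)
    Debug.Print Splitter(sampleLine, 3)   ' NY
End Sub
```

The surrounding quotes are not removed, so wrap the call in Replace(..., Chr(34), "") if they are unwanted. Asking for a field number beyond the last one raises a "Subscript out of range" error.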