Import tab separated CSV, tab delimiter not recognized

我们两清 提交于 2021-01-29 06:04:02

问题


I have tab separated csv files which I want to transform to xlsx. So each csv should be transformed to a xlsx. Filename should be the same. However, the files are tab separated. For example, see this test file screenshot:

When I run my code (I created a subfolder xlsx before):

Sub all()

   Dim sourcepath As String
   Dim sDir As String
   Dim newpath As String
    
    sourcepath = "C:\Users\PC\Desktop\Test\"
    newpath = sourcepath & "xlsx\"
    
    'make sure subfolder xlsx was created before

    sDir = Dir$(sourcepath & "*.csv", vbNormal)
    Do Until Len(sDir) = 0
        Workbooks.Open (sourcepath & sDir)
        With ActiveWorkbook
            .SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
            .Close
        End With
        
        sDir = Dir$
    Loop
End Sub

it does work, but when I look into the excel file:

I can see that the tab separator was not detected. I think my local settings are that the separator is a semi-colon and that's why it is not working. Now I wanted to set dataType to xlDelimited and tab to True, with changing the one line to:

Workbooks.Open (Spath & sDir), DataType:=xlDelimited, Tab:=True

I also tried

Workbooks.Open (Spath & sDir, DataType:=xlDelimited, Tab:=True)

or

Workbooks.Open FileName:=Spath & sDir, DataType:=xlDelimited, Tab:=True

But this leads to an error message. I then tried another approach, where I set the delimiter to Chr(9) (tab) and local to false:

Sub all()

    Dim wb As Workbook
    Dim strFile As String
    Dim strDir As String
    
    strDir = "C:\Users\PC\Desktop\Test\"
    strFile = Dir(strDir & "*.csv")
    
    Do While strFile <> ""
    
    Set wb = Workbooks.Open(Filename:=strDir & strFile, Delimiter:=Chr(9), Local:=False)
    With wb
        .SaveAs Replace(wb.FullName, ".csv", ".xlsx"), 51
        .Close True
    End With
    Set wb = Nothing
    strFile = Dir
    Loop
    
End Sub

It does not lead to an error. But when I open the file, it looks like:

So again same problem, the tab separator is not recognized. How can I fix this?

(I also tried it with local:True together with Delimiter:=Chr(9), but same problem and I also tried it with adding Format:=6)

I tried it this way with csv as I did not want to go the same way with txt file extension. Reason is that using csv easily allows special language characters like "ö" and "ü". So that is why I wanted to convert csv to xlsx and not use the workaround of using txt instead, as I then run into the problem that when I try to convert txt to xlsx certain special characters are not properly recognised and I hope to avoid this problem with using csv.

The csv (or actually these are tsv, because they have the tab as separator and not semi-colon) files have different columns. So could be one csv file has 5 columns, the other 6 and the datatypes vary too.

EDIT:

In repsonse to EEMs answer:

Check this Test.csv file, it looks like this:

Separated by tab. Not semi-colon.

When I run the code (plus adding .TextFileDecimalSeparator = "." to the code) and check the resulting xlsx file it looks like this:

Values in the second column (ColumnÄ), like 9987.5 are correctly transformed to 9987,5. But values in the last column (ColumnI) are wrongly transformed. This is my problem now. (I dont know why special character does not get transformed correctly, as in my original files this does work.)


回答1:


As mentioned by @RonRosenfeld, files with .csv extension will be opened by excel as a text file with tab delimiters.

Also the following assumption is not accurate:

Option1 txt is not a way for me, as I face a new problem that UTF-8 special characters, like äö and so are not properly imported.

The handling of the characters or code page has nothing to do with the extension of the files, instead it's managed by the Origin parameter of the Workbooks.OpenText method or by the TextFilePlatform property of the QueryTable object.

Therefore unless the files are renamed with a extension different than csv the [Workbooks.OpenText method] will not be effective.

The solution proposed below, uses the QueryTable object and consist of two procedures:

  1. Tab_Delimited_UTF8_Files_Save_As_Xlsx
  • Sets the source and target folder
  • Creates the xlsx folder if not present
  • Gets all csv files in the source folder
  1. Open_Csv_As_Tab_Delimited_Then_Save_As_Xls
  • Process each csv files
  • Adds a workbook to hold the Query Table
  • Imports the csv file
  • Deletes the Query
  • Saves the File as `xlsx'

EDIT I These lines were added to ensure the conversion of the numeric data:

                .TextFileDecimalSeparator = "."
                .TextFileThousandsSeparator = ","

EDIT II A Few changes to rename the worksheet (marked as '@ )

Tested with this csv file:

Generated this `xlsx' file:

Hopefully, it should be straightforward to add these procedure to you project, let me know of any problem or question you might have with the resources used.

Sub Tab_Delimited_UTF8_Files_Save_As_Xlsx()
Dim sFile As String
Dim sPathSrc As String, sPathTrg As String
Dim sFilenameSrc As String, sFilenameTrg As String
Dim bShts As Byte, exCalc As XlCalculation
    
Rem sPathSrc = "C:\Users\PC\Desktop\Test\"
    sPathTrg = sPathSrc & "xlsx\"
    
    Rem Excel Properties OFF
    With Application
        .EnableEvents = False
        .DisplayAlerts = False
        .ScreenUpdating = False
        exCalc = .Calculation
        .Calculation = xlCalculationManual
        .CalculateBeforeSave = False
        bShts = .SheetsInNewWorkbook
        .SheetsInNewWorkbook = 1
    End With

    Rem Validate Target Folder
    If Len(Dir$(sPathTrg, vbDirectory)) = 0 Then MkDir sPathTrg

    Rem Process Csv Files
    sFile = Dir$(sPathSrc & "*.csv")
    Do Until Len(sFile) = 0
        
        sFilenameSrc = sPathSrc & sFile
        sFile = Left(sFile, -1 + InStrRev(sFile, ".csv"))    '@
        sFilenameTrg = sPathTrg & sFile & ".xlsx"            '@
        
        Call Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sFile, sFilenameSrc, sFilenameTrg)    '@
        
        sFile = Dir$
    
    Loop

    Rem Excel Properties OFF
    With Application
        .SheetsInNewWorkbook = bShts
        .Calculation = exCalc
        .CalculateBeforeSave = True
        .ScreenUpdating = True
        .DisplayAlerts = True
        .EnableEvents = True
    End With
    
    End Sub

Sub Open_Csv_As_Tab_Delimited_Then_Save_As_Xls(sWsh As String, sFilenameSrc As String, sFilenameTrg As String)    '@
Dim Wbk As Workbook
    
    Rem Workbook - Add
    Set Wbk = Workbooks.Add
    With Wbk
        
        With .Worksheets(1)

            Rem QueryTable - Add
            With .QueryTables.Add(Connection:="TEXT;" & sFilenameSrc, Destination:=.Cells(1))
                
                Rem QueryTable - Properties
                .SaveData = True
                .TextFileParseType = xlDelimited
                .TextFileDecimalSeparator = "."
                .TextFileThousandsSeparator = ","
                .TextFileConsecutiveDelimiter = False
                .TextFileTabDelimiter = True
                .TextFileTrailingMinusNumbers = True
                .TextFilePlatform = 65001               'Unicode (UTF-8)
                .Refresh BackgroundQuery:=False
                
                Rem QueryTable - Delete
                .Delete
            
            End With
        
            Rem Rename Worksheet    '@
            On Error Resume Next    '@ Ignore error in case the Filename is not valid as Sheetname
            .Name = sWsh            '@
            On Error GoTo 0         '@
        
        End With

        Rem Workbook - Save & Close
        .SaveAs Filename:=sFilenameTrg, FileFormat:=xlOpenXMLWorkbook
        .Close
    
    End With

    End Sub



回答2:


With a delimited file that has a csv extension, Excel's Open and vba Workbooks.Open and Workbooks.OpenText methods will always assume that the delimiter is a comma, no matter what you put into the argument.

  1. You can change the file extension (eg to .txt), and then the .Open method should work.
  2. You could read it into a TextStream object and parse it line by line in VBA
  3. You can Import the file rather than Opening the file.
  • You could use Power Query to import it.
  • Or you could use a variation of the code below, which was just generated by the macro recorder, so you'll have to clean it up and adapt it a bit to your specifics.
With ActiveSheet.QueryTables.Add(Connection:= _
        "TEXT;D:\Users\Ron\Desktop\myFile.csv", Destination:=Range("$A$12"))
        .CommandType = 0
        .Name = "weather_1"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .TextFilePromptOnRefresh = False
        .TextFilePlatform = 437
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileConsecutiveDelimiter = False
        .TextFileTabDelimiter = True
        .TextFileSemicolonDelimiter = False
        .TextFileCommaDelimiter = False
        .TextFileSpaceDelimiter = False
        .TextFileColumnDataTypes = Array(1, 1, 1, 1)
        .TextFileTrailingMinusNumbers = True
        .Refresh BackgroundQuery:=False
    End With



回答3:


This method uses File System Object and Text Stream to change from tab delimited to comma delimited. Select Microsoft Scripting Runtime inside of Tools/References to make available the library used. If semicolon delimited also does not open correctly, you can replace third parameter of REPLACE function in line fileContents = Replace(fileContents, vbTab, ";") with comma and try again. Note we are creating a .csv file with Set textStreamObj = fileSystemObj.CreateTextFile("filePath2.csv"), not overwriting our original.

Option Explicit

Sub changeDelimitedMarker()
    Dim fileSystemObj As Scripting.FileSystemObject
    Dim textStreamObj As Scripting.TextStream
    Dim fileContents As String
    
    Set fileSystemObj = New FileSystemObject
    Set textStreamObj = fileSystemObj.OpenTextFile("filePath1.csv")
    fileContents = textStreamObj.ReadAll
    textStreamObj.Close
    fileContents = Replace(fileContents, vbTab, ",")
    Set textStreamObj = fileSystemObj.CreateTextFile("filePath2.csv")
    textStreamObj.Write fileContents
    textStreamObj.Close
End Sub



回答4:


Using the ADODB.Stream object, you can create a user-defined function.

Sub all()

    Dim sourcepath As String
    Dim sDir As String
    Dim newpath As String
    Dim vResult As Variant
    Dim Wb As Workbook
    Dim Fn As String
     sourcepath = "C:\Users\PC\Desktop\Test\"
     newpath = sourcepath & "xlsx\"
     
     'make sure subfolder xlsx was created before
    
     sDir = Dir$(sourcepath & "*.csv", vbNormal)
     Application.ScreenUpdating = False
     Do Until Len(sDir) = 0
         'Workbooks.Open (sourcepath & sDir)
         'use adodb.stream
         vResult = TransToTextWithCsvUTF_8(sourcepath & sDir)
         Fn = Replace(sDir, ".csv", ".xlsx")
         
         Set Wb = Workbooks.Add
         With Wb
            Range("a1").Resize(UBound(vResult, 1) + 1, UBound(vResult, 2) + 1) = vResult
            .SaveAs Filename:=newpath & Fn, FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
            .Close
         End With
         
         sDir = Dir$
     Loop
     Application.ScreenUpdating = True
End Sub

Function TransToTextWithCsvUTF_8(strFn As String) As Variant
    Dim vR() As String
    Dim i As Long, r As Long, j As Integer, c As Integer
    Dim objStream  As Object
    Dim strRead As String
    Dim vSplit, vRow
    Dim s As String
    
    Set objStream = CreateObject("ADODB.Stream")

    With objStream
        .Charset = "utf-8"
        .Open
        .LoadFromFile strFn
         strRead = .ReadText
        .Close
    End With
    
    vSplit = Split(strRead, vbCrLf)
    r = UBound(vSplit)
    c = UBound(Split(vSplit(0), vbTab, , vbTextCompare))
    ReDim vR(0 To r, 0 To c)
    
    For i = 0 To r
        vRow = Split(vSplit(i), vbTab, , vbTextCompare)
        If UBound(vRow) = c Then 'if it is empty line, skip it
            For j = 0 To c
                If IsNumeric(vRow(j)) Then
                    s = Format(vRow(j), "#,##0.000")
                    s = Replace(s, ".", "+")
                    s = Replace(s, ",", ".")
                    s = Replace(s, "+", ",")
                    vR(i, j) = s
                Else
                    vR(i, j) = vRow(j)
                End If
            Next j
        End If
    Next i
    TransToTextWithCsvUTF_8 = vR
    
    Set objStream = Nothing

End Function



回答5:


The Text to Columns should work for this:

Sub all()

   Dim sourcepath As String
   Dim sDir As String
   Dim newpath As String
    
    sourcepath = "C:\Users\snapier\Downloads\Test\"
    newpath = sourcepath & "xlsx\"
    
    'make sure subfolder xlsx was created before

    sDir = Dir$(sourcepath & "*.csv", vbNormal)
    Do Until Len(sDir) = 0
        Workbooks.Open (sourcepath & sDir)
        With ActiveWorkbook.Worksheets(1)
            .UsedRange.TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
            TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=True, _
            Semicolon:=False, Comma:=False, Space:=False, Other:=False
        End With
        With ActiveWorkbook
            .SaveAs Filename:=Replace(Left(.FullName, InStrRev(.FullName, ".")), sourcepath, newpath) & "xlsx", FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False
            .Close
        End With
        
        sDir = Dir$
    Loop
End Sub



回答6:


Replace the line Set wb = Workbooks.Open(Filename:=strDir & strFile, Delimiter:=Chr(9), Local:=False) with

Set wb = Workbooks.OpenText(FileName:=strDir & strFile, DataType:=xlDelimited, Tab:=True)

Make sure it's .txt so you can simply rename the .csv file before calling the "OpenText" method.

This forces the UTF-8 transfer of data: the "ä, ö and Ü", etc. will be moved to Excel as is.



来源:https://stackoverflow.com/questions/64783621/import-tab-separated-csv-tab-delimiter-not-recognized

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!