VBA : save a file with UTF-8 without BOM

南旧 2020-11-29 07:11

It's probably something simple; here is what I tried:

 Set objStream = CreateObject("ADODB.Stream")
 Set objStreamNoBOM = CreateObject("ADODB.Stream")

 Wit         


        
2 Answers
  • 2020-11-29 07:22

    In the best of all possible worlds the Related list would contain a reference to this question which I found as the first hit for "vbscript adodb.stream bom vbscript site:stackoverflow.com".

    Based on the second strategy from boost's answer:

    Option Explicit
    
    Const adSaveCreateNotExist = 1
    Const adSaveCreateOverWrite = 2
    Const adTypeBinary = 1
    Const adTypeText   = 2
    
    Dim objStreamUTF8      : Set objStreamUTF8      = CreateObject("ADODB.Stream")
    Dim objStreamUTF8NoBOM : Set objStreamUTF8NoBOM = CreateObject("ADODB.Stream")
    
    With objStreamUTF8
      .Charset = "UTF-8"
      .Open
      .WriteText "aÄö"
      .Position = 0
      .SaveToFile "toto.php", adSaveCreateOverWrite   ' reference copy, with BOM
      .Type     = adTypeText
      .Position = 3                                    ' skip the 3-byte UTF-8 BOM
    End With
    
    With objStreamUTF8NoBOM
      .Type    = adTypeBinary
      .Open
      objStreamUTF8.CopyTo objStreamUTF8NoBOM          ' copies from position 3 to the end
      .SaveToFile "toto-nobom.php", adSaveCreateOverWrite
    End With
    
    objStreamUTF8.Close
    objStreamUTF8NoBOM.Close
    

    Evidence:

    chcp
    Active code page: 65001
    
    dir
     ...
    15.07.2015  18:48                 5 toto-nobom.php
    15.07.2015  18:48                 8 toto.php
    
    type toto-nobom.php
    aÄö
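
    The same strategy could be wrapped for reuse from VBA, with a check that reads the first three bytes back instead of relying on the file size. This is only a minimal sketch: the names `SaveUtf8NoBom` and `HasUtf8Bom` are hypothetical (not part of ADODB), and late binding via `CreateObject` is assumed.

    ```vba
    Option Explicit

    Private Const adTypeBinary As Long = 1
    Private Const adTypeText As Long = 2
    Private Const adSaveCreateOverWrite As Long = 2

    ' Hypothetical helper: encodes content as UTF-8 in a text stream, then
    ' copies everything after the 3-byte BOM (EF BB BF) into a binary stream.
    Public Sub SaveUtf8NoBom(ByVal path As String, ByVal content As String)
        Dim src As Object, dst As Object
        Set src = CreateObject("ADODB.Stream")
        Set dst = CreateObject("ADODB.Stream")

        src.Type = adTypeText
        src.Charset = "UTF-8"
        src.Open
        src.WriteText content
        src.Position = 3                 ' skip the BOM

        dst.Type = adTypeBinary
        dst.Open
        src.CopyTo dst                   ' copies from position 3 to the end
        dst.SaveToFile path, adSaveCreateOverWrite

        src.Close
        dst.Close
    End Sub

    ' Hypothetical check: True when the file starts with EF BB BF.
    Public Function HasUtf8Bom(ByVal path As String) As Boolean
        Dim stm As Object, head() As Byte
        Set stm = CreateObject("ADODB.Stream")
        stm.Type = adTypeBinary
        stm.Open
        stm.LoadFromFile path
        If stm.Size >= 3 Then
            head = stm.Read(3)
            HasUtf8Bom = (head(0) = &HEF And head(1) = &HBB And head(2) = &HBF)
        End If
        stm.Close
    End Function
    ```

    Calling `SaveUtf8NoBom "toto-nobom.php", "aÄö"` and then `Debug.Print HasUtf8Bom("toto-nobom.php")` in the Immediate window is one way to repeat the check above without leaving the VBA editor.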
    
  • 2020-11-29 07:37

    I knew that the Scripting File System Object's stream inserted a Byte Order Mark, but I haven't seen that with the ADODB Stream.

    Or at least, not yet: I rarely use the ADODB stream object...

    But I do remember putting this remark into some code a few years ago:

    '   ****   WHY THIS IS COMMENTED OUT   **** **** **** **** **** **** **** ****
    '
    '   Microsoft ODBC and OLEDB database drivers cannot read the field names from
    '   the header when a unicode byte order mark (&HFF & &HFE) is inserted at the
    '   start of the text by Scripting.FileSystemObject 'Write' methods. Trying to
    '   work around this by writing byte arrays will fail; FSO 'Write' detects the
    '   string encoding automatically, and won't let you hack around it by writing
    '   the header as UTF-8 (or 'Narrow' string) and appending the rest as unicode
    '
    '   (Yes, I tried some revolting hacks to get around it: don't *ever* do that)
    '
    '   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
    '
    '    With FSO.OpenTextFile(FilePath, ForWriting, True, TristateTrue)
    '        .Write Join(arrTemp1, EOROW)
    '        .Close
    '    End With ' textstream object from objFSO.OpenTextFile
    '
    '   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
    

    You can tell I was having a bad day.
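
    The byte order mark that comment complains about can be observed directly. The following is a minimal sketch, assuming late binding; `FsoUnicodeFilePrefix` is a hypothetical name, and the `"id,name"` header text is only a placeholder.

    ```vba
    Option Explicit

    ' Sketch: write a short string through Scripting.FileSystemObject in
    ' Unicode mode, then read the first two bytes back with binary file I/O.
    Public Function FsoUnicodeFilePrefix(ByVal path As String) As String
        Const ForWriting As Long = 2
        Const TristateTrue As Long = -1  ' open the text stream as Unicode

        Dim fso As Object, head(0 To 1) As Byte, hnd As Integer
        Set fso = CreateObject("Scripting.FileSystemObject")

        With fso.OpenTextFile(path, ForWriting, True, TristateTrue)
            .Write "id,name"             ' placeholder header row
            .Close
        End With

        hnd = FreeFile
        Open path For Binary Access Read As #hnd
        Get #hnd, , head                 ' reads exactly two bytes
        Close #hnd

        ' Format the two bytes as hex, e.g. "FF FE" for a UTF-16LE BOM.
        FsoUnicodeFilePrefix = Right$("0" & Hex$(head(0)), 2) & " " & _
                               Right$("0" & Hex$(head(1)), 2)
    End Function
    ```

    If the returned prefix is `FF FE`, the file starts with the UTF-16 little-endian BOM that trips up the drivers mentioned in the comment.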

    Next, using prehistoric PUT commands from the days before file-handling had emerged from the primordial C:

    '   **** WHY WE 'PUT' A BYTE ARRAY INSTEAD OF A VBA STRING VARIABLE  **** ****
    '
    '       Put #hndFile, , StrConv(Join(arrTemp1, EOROW), vbUnicode)
    '       Put #hndFile, , Join(arrTemp1, EOROW)
    '
    '   If you pass unicode, Wide or UTF-16 string variables to PUT, it prepends a
    '   Unicode Byte Order Mark to the data which, when written to your file, will
    '   render the field names illegible to Microsoft's JET ODBC and ACE-OLEDB SQL
    '   drivers (which can actually read unicode field names, if the helpful label
    '   isn't in the way). However, the 'PUT' statement writes a Byte array as-is.
    '
    '   **** **** **** **** **** **** **** **** **** **** **** **** **** **** ****
    

    So there's the code that actually does it:

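    Dim arrByte() As Byte
    Dim strText   As String
    Dim hndFile   As Integer    ' FreeFile returns an Integer file number
    
    
        strText = "Y'all knew that strings are actually byte arrays?"
        arrByte = strText       ' copies the string's bytes (UTF-16LE in VBA), no BOM
    
        hndFile = FreeFile
        Open FilePath For Binary As #hndFile    ' FilePath: target path, defined elsewhere
    
        Put #hndFile, , arrByte ' a Byte array is written as-is
        Close #hndFile
    
        Erase arrByte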
    

    I'm assuming that strText is actually UTF-8. I mean, we're in VBA, in Microsoft Office, and we absolutely know that this is always going to be UTF-8, even if we use it in a foreign country...

    ...Right?
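
    In practice, assigning a VBA String to a Byte array copies the string's internal UTF-16LE representation, two bytes per character, so the result is a BOM-free UTF-16 file rather than UTF-8. One way to combine both approaches is to let ADODB.Stream do the UTF-8 encoding and then `Put` the resulting bytes. This is a minimal sketch under stated assumptions: `Utf8BytesNoBom` and `PutUtf8NoBom` are hypothetical names, and the content is assumed to be non-empty.

    ```vba
    Option Explicit

    ' Hypothetical helper: returns the UTF-8 encoding of content without a BOM.
    ' Assumes content is non-empty (Read returns Null at the end of a stream).
    Public Function Utf8BytesNoBom(ByVal content As String) As Byte()
        Const adTypeBinary As Long = 1
        Const adTypeText As Long = 2

        With CreateObject("ADODB.Stream")
            .Type = adTypeText
            .Charset = "UTF-8"
            .Open
            .WriteText content
            .Position = 0                ' Type can only be changed at position 0
            .Type = adTypeBinary
            .Position = 3                ' skip the BOM bytes EF BB BF
            Utf8BytesNoBom = .Read       ' everything after the BOM
            .Close
        End With
    End Function

    ' Hypothetical helper: writes the UTF-8 bytes with the Put statement.
    Public Sub PutUtf8NoBom(ByVal path As String, ByVal content As String)
        Dim data() As Byte, hnd As Integer
        data = Utf8BytesNoBom(content)

        ' Open ... For Binary does not truncate an existing file, so a longer
        ' old file would keep its tail; delete it first.
        If Len(Dir$(path)) > 0 Then Kill path

        hnd = FreeFile
        Open path For Binary Access Write As #hnd
        Put #hnd, , data                 ' Byte arrays are written as-is
        Close #hnd
    End Sub
    ```

    Deleting the target before opening it is a design choice of this sketch; the snippet above it would leave stale bytes behind when overwriting a longer file.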
