I have a script that I put together after going over many different ways that I could do an encoding conversion using ADODB in VBScript.
Option Explicit
Sub UTFConvert()
Dim objFSO, objStream, file
file = "FileToConvert.csv"
Set objStream = CreateObject( "ADODB.Stream" )
objStream.Open
objStream.Type = 2
objStream.Position = 0
objStream.Charset = "utf-8"
objStream.LoadFromFile file
objStream.SaveToFile file, 2
objStream.Close
Set objStream = Nothing
End Sub
UTFConvert
The file is supposed to be converted from UCS-2 Little Endian, or whichever readable format it is in (within limitations), to UTF-8. The issue however is that once this file has finished converting to UTF-8 there are many NUL
symbols throughout the entire file before and after every letter, and xFF
xFE
(UCS-2 LE BOM) at the start of the file. These are visible without needing to use any symbol visualization toggles. Any help would be appreciated in understanding where I may be limited with this conversion. Or any alternative approach I can take.
Your Stream
object is loading the file as an UTF-8 encoded file, thus misinterpreting the byte sequences. Read the file using a FileSystemObject
instance and write it with the ADODB.Stream
object:
Sub UTFConvert(filename)
Set fso = CreateObject("Scripting.FileSystemObject")
txt = fso.OpenTextFile(filename, 1, False, -1).ReadAll
Set stream = CreateObject("ADODB.Stream")
stream.Open
stream.Type = 2 'text
stream.Position = 0
stream.Charset = "utf-8"
stream.WriteText txt
stream.SaveToFile filename, 2
stream.Close
End Sub
来源:https://stackoverflow.com/questions/32343659/ucs-2-little-endian-to-utf-8-conversion-leaves-file-with-many-unwanted-character