FileSystemObject - Reading Unicode Files

闹比i 2021-01-04 09:52

Classic ASP, VBScript context.

A lot of articles, including this Microsoft one, say you cannot use FileSystemObject to read Unicode files.

I

5 answers
  • 2021-01-04 09:58

    I think MS does not officially state that FSO supports Unicode because:

    1. It does not detect Unicode files using the byte-order mark (BOM) at the start of the file, and
    2. It only supports little-endian UTF-16 Unicode files (and you need to remove the byte-order mark if present).

    Here is some sample code that I have been using successfully (for a few years) to auto-detect and read Unicode files with FSO (assuming they are little-endian and contain the BOM):

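    'Detect Unicode files: open as ANSI and check the first two bytes for the UTF-16LE BOM (FF FE)
    'Note: Read raises an error if the file is shorter than two bytes
    Set Stream = FSO.OpenTextFile(ScriptFolderObject.Path & "\" & FileName, 1, False)  '1 = ForReading
    intAsc1Chr = Asc(Stream.Read(1))
    intAsc2Chr = Asc(Stream.Read(1))
    Stream.Close
    If intAsc1Chr = 255 And intAsc2Chr = 254 Then
        OpenAsUnicode = True
    Else
        OpenAsUnicode = False
    End If
    
    'Get script content; the fourth argument is the format (True = -1 = TristateTrue, i.e. Unicode)
    Set Stream = FSO.OpenTextFile(ScriptFolderObject.Path & "\" & FileName, 1, 0, OpenAsUnicode)
    TextContent = Stream.ReadAll()
    Stream.Close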
    
  • 2021-01-04 09:59

    Yes, that documentation is out of date. The scripting component did go through a set of changes in its early days (some of them were breaking changes if you were using early binding); however, since at least Windows 2000 SP4 and XP SP2 it has been very stable.

    Just be careful what you mean by Unicode. The word is sometimes used more broadly to cover any encoding of Unicode. FSO does not read, for example, UTF-8 encodings of Unicode. For that you would need to fall back on ADODB.Stream.
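
    As a rough illustration of the ADODB.Stream fallback, here is a minimal sketch. The function name ReadUtf8File and the file name in the usage line are hypothetical, and it assumes the file really is UTF-8 and that ADO is installed on the server:

    ```vbscript
    ' Hypothetical helper (not from the original answer): read a whole
    ' UTF-8 text file into a string using ADODB.Stream instead of FSO.
    Function ReadUtf8File(strPath)
        Const adTypeText = 2            ' StreamTypeEnum: text stream
        Dim objStream
        Set objStream = CreateObject("ADODB.Stream")
        objStream.Type = adTypeText
        objStream.Charset = "utf-8"     ' set the charset before loading the file
        objStream.Open
        objStream.LoadFromFile strPath
        ReadUtf8File = objStream.ReadText   ' with no argument, reads to the end
        objStream.Close
        Set objStream = Nothing
    End Function
    ```

    In an ASP page it might be called as TextContent = ReadUtf8File(Server.MapPath("data.txt")), where data.txt is only a placeholder name.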

  • 2021-01-04 10:03

    I am writing a Windows 7 gadget and ran into the same problem. If it is possible, you can just switch your files to another encoding, for example the ANSI encoding "windows-1251". With this encoding it works fine.

    If you are using this to write a site, it would be better to use another development approach that avoids these objects.

  • 2021-01-04 10:12
    'Assume we have detected that it is a Unicode file; a very straightforward
    'byte-by-byte crawl then sorted out my problem:
    '.
    '.
    '.
    else
      eilute = f.ReadAll
      'response.write("&#268;IA BUVO &#268;ARLIS<br/>")
      'response.write(len(eilute))
      'response.write("<br/>")
      elt = ""
      smbl = ""
      for i = 3 to len(eilute)  'The first 2 bytes are the BOM (255 and 254)
        baitas = asc(mid(eilute, i, 1))  'low byte of the UTF-16LE code unit
        if (i + 1) <= len(eilute) then
          i = i + 1
        else
          exit for
        end if
        antras = asc(mid(eilute, i, 1)) * 256  'high byte; enough for letters
        'response.write(baitas)
        'response.write(asc(mid(eilute,i,1)))
        'response.write("<br/>")
        if baitas = 13 and antras = 0 then  'CarriageReturn: end of line
          response.write(elt)
          response.write("<br/>")
          elt = ""
          if (i + 2) <= len(eilute) then i = i + 2  'skip over the LineFeed that follows
        else
          skaicius = antras + baitas
          smbl = "&#" & skaicius & ";"  'emit the character as an HTML numeric entity
          elt = elt & smbl
        end if
      next
      if elt <> "" then  'flush the last line if it had no trailing line break
        response.write(elt)
        response.write("<br/>")
        elt = ""
      end if
    end if
    f.Close
    '.
    '.
    
  • 2021-01-04 10:21

    I'd say if it works, use it ;-)

    I notice the MS article you refer to is from the Windows 2000 (!) scripting guide. Maybe it's obsolete.
