Word Document.SaveAs ignores encoding, when calling through OLE, from Ruby or VBS

挽巷 2021-01-14 14:25

I have a script, VBS or Ruby, that saves a Word document as 'Filtered HTML', but the encoding parameter is ignored. The HTML file is always encoded in Windows-1252. I'm u

3 Answers
  •  伪装坚强ぢ
    2021-01-14 14:58

    Word can't do this as far as I know.

    However, you could add the following lines to the end of your Ruby script:

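    # Read the file as Windows-1252 (what Word wrote), then transcode to UTF-8.
    text_as_utf8 = File.read('C:\whatever.html', encoding: 'Windows-1252').encode('UTF-8')
    File.open('C:\whatever.html', 'wb') { |f| f.print text_as_utf8 }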
    

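    Ruby versions older than 1.9 have no String#encode, so there you would need Iconv instead. If 'C:\whatever.html' contains characters that cannot be converted, look into the invalid:, undef: and replace: options of String#encode, which substitute a replacement character instead of raising an error.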
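The invalid/undefined replacement options mentioned above are keyword arguments of String#encode. The following is a minimal sketch of how they behave; the sample bytes are made up for illustration and do not come from a real Word export.

```ruby
# Illustrative bytes only: 0xE9 is "é" in Windows-1252, while 0x81 has no
# assigned character there, so a plain encode('UTF-8') may raise on it.
raw = "caf\xE9 \x81".b.force_encoding('Windows-1252')

# Replace anything that cannot be converted with "?" instead of raising.
utf8 = raw.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')

puts utf8.encoding
puts utf8.valid_encoding?
```

Leaving out replace: makes Ruby fall back to its default replacement character, which is U+FFFD when the target is UTF-8.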

    You'll also probably want to update the charset in the HTML meta tag:

    text_as_utf8.gsub!('charset=windows-1252', 'charset=UTF-8')
    

    before you write to the file.
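Putting the steps above together, the whole post-processing pass might look like the sketch below. The method name convert_html_to_utf8 is hypothetical, and the sketch assumes the file Word produced really is Windows-1252. It reads and writes in binary mode so that line endings are left untouched.

```ruby
# Hypothetical helper: re-encode a Word "Filtered HTML" file from
# Windows-1252 to UTF-8 in place and update its charset declaration.
def convert_html_to_utf8(path)
  # Read the raw bytes and tag them with the encoding Word used.
  text = File.binread(path).force_encoding('Windows-1252')

  # Transcode; unconvertible bytes become the default replacement character.
  utf8 = text.encode('UTF-8', invalid: :replace, undef: :replace)

  # Update the meta tag so browsers do not misread the file.
  # The match is case-insensitive in case the charset is capitalized differently.
  utf8 = utf8.gsub(/charset=windows-1252/i, 'charset=UTF-8')

  File.binwrite(path, utf8)
  utf8
end

# Example call (path taken from the answer above):
#   convert_html_to_utf8('C:\whatever.html')
```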
