I have a script (VBS or Ruby) that saves a Word document as 'Filtered HTML', but the encoding parameter is ignored. The HTML file is always encoded in Windows-1252. I'm u
Word can't do this as far as I know.
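However, you could add the following lines to the end of your Ruby script to re-encode the file after Word has written it. Passing the source encoding to File.read matters: without it, Ruby tags the text with its default external encoding, and encode('UTF-8') may do nothing.
text_as_utf8 = File.read('C:\whatever.html', encoding: 'Windows-1252').encode('UTF-8')
File.open('C:\whatever.html','wb') {|f| f.print text_as_utf8}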
If you have an older version of Ruby (1.8), you may need to use Iconv instead. If 'C:\whatever.html' contains special characters, you'll want to look into the invalid/undefined replacement options of String#encode.
You'll also probably want to update the charset in the HTML meta tag:
text_as_utf8.gsub!('charset=windows-1252', 'charset=UTF-8')
before you write to the file.
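Putting the steps above together, here is a minimal sketch of the whole conversion. The helper name convert_word_html_to_utf8 is made up for illustration, and it assumes that Word really did write the file as Windows-1252 and that '?' is an acceptable replacement for bytes Windows-1252 does not define.

```ruby
# Hypothetical helper (not part of Word or Ruby): re-encode a Word
# "Filtered HTML" export from Windows-1252 to UTF-8, in place.
def convert_word_html_to_utf8(path)
  # Read the raw bytes and tag them as Windows-1252 (no transcoding yet).
  html = File.binread(path).force_encoding('Windows-1252')

  # Transcode to UTF-8; bytes that Windows-1252 leaves undefined become '?'
  # instead of raising an exception.
  utf8 = html.encode('UTF-8', invalid: :replace, undef: :replace, replace: '?')

  # Keep the charset declared in the meta tag in sync with the new encoding.
  utf8 = utf8.gsub(/charset=windows-1252/i, 'charset=UTF-8')

  # Binary write: the UTF-8 bytes go to disk as-is, with no newline translation.
  File.binwrite(path, utf8)
  utf8
end
```

Calling convert_word_html_to_utf8('C:\whatever.html') once Word has saved and closed the document should rewrite the file in place. Reading in binary mode and tagging the bytes explicitly avoids depending on Ruby's default external encoding, which differs between versions and platforms.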