windows-1252

Python - dealing with mixed-encoding files

I have a file which is mostly UTF-8, but some Windows-1252 characters have also found their way in. I created a table to map from the Windows-1252 (cp1252) characters to their Unicode counterparts, and would like to use it to fix the mis-encoded characters, e.g.

    cp1252_to_unicode = {
        "\x85": u'\u2026',  # …
        "\x91": u'\u2018',  # ‘
        "\x92": u'\u2019',  # ’
        "\x93": u'\u201c',  # “
        "\x94": u'\u201d',  # ”
        "\x97": u'\u2014'   # —
    }

    for l in open('file.txt'):
        for c, u in cp1252_to_unicode.items():
            l = l.replace(c, u)

But attempting to do the replace this way results in a UnicodeDecodeError being raised, e
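One possible approach, sketched here for Python 3 rather than the Python 2 style of the snippet in the question, is to work on the raw bytes: decode as UTF-8 and fall back to cp1252 only for the bytes that are not valid UTF-8. The handler name `cp1252_fallback` and the helper `decode_mixed` are illustrative names, not part of the original question, and this is a sketch under those assumptions rather than the accepted answer.

```python
import codecs


def cp1252_fallback(err):
    """Error handler: decode bytes that are invalid UTF-8 as cp1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # A few byte values (e.g. 0x81) are undefined in cp1252;
    # errors="replace" turns those into U+FFFD instead of raising.
    return bad.decode("cp1252", errors="replace"), err.end


# Register the handler under a hypothetical name chosen for this sketch.
codecs.register_error("cp1252_fallback", cp1252_fallback)


def decode_mixed(raw: bytes) -> str:
    """Decode mostly-UTF-8 bytes, treating stray bytes as cp1252."""
    return raw.decode("utf-8", errors="cp1252_fallback")
```

To use it on a file, read the file in binary mode (`open('file.txt', 'rb')`) and pass the bytes to `decode_mixed`. Text that is already valid UTF-8 passes through unchanged, so only the stray cp1252 bytes are remapped.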

Windows-1252 to UTF-8 encoding

I've copied certain files from a Windows machine to a Linux machine. So all the Windows-encoded (windows-1252) files need to be converted to UTF-8. The files which are already in UTF-8 should not be changed. I'm planning to use the recode utility for that. How can I specify that the recode utility should only convert windows-1252 encoded files and not the UTF-8 files?

Example usage of recode:

    recode windows-1252.. myfile.txt

This would convert myfile.txt from windows-1252 to UTF-8. Before doing this, I would like to know that myfile.txt is actually windows-1252 encoded and not UTF-8 encoded.
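One common heuristic for the check described above is to test whether the file's bytes decode cleanly as UTF-8: windows-1252 text containing non-ASCII characters rarely forms valid UTF-8 sequences. The sketch below illustrates that idea in Python instead of recode; the function names `is_valid_utf8` and `to_utf8` are hypothetical, and a pure-ASCII file is valid in both encodings, so it is simply left unchanged.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_utf8(raw: bytes) -> bytes:
    """Leave valid UTF-8 untouched; otherwise assume cp1252 and convert."""
    if is_valid_utf8(raw):
        return raw
    # Assumption: anything that is not valid UTF-8 is windows-1252.
    # Bytes undefined in cp1252 become U+FFFD rather than raising.
    return raw.decode("cp1252", errors="replace").encode("utf-8")
```

A conversion script could read each file in binary mode, pass the contents through `to_utf8`, and write the result back only when it differs from the input.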

How to read a file in Java with specific character encoding?

Question: I am trying to read a file in as either UTF-8 or Windows-1252 depending on the output of this method:

    public Charset getCorrectCharsetToApply() {
        // Returns a Charset for either UTF-8 or Windows-1252.
    }

So far, I have:

    String fileName = getFileNameToReadFromUserInput();
    InputStream is = new ByteArrayInputStream(fileName.getBytes());
    InputStreamReader isr = new InputStreamReader(is, getCorrectCharsetToApply());
    BufferedReader buffReader = new BufferedReader(isr);

The problem I'm having is

.NET Core doesn't know about Windows 1252, how to fix?

Question: This program works just fine when compiled for .NET 4 but does not when compiled for .NET Core. I understand the error about the encoding not being supported, but not how to fix it.

    Public Class Program
        Public Shared Function Main(ByVal args As String()) As Integer
            System.Text.Encoding.GetEncoding(1252)
        End Function
    End Class

Answer 1: To do this, you need to register the CodePagesEncodingProvider instance from the System.Text.Encoding.CodePages package. To do that, install the System.Text.Encoding.CodePages package:

    dotnet add package System.Text.Encoding.CodePages

Then (after implicitly or explicitly running dotnet
