Problem converting ANSI to UTF-8 in C#

借酒劲吻你 2021-01-17 20:08

I have a problem converting a text file from ANSI to UTF-8 in C#. I am trying to display the results in a browser.

So I have this text file with many accent characte

7 answers
  • 2021-01-17 20:22

    This is probably happening because your original string already contains invalid characters. Encoding conversion only makes sense if your input is a byte array. So you should read the file as a byte array instead of a string or, as Henk said, specify the encoding when reading the file.
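
    A minimal sketch of the byte-array route, assuming the source bytes are Windows-1252 (the thread never states the exact code page). The class and method names are made up for illustration, and the RegisterProvider call assumes .NET Core or .NET 5+, where code page 1252 is not available by default:

    ```csharp
    using System;
    using System.Text;

    class AnsiToUtf8
    {
        // Converts raw "ANSI" bytes straight to UTF-8 bytes.
        // Assumption: the input is Windows-1252 (code page 1252).
        public static byte[] ConvertAnsiToUtf8(byte[] ansiBytes)
        {
            // Makes code page encodings such as 1252 resolvable on .NET Core / .NET 5+.
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            Encoding ansi = Encoding.GetEncoding(1252);
            return Encoding.Convert(ansi, Encoding.UTF8, ansiBytes);
        }

        static void Main()
        {
            // "café" in Windows-1252; in practice these bytes would come
            // from File.ReadAllBytes on the input file.
            byte[] ansiBytes = { 0x63, 0x61, 0x66, 0xE9 };
            byte[] utf8Bytes = ConvertAnsiToUtf8(ansiBytes);
            Console.WriteLine(BitConverter.ToString(utf8Bytes));
            // prints "63-61-66-C3-A9"
        }
    }
    ```

    The single byte 0xE9 becomes the two-byte sequence C3 A9, which is what a browser expects when the page is declared as UTF-8.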

  • 2021-01-17 20:22

    My thought is that when you save the file in Notepad++, it inserts a byte order mark (BOM), and the browser can infer from it that the file is UTF-8. Otherwise you would probably have to tell the browser the character encoding explicitly, for example in the DTD, in XML, etc.
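
    To illustrate what the BOM looks like, here is a small sketch that prefixes UTF-8 output with it from C#. The class and method names are hypothetical; only UTF8Encoding, GetPreamble and GetBytes are framework calls:

    ```csharp
    using System;
    using System.Text;

    class BomDemo
    {
        // Returns the UTF-8 bytes of the text, prefixed with the UTF-8 BOM.
        public static byte[] EncodeWithBom(string text)
        {
            var utf8WithBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
            byte[] bom = utf8WithBom.GetPreamble();   // EF BB BF
            byte[] body = utf8WithBom.GetBytes(text);

            byte[] result = new byte[bom.Length + body.Length];
            Buffer.BlockCopy(bom, 0, result, 0, bom.Length);
            Buffer.BlockCopy(body, 0, result, bom.Length, body.Length);
            return result;
        }

        static void Main()
        {
            // "\u00E9" is the letter e with an acute accent.
            Console.WriteLine(BitConverter.ToString(EncodeWithBom("\u00E9")));
            // prints "EF-BB-BF-C3-A9"
        }
    }
    ```

    When writing a file, passing the same UTF8Encoding instance to File.WriteAllText should emit the preamble for you, so the manual copy above is only there to make the bytes visible.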

  • 2021-01-17 20:35

    Do you have any idea why is this happening?

    Yes, you're too late. You need to specify the ANSI encoding when you read the string from the file. In memory a string is always Unicode (UTF-16).
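
    A sketch of specifying the encoding at read time, assuming "ANSI" here means Windows-1252 and that the code runs on .NET Core or .NET 5+ (hence the RegisterProvider call). The class name, method name and temporary files are illustrative only:

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    class ReadAnsiWriteUtf8
    {
        // Reads sourcePath as Windows-1252 (an assumption) and rewrites
        // the text to targetPath as UTF-8 without a BOM.
        public static void Transcode(string sourcePath, string targetPath)
        {
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            Encoding ansi = Encoding.GetEncoding(1252);

            // The decoding happens here, so this is where the encoding matters.
            string text = File.ReadAllText(sourcePath, ansi);
            File.WriteAllText(targetPath, text, new UTF8Encoding(false));
        }

        static void Main()
        {
            // Build a sample ANSI file so the sketch is self-contained.
            string ansiPath = Path.GetTempFileName();
            string utf8Path = Path.GetTempFileName();
            File.WriteAllBytes(ansiPath, new byte[] { 0x63, 0x61, 0x66, 0xE9 }); // "café"

            Transcode(ansiPath, utf8Path);
            Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(utf8Path)));
            // prints "63-61-66-C3-A9"
        }
    }
    ```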

  • 2021-01-17 20:35

    When you convert to ASCII you immediately lose all non-English characters (including accented ones) because ASCII defines only 128 characters (7 bits).
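
    The loss is easy to reproduce. In this small sketch (the class and method names are illustrative) the accented letter is replaced by a question mark as soon as it passes through Encoding.ASCII:

    ```csharp
    using System;
    using System.Text;

    class AsciiLossDemo
    {
        // Encodes the text as ASCII and decodes it again.
        // Characters outside the 7-bit range are replaced with '?'.
        public static string RoundTripThroughAscii(string text)
        {
            byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
            return Encoding.ASCII.GetString(asciiBytes);
        }

        static void Main()
        {
            // "caf\u00E9" is "café" written with an escape so the source
            // file's own encoding cannot interfere.
            Console.WriteLine(RoundTripThroughAscii("caf\u00E9"));
            // prints "caf?"
        }
    }
    ```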

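    You are doing a strange manipulation. A string in .NET is always UTF-16, so once you return a string rather than a byte[], the encoding you converted to no longer matters.

    I think you should do this (I assume that by ANSI you mean the Western European code page, Windows-1252, which is close to Latin1):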

    // Requires: using System.Text;
    // Encodes a .NET (UTF-16) string as Windows-1252 ("ANSI") bytes.
    public byte[] Encode(string text)
    {
        return Encoding.GetEncoding(1252).GetBytes(text);
    }
    

    Since the question was not very clear, it was reasonably pointed out that you might actually need the reverse operation:

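    // Requires: using System.Text;
    // Decodes Windows-1252 ("ANSI") bytes into a .NET (UTF-16) string.
    public string Decode(byte[] data)
    {
        return Encoding.GetEncoding(1252).GetString(data);
    }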
    
  • 2021-01-17 20:39

    I would recommend reading this: http://www.joelonsoftware.com/articles/Unicode.html.
    If you are going to read a non-Unicode ("ANSI") text file, you need to know the code page of the file.
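
    A short sketch of why the code page matters: the same byte decodes to different characters depending on which one is assumed. Code pages 1252 (Western European) and 1251 (Cyrillic) are picked here purely as examples, the names are hypothetical, and the RegisterProvider call assumes .NET Core or .NET 5+:

    ```csharp
    using System;
    using System.Text;

    class CodePageDemo
    {
        // Decodes the bytes using the given Windows code page.
        public static string Decode(byte[] data, int codePage)
        {
            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
            return Encoding.GetEncoding(codePage).GetString(data);
        }

        static void Main()
        {
            byte[] data = { 0xE9 };
            string western = Decode(data, 1252);   // U+00E9, e with acute accent
            string cyrillic = Decode(data, 1251);  // U+0439, Cyrillic short i
            Console.WriteLine("{0:X4} {1:X4}", (int)western[0], (int)cyrillic[0]);
            // prints "00E9 0439"
        }
    }
    ```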

  • 2021-01-17 20:40

    Also, you can try the following. I changed the file's encoding using Notepad++
    (Encoding -> Convert to UTF-8).
    It works for me.
