Problem converting ANSI to UTF-8 in C#


借酒劲吻你

I have a problem converting a text file from ANSI to UTF-8 in C#. I am trying to display the results in a browser.

So I have this text file with many accent characte


7 answers

没有蜡笔的小新

2021-01-17 20:22

This is probably happening because your original string already contains invalid characters. Encoding conversion only makes sense if your input is a byte array. So you should read the file as a byte array instead of a string or, as Henk said, specify the encoding when reading the file.
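A minimal sketch of the byte-array route, assuming the file is Windows-1252 ("ANSI") and the code runs on .NET Core or .NET 5+, where legacy code pages have to be registered first. The names `AnsiBytes`, `ToUtf8`, and `ConvertFile` are made up for illustration.

```csharp
using System.IO;
using System.Text;

public static class AnsiBytes
{
    // Re-encodes raw Windows-1252 bytes as UTF-8 bytes.
    public static byte[] ToUtf8(byte[] ansiBytes)
    {
        // Needed on .NET Core / .NET 5+; on .NET Framework code page 1252 is available without it.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding ansi = Encoding.GetEncoding(1252); // assumption: the file is Windows-1252
        return Encoding.Convert(ansi, Encoding.UTF8, ansiBytes);
    }

    // Reads the source file as bytes and writes the UTF-8 result to a second file.
    public static void ConvertFile(string sourcePath, string targetPath)
    {
        byte[] ansiBytes = File.ReadAllBytes(sourcePath);
        File.WriteAllBytes(targetPath, ToUtf8(ansiBytes));
    }
}
```

Because the input never passes through a wrongly decoded string, the single byte 0xE9 ("é" in Windows-1252) comes out as the two UTF-8 bytes 0xC3 0xA9.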

我在风中等你

2021-01-17 20:22

My thought is that when you save the file in Notepad++, it inserts a byte order mark (BOM), so the browser can infer from it that the file is UTF-8. Otherwise you'd probably have to tell the browser the character encoding explicitly, for example in the HTTP Content-Type header, an HTML meta tag, or the XML declaration.
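To check the byte order mark theory, something like the following could be used. It is only a sketch: `BomDemo` and its method names are hypothetical, and it relies on `UTF8Encoding` taking a flag that controls whether the BOM (bytes EF BB BF) is emitted.

```csharp
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes text as UTF-8; withBom controls whether the EF BB BF byte order mark is emitted.
    public static void WriteUtf8(string path, string text, bool withBom)
    {
        File.WriteAllText(path, text, new UTF8Encoding(withBom));
    }

    // Returns true if the file starts with the UTF-8 byte order mark.
    public static bool StartsWithUtf8Bom(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }
}
```

Running `StartsWithUtf8Bom` on the file before and after saving it in Notepad++ would show whether the BOM is what makes the browser pick UTF-8.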

旧巷少年郎

2021-01-17 20:35

Do you have any idea why is this happening?

Yes, you're too late. You need to specify ANSI when you read the string from the file. In memory a string is always Unicode (UTF-16).
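In code, specifying the encoding at read time might look like this. It is a sketch that assumes "ANSI" means Windows-1252 and a .NET Core / .NET 5+ runtime; `AnsiFile`, `ReadAnsi`, and `SaveAsUtf8` are illustrative names, not part of the question.

```csharp
using System.IO;
using System.Text;

public static class AnsiFile
{
    // Decodes the file as Windows-1252 while reading, so accented characters arrive intact in the string.
    public static string ReadAnsi(string path)
    {
        // Needed on .NET Core / .NET 5+; on .NET Framework code page 1252 is available without it.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return File.ReadAllText(path, Encoding.GetEncoding(1252)); // assumption: Windows-1252 input
    }

    // Saves the correctly decoded string as UTF-8; Encoding.UTF8 also writes a byte order mark.
    public static void SaveAsUtf8(string path, string text)
    {
        File.WriteAllText(path, text, Encoding.UTF8);
    }
}
```

Once the string has been decoded correctly, no further "conversion" of the string itself is needed; the target encoding only matters when the text is written out again.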

旧巷少年郎

2021-01-17 20:35
When you convert to ASCII you immediately lose all non-English characters (including accented ones), because ASCII defines only 128 characters (7 bits).
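The loss is easy to reproduce. A small sketch (the class and method names are made up for this example):

```csharp
using System.Text;

public static class AsciiLoss
{
    // Round-trips text through ASCII; characters outside the 7-bit range come back as '?'.
    public static string ThroughAscii(string text)
    {
        byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(asciiBytes);
    }
}
```

For example, `AsciiLoss.ThroughAscii("café")` returns `"caf?"`, and the original character cannot be recovered afterwards.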

Your manipulation is odd. A string in .NET is always UTF-16, so once you return a string rather than a byte[], the conversion has no effect.

I think you should do this (I guess by ANSI you mean Windows-1252, the Western European code page often loosely called Latin1):
```
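// Requires: using System.Text;
// Encodes a .NET string (UTF-16 in memory) as Windows-1252 ("ANSI") bytes.
public byte[] Encode(string text)
{
    return Encoding.GetEncoding(1252).GetBytes(text);
}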
```
Since the question is not very clear, it was reasonably pointed out that you might actually need the reverse:
```
// Requires: using System.Text;
// Decodes Windows-1252 ("ANSI") bytes into a .NET string.
public string Decode(byte[] data)
{
    return Encoding.GetEncoding(1252).GetString(data);
}
```
野趣味

2021-01-17 20:39

I recommend reading this: http://www.joelonsoftware.com/articles/Unicode.html.
If you are going to read an ANSI (extended ASCII) file, you need to know the code page of the file.
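To see why the code page matters, the same byte can be decoded under two different code pages. This is a sketch assuming .NET Core / .NET 5+ (hence the provider registration); `CodePageDemo` is a hypothetical name.

```csharp
using System.Text;

public static class CodePageDemo
{
    // Decodes the same bytes under a caller-chosen code page.
    public static string Decode(byte[] data, int codePage)
    {
        // Needed on .NET Core / .NET 5+ to make legacy code pages such as 1251 and 1252 available.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(data);
    }
}
```

`CodePageDemo.Decode(new byte[] { 0xE9 }, 1252)` yields "é", while the same byte under code page 1251 (Cyrillic) yields "й". The file alone does not say which reading is right.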

栀梦

2021-01-17 20:40

Also, you can try the following: I changed the file's encoding in Notepad++ (Encoding -> Convert to UTF-8).
It works for me.
