I'm trying to convert some strings that are in French Canadian and basically, I'd like to be able to take out the French accent marks in the letters while keeping the letter.
What this person said:
Encoding.ASCII.GetString(Encoding.GetEncoding(1251).GetBytes(text));
It actually splits the likes of å, which is one character (character code 00E5, not 0061 plus the modifier 030A, which would look the same), into a plus some kind of modifier, and then the ASCII conversion removes the modifier, leaving only the a.
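To make the difference between the single character (00E5) and the letter-plus-modifier form (0061 + 030A) concrete, here is a minimal sketch using Unicode normalization. Note that this is not the code-page trick quoted above but a different, explicit technique: decompose the string with `string.Normalize(NormalizationForm.FormD)`, then drop the combining marks. The class and method names (`DiacriticsSketch`, `RemoveDiacritics`) are made up for illustration.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DiacriticsSketch
{
    // Hypothetical helper, not part of the framework.
    public static string RemoveDiacritics(string text)
    {
        // FormD splits precomposed characters, e.g. U+00E5 becomes U+0061 + U+030A.
        string decomposed = text.Normalize(NormalizationForm.FormD);

        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Keep everything except the combining (non-spacing) marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Recompose whatever is left into the usual precomposed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

With this, `DiacriticsSketch.RemoveDiacritics("crème brûlée")` gives `"creme brulee"`. Letters that have no decomposition, such as ø, pass through unchanged, so this only covers accents that Unicode treats as separate combining marks.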
The Greek (ISO) code page can do it.
Information about this code page is available from System.Text.Encoding.GetEncodings(). See: https://msdn.microsoft.com/pt-br/library/system.text.encodinginfo.getencoding(v=vs.110).aspx
Greek (ISO) has codepage 28597 and name iso-8859-7.
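If you want to check this yourself, here is a rough sketch of looking a code page up through `Encoding.GetEncodings()`. The class and method names (`EncodingLookupSketch`, `Describe`) are hypothetical. Which encodings are listed depends on the runtime: on newer .NET versions, code page encodings like this one may only show up after an encoding provider has been registered, so the helper returns null when nothing matches.

```csharp
using System;
using System.Text;

public static class EncodingLookupSketch
{
    // Hypothetical helper: returns "codepage name (display name)",
    // or null if the runtime does not list that code page.
    public static string Describe(int codePage)
    {
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            if (info.CodePage == codePage)
            {
                return info.CodePage + " " + info.Name + " (" + info.DisplayName + ")";
            }
        }
        return null;
    }
}
```

Calling `EncodingLookupSketch.Describe(28597)` should report the iso-8859-7 name mentioned above wherever that code page is available.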
On to the code... \o/
string text = "Você está numa situação lamentável";
string textEncode = System.Web.HttpUtility.UrlEncode(text, Encoding.GetEncoding("iso-8859-7"));
//result: "Voce+esta+numa+situacao+lamentavel"
string textDecode = System.Web.HttpUtility.UrlDecode(textEncode);
//result: "Voce esta numa situacao lamentavel"
So, write this function...
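// Requires: using System.Text; and a reference to System.Web.
public string RemoveAcentuation(string text)
{
    // UrlEncode with iso-8859-7 yields the unaccented letters (see the result above);
    // UrlDecode then turns the '+' signs back into spaces.
    return System.Web.HttpUtility.UrlDecode(
        System.Web.HttpUtility.UrlEncode(
            text, Encoding.GetEncoding("iso-8859-7")));
}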
Note that Encoding.GetEncoding("iso-8859-7") is equivalent to Encoding.GetEncoding(28597): the first overload takes the encoding's name, and the second takes its code page.