Converting Unicode to Windows-1252 for vCards

小蘑菇 2020-12-11 06:35

I am trying to write a program in C# that will split a vCard (VCF) file with multiple contacts into individual files for each contact. I understand that the

1 answer
  •  囚心锁ツ
    2020-12-11 06:46

    You are correct in assuming that Windows-1252 supports the special characters you listed above (for a full list see the Wikipedia entry).

    using (var writer = new StreamWriter(destination, true, Encoding.GetEncoding(1252)))
    {
        writer.WriteLine(source);
    }
    

    In my test app using the code above it produced this result:

    Look at the cool letters I can make: å, æ, and ø!

    No question marks to be found. Are you setting the encoding when you're reading it in with StreamReader?
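
    To illustrate the point about setting the encoding on the read side as well, here is a minimal sketch that decodes a file explicitly as UTF-8 and then writes it out as Windows-1252. The file names and the sample contact line are made up for the illustration. It assumes C# 9 top-level statements; on .NET Core and .NET 5+ code page 1252 is only available after the code pages provider has been registered, while on .NET Framework that registration can be left out.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    // Needed on .NET Core / .NET 5+ so that GetEncoding(1252) succeeds.
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    Encoding win1252 = Encoding.GetEncoding(1252);

    // Hypothetical paths, used only for this illustration.
    string sourcePath = Path.Combine(Path.GetTempPath(), "contact-utf8.vcf");
    string destinationPath = Path.Combine(Path.GetTempPath(), "contact-1252.vcf");

    // Create a small UTF-8 sample: "FN:Søren Åberg" (escapes avoid source-file encoding issues).
    File.WriteAllText(sourcePath, "FN:S\u00F8ren \u00C5berg\r\n", new UTF8Encoding(false));

    // Read: state explicitly that the input is UTF-8.
    string text;
    using (var reader = new StreamReader(sourcePath, Encoding.UTF8))
    {
        text = reader.ReadToEnd();
    }

    // Write: overwrite the destination, encoding the text as Windows-1252.
    using (var writer = new StreamWriter(destinationPath, false, win1252))
    {
        writer.Write(text);
    }

    // Inspect the raw bytes that ended up on disk.
    byte[] written = File.ReadAllBytes(destinationPath);
    Console.WriteLine(BitConverter.ToString(written));
    ```

    If the reader were opened with the wrong encoding instead, the string would already be damaged before the writer ever saw it, and no choice of output encoding could repair that.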

    EDIT: You should just be able to use Encoding.Convert to convert the UTF-8 VCF file into Windows-1252. No need for Regex.Replace. Here is how I would do it:

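    // You might want to think of a better method name.
    // Requires: using System.Text;
    public string ConvertUTF8ToWin1252(string source)
    {
        Encoding utf8 = new UTF8Encoding();
        Encoding win1252 = Encoding.GetEncoding(1252);
    
        // Encode the string as UTF-8 bytes, then transcode those bytes to Windows-1252.
        byte[] input = source.ToUTF8ByteArray();  // Note the use of my extension method
        byte[] output = Encoding.Convert(utf8, win1252, input);
    
        // Decode the Windows-1252 bytes back into a string. Characters with no
        // Windows-1252 mapping have been substituted by the encoder at this point.
        return win1252.GetString(output);
    }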
    

    And here is how my extension method looks:

    public static class StringHelper
    {
        // It should be noted that this method is expecting UTF-8 input only,
        // so you probably should give it a more fitting name.
        public static byte[] ToUTF8ByteArray(this string str)
        {
            Encoding encoding = new UTF8Encoding();
            return encoding.GetBytes(str);
        }
    }
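
    If question marks still show up after converting, it can help to find out which characters have no Windows-1252 mapping at all. The following is only a sketch of one way to check that: TryEncodeWin1252 is a hypothetical helper, not part of the code above, and it assumes C# 9 top-level statements. It asks for code page 1252 with an exception fallback, so the encoder throws instead of silently substituting a character.

    ```csharp
    using System;
    using System.Text;

    // Needed on .NET Core / .NET 5+ so that GetEncoding(1252) succeeds.
    Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

    // Strict variant of Windows-1252: unmappable characters raise an exception.
    Encoding strict1252 = Encoding.GetEncoding(
        1252, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);

    // Hypothetical helper: true when every character of text exists in Windows-1252.
    bool TryEncodeWin1252(string text, out byte[] bytes)
    {
        try
        {
            bytes = strict1252.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = Array.Empty<byte>();
            return false;
        }
    }

    // "å, æ, and ø" written with escapes to avoid source-file encoding issues.
    bool latinOk = TryEncodeWin1252("\u00E5, \u00E6, and \u00F8", out byte[] latinBytes);

    // Three CJK characters, none of which exist in Windows-1252.
    bool cjkOk = TryEncodeWin1252("\u65E5\u672C\u8A9E", out byte[] cjkBytes);

    Console.WriteLine($"Latin sample encodable: {latinOk} ({latinBytes.Length} bytes)");
    Console.WriteLine($"CJK sample encodable: {cjkOk}");
    ```

    Running each contact through a check like this before writing it would show whether the question marks come from characters that genuinely cannot be represented, or from the file being decoded with the wrong encoding earlier on.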
    

    Also, you'll probably want to wrap the streams in your ReadFile and WriteFile methods in using statements, so that they are closed and disposed properly.
