What is a quick way to force CRLF in C# / .NET?

前端未结

关注

 6  1498

独厮守ぢ

How would you normalize all new-line sequences in a string to one type?

I\'m looking to make them all CRLF for the purpose of email (MIME documents). Ideally this w

相关标签:

6条回答

情书的邮戳

2020-12-08 13:49

It depends on exactly what the requirements are. In particular, how do you want to handle "\r" on its own? Should that count as a line break or not? As an example, how should "a\n\rb" be treated? Is that one very odd line break, one "\n" break and then a rogue "\r", or two separate linebreaks? If "\r" and "\n" can both be linebreaks on their own, why should "\r\n" not be treated as two linebreaks?

Here's some code which I suspect is reasonably efficient.

using System;
using System.Text;

class LineBreaks
{    
    static void Main()
    {
        Test("a\nb");
        Test("a\nb\r\nc");
        Test("a\r\nb\r\nc");
        Test("a\rb\nc");
        Test("a\r");
        Test("a\n");
        Test("a\r\n");
    }

    static void Test(string input)
    {
        string normalized = NormalizeLineBreaks(input);
        string debug = normalized.Replace("\r", "\\r")
                                 .Replace("\n", "\\n");
        Console.WriteLine(debug);
    }

    static string NormalizeLineBreaks(string input)
    {
        // Allow 10% as a rough guess of how much the string may grow.
        // If we're wrong we'll either waste space or have extra copies -
        // it will still work
        StringBuilder builder = new StringBuilder((int) (input.Length * 1.1));

        bool lastWasCR = false;

        foreach (char c in input)
        {
            if (lastWasCR)
            {
                lastWasCR = false;
                if (c == '\n')
                {
                    continue; // Already written \r\n
                }
            }
            switch (c)
            {
                case '\r':
                    builder.Append("\r\n");
                    lastWasCR = true;
                    break;
                case '\n':
                    builder.Append("\r\n");
                    break;
                default:
                    builder.Append(c);
                    break;
            }
        }
        return builder.ToString();
    }
}

0 讨论(0)

感情败类

2020-12-08 13:54
```
input.Replace("\r\n", "\n").Replace("\r", "\n").Replace("\n", "\r\n")
```
This will work if the input contains only one type of line breaks - either CR, or LF, or CR+LF.
0 讨论(0)
发布评论:

提交评论
- 加载中...
甜味超标

2020-12-08 14:03

Environment.NewLine;

A string containing "\r\n" for non-Unix platforms, or a string containing "\n" for Unix platforms.

0 讨论(0)
发布评论:

提交评论
- 加载中...

谎友^

2020-12-08 14:04

Simple variant:

Regex.Replace(input, @"\r\n|\r|\n", "\r\n")

For better performance:

static Regex newline_pattern = new Regex(@"\r\n|\r|\n", RegexOptions.Compiled);
[...]
    newline_pattern.Replace(input, "\r\n");

0 讨论(0)

一生所求

2020-12-08 14:04

This is a quick way to do that, I mean.

It does not use an expensive regex function. It also does not use multiple replacement functions that each individually did loop over the data with several checks, allocations, etc.

So the search is done directly in one for loop. For the number of times that the capacity of the result array has to be increased, a loop is also used within the Array.Copy function. That are all the loops. In some cases, a larger page size might be more efficient.

public static string NormalizeNewLine(this string val)
{
    if (string.IsNullOrEmpty(val))
        return val;

    const int page = 6;
    int a = page;
    int j = 0;
    int len = val.Length;
    char[] res = new char[len];

    for (int i = 0; i < len; i++)
    {
        char ch = val[i];

        if (ch == '\r')
        {
            int ni = i + 1;
            if (ni < len && val[ni] == '\n')
            {
                res[j++] = '\r';
                res[j++] = '\n';
                i++;
            }
            else
            {
                if (a == page) // Ensure capacity
                {
                    char[] nres = new char[res.Length + page];
                    Array.Copy(res, 0, nres, 0, res.Length);
                    res = nres;
                    a = 0;
                }

                res[j++] = '\r';
                res[j++] = '\n';
                a++;
            }
        }
        else if (ch == '\n')
        {
            int ni = i + 1;
            if (ni < len && val[ni] == '\r')
            {
                res[j++] = '\r';
                res[j++] = '\n';
                i++;
            }
            else
            {
                if (a == page) // Ensure capacity
                {
                    char[] nres = new char[res.Length + page];
                    Array.Copy(res, 0, nres, 0, res.Length);
                    res = nres;
                    a = 0;
                }

                res[j++] = '\r';
                res[j++] = '\n';
                a++;
            }
        }
        else
        {
            res[j++] = ch;
        }
    }

    return new string(res, 0, j);
}

I now that '\n\r' is not actually used on basic platforms. But who would use two types of linebreaks in succession to indicate two linebreaks?

If you want to know that, then you need to take a look before to know if the \n and \r both are used separately in the same document.

0 讨论(0)

庸人自扰

2020-12-08 14:08

string nonNormalized = "\r\n\n\r";

string normalized = nonNormalized.Replace("\r", "\n").Replace("\n", "\r\n");

0 讨论(0)