How to properly split a CSV using C# split() function?

旧时难觅i 2020-12-05 19:21

Suppose I have this CSV file :

NAME,ADDRESS,DATE
"Eko S. Wibowo", "Tamanan, Banguntapan, Bantul, DIY", "6/27/1979"

I would like

6 Answers
  • 2020-12-05 19:58

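    I've done this with my own method. It simply counts the number of " and ' characters seen so far and treats a comma as a separator only when that count is even, i.e. when the comma is outside quotes.
    Note that the returned fields keep their surrounding quotes and any leading spaces, and that an apostrophe inside a field (e.g. O'Brien) also counts as a quote. Adapt this to your needs.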

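        // Requires: using System.Collections.Generic;
        public List<string> SplitCsvLine(string s) {
            int start = 0;       // index where the current field begins
            int quoteCount = 0;  // number of quote characters seen so far
            List<string> fields = new List<string>();
            for (int i = 0; i < s.Length; i++) {
                switch (s[i]) {
                    case ',':
                        // An even quote count means we are outside quotes,
                        // so this comma is a real field separator.
                        if ((quoteCount & 1) == 0) {
                            fields.Add(s.Substring(start, i - start));
                            start = i + 1;
                        }
                        break;
                    case '"':
                    case '\'':
                        quoteCount++;
                        break;
                }
            }
            // Add the last field (there is no trailing comma to trigger it).
            fields.Add(s.Substring(start));
            return fields;
        }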
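
    One possible way to "improve this to your needs" is a small state machine that tracks whether it is inside double quotes instead of counting quote characters. The sketch below is not the method above. The names CsvLineSketch, SplitQuoted and Finish are made up for illustration, and it assumes only double quotes delimit fields, that "" inside a quoted field means a literal quote, and that a record never spans multiple lines.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of any library.
public static class CsvLineSketch
{
    // Splits one CSV line into fields and strips the surrounding double quotes.
    public static List<string> SplitQuoted(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;   // true while between an opening and closing quote
        bool wasQuoted = false;  // true once the current field has used quotes

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);           // commas in here are plain data
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"');         // "" is an escaped quote
                    i++;
                }
                else
                {
                    inQuotes = false;            // closing quote
                }
            }
            else if (c == '"')
            {
                inQuotes = true;
                wasQuoted = true;
            }
            else if (c == ',')
            {
                fields.Add(Finish(current, wasQuoted));
                current.Clear();
                wasQuoted = false;
            }
            else if (char.IsWhiteSpace(c) && (current.Length == 0 || wasQuoted))
            {
                // Skip padding before a field or after a closing quote.
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(Finish(current, wasQuoted));
        return fields;
    }

    // Unquoted fields lose trailing padding; quoted fields are kept verbatim.
    private static string Finish(StringBuilder sb, bool wasQuoted)
    {
        return wasQuoted ? sb.ToString() : sb.ToString().TrimEnd();
    }
}
```

    Calling CsvLineSketch.SplitQuoted on the data line from the question should give three fields with the quotes removed and the commas inside the address preserved. It still ignores quoted line breaks, so for real-world files a library remains the safer choice.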
    
  • 2020-12-05 19:58

    It's not an exact answer to your question, but why not use an existing library to manipulate CSV files? A good example is LinqToCsv. CSV can be delimited with various punctuation characters. Moreover, there are gotchas that library authors have already addressed, such as handling the header row, handling different date formats, and mapping rows to C# objects.

  • 2020-12-05 20:01

    You could use regex too:

    string input = "\"Eko S. Wibowo\", \"Tamanan, Banguntapan, Bantul, DIY\", \"6/27/1979\"";
    string pattern = @"""\s*,\s*""";
    
    // input.Substring(1, input.Length - 2) removes the first and last " from the string
    string[] tokens = System.Text.RegularExpressions.Regex.Split(
        input.Substring(1, input.Length - 2), pattern);
    

    This will give you:

    Eko S. Wibowo
    Tamanan, Banguntapan, Bantul, DIY
    6/27/1979
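
    A variation on the same idea, offered only as a sketch: instead of splitting on the separators, match the quoted fields themselves. That avoids the Substring step and does not care about spacing between fields. The names QuotedFieldRegexSketch and Extract are made up for illustration, and the sketch assumes every field is double-quoted (unquoted fields are silently skipped) and that "" inside a field stands for a literal quote.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class QuotedFieldRegexSketch
{
    // One double-quoted field: any run of non-quote characters or "" pairs.
    private static readonly Regex QuotedField =
        new Regex("\"((?:[^\"]|\"\")*)\"");

    public static List<string> Extract(string line)
    {
        var fields = new List<string>();
        foreach (Match m in QuotedField.Matches(line))
        {
            // Group 1 is the text between the quotes; collapse "" to ".
            fields.Add(m.Groups[1].Value.Replace("\"\"", "\""));
        }
        return fields;
    }
}
```

    Passing the input string from above to QuotedFieldRegexSketch.Extract should yield the same three values.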
    
  • 2020-12-05 20:04

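    You can replace each "," separator with ; and then split on ;. This assumes that no field contains a ; and that there are no spaces around the commas. The first and last values also keep their outer quote.

    var values = s.Replace("\",\"", ";").Split(';');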
    
  • 2020-12-05 20:09

    This question is five years old, but there is always somebody new who wants to split a CSV.

    If your data is simple and predictable (i.e. never has any special characters like commas, quotes and newlines) then you can do it with split() or regex.

    But to support all the nuances of the CSV format properly without code soup you should really use a library where all the magic has already been figured out. Don't re-invent the wheel (unless you are doing it for fun of course).

    CsvHelper is simple enough to use:

    https://joshclose.github.io/CsvHelper/2.x/

    using (var parser = new CsvParser(textReader))
    {
        while(true)
        {
            string[] line = parser.Read();
    
            if (line != null)
            {
                // do something
            }
            else
            {
                break;
            }
        }
    }
    

    More discussion / same question: Dealing with commas in a CSV file

  • 2020-12-05 20:14

    If your CSV line is tightly packed (no spaces around the commas), the easiest approach is to strip the first and last quote, as mentioned in an earlier answer, and then do a simple split on the joining string ",":

     string[] tokens = input.Substring(1, input.Length - 2).Split("\",\"");
    

    This will only work if ALL fields are double-quoted, even the ones that don't (officially) need to be. Under those conditions it will be faster than Regex.

    Really useful if your data looks like "Name","1","12/03/2018","Add1,Add2,Add3","other stuff"
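
    Because this shortcut quietly produces wrong results when a line does not meet those conditions, it may be worth guarding it. Below is a minimal sketch with made-up names (TightCsvSketch, SplitFullyQuoted). It checks only the first and last character, so it will not catch an unquoted field in the middle of the line. It uses the Split overload that takes a string array, which is also available on older .NET Framework versions.

```csharp
using System;

// Hypothetical helper, not part of any library.
public static class TightCsvSketch
{
    // Splits a line of the form "a","b","c" with no spaces around the commas.
    public static string[] SplitFullyQuoted(string line)
    {
        // Cheap precondition check: the line must start and end with a quote.
        if (line.Length < 2 || line[0] != '"' || line[line.Length - 1] != '"')
        {
            throw new FormatException("Expected every field to be double-quoted.");
        }

        // Drop the outer quotes, then split on the "," joining string.
        string inner = line.Substring(1, line.Length - 2);
        return inner.Split(new[] { "\",\"" }, StringSplitOptions.None);
    }
}
```

    With the sample line above, SplitFullyQuoted should return five values, with Add1,Add2,Add3 kept together as one field.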
