Python regex to insert missing commas

刺人心 2021-01-15 13:22

I need to ensure that a string has comma-separated values. The strings I read may have space-separated values.

  • Some commas might be missing in my input string
2 answers
  • 2021-01-15 13:28

    PyParsing will definitely not be the fastest way to run this, but it is perhaps the fastest way to write it ;-)

    from pyparsing import (OneOrMore, Suppress, Word, alphanums,
                           dblQuotedString, lineEnd, sglQuotedString)
    
    # A value is either a quoted string (kept verbatim) or a bare token.
    STRING = sglQuotedString | dblQuotedString
    NONSTRING = Word(alphanums + '.-')
    # Existing commas are matched but dropped from the parse results.
    line = OneOrMore(STRING | NONSTRING | Suppress(',')) + lineEnd
    
    
    def insert_commas(s):
        values = line.parseString(s).asList()
        return ", ".join(values)
    
    
    s1 = """1, ' unchanged 1' " unchanged  2 "  2, 2"""
    s2 = """1, ' unchanged 1', " unchanged 2 " ,  2, 2"""
    s3 = """ 1, ' unchanged 1' " unchanged 2 " 2, 2 45"""
    s4 = """1, 67.90e-34 67.90E-34 7.9093339333 2, 2 """
    
    print(insert_commas(s1))
    print(insert_commas(s2))
    print(insert_commas(s3))
    print(insert_commas(s4))
    

    which prints

    1, ' unchanged 1', " unchanged  2 ", 2, 2
    1, ' unchanged 1', " unchanged 2 ", 2, 2
    1, ' unchanged 1', " unchanged 2 ", 2, 2, 45
    1, 67.90e-34, 67.90E-34, 7.9093339333, 2, 2
    
  • 2021-01-15 13:47

    Maybe it would be easier to use findall, str.join and str.strip, first finding the quoted strings and then all remaining runs of non-whitespace:

    import re
    
    s = """ 1, ' unchanged 1' " unchanged  2 "  2.009, -2e15 3"""
    
    # Match a quoted string first, otherwise any run of non-whitespace.
    r = re.compile(r"""['"].*?['"]|\S+""")
    print(", ".join([x.strip(",") for x in r.findall(s)]))
    
    1, ' unchanged 1', " unchanged  2 ", 2.009, -2e15, 3
    

    If you don't want any space after the comma:

    print(",".join([x.strip(",") for x in r.findall(s)]))
    1,' unchanged 1'," unchanged  2 ",2.009,-2e15,3
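
One caveat with the findall approach, as far as I can tell: when a quoted string is already followed by a comma, that lone comma becomes its own `\S+` match, gets stripped to an empty string, and the join then produces `, ,`. The pattern also lets a string opened with `'` be closed by `"`. A possible variation is sketched below. The names `TOKEN` and `insert_commas` and the `sep` parameter are my own for illustration and are not part of the answer above. It uses a backreference so the closing quote has to match the opening one, and it excludes commas from bare tokens so existing commas are simply skipped.

```python
import re

# Hypothetical names for illustration only.
# Alternative 1: a quoted string whose closing quote equals the opening one (\1).
# Alternative 2: a bare token, i.e. a run of characters that are neither
# whitespace nor commas, so commas already present never become tokens.
TOKEN = re.compile(r"""(['"]).*?\1|[^\s,]+""")


def insert_commas(s, sep=", "):
    # finditer is used instead of findall because the pattern contains a
    # capture group; m.group(0) is the whole matched token.
    return sep.join(m.group(0) for m in TOKEN.finditer(s))


if __name__ == "__main__":
    print(insert_commas(""" 1, ' unchanged 1' " unchanged  2 "  2.009, -2e15 3"""))
    print(insert_commas("""1, ' unchanged 1', " unchanged 2 " ,  2, 2"""))
    print(insert_commas("'a', 'b'", sep=","))
```

Passing `sep=","` gives the variant without a space after each comma. Quoted strings that contain commas are left untouched, since the first alternative consumes them whole.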
    