Python Regex replace

野性不改 2021-02-05 05:01

Hey, I'm trying to figure out a regular expression to do the following.

Here is my string:

Place,08/09/2010,"15,531","2,909",650

I

6 Answers
  • 2021-02-05 05:27

    If you need a regex solution, this should do:

    r"(\d+),(?=\d\d\d)"
    

    then replace with:

    "\1"
    

    It removes the thousands-separator commas from comma-grouped numbers anywhere in your string, thus turning this:

    Place,08/09/2010,"15,531","548,122,909",650

    into this:

    Place,08/09/2010,"15531","548122909",650

    There are holes to be found: the pattern ignores quoting, so any comma that sits between a digit and three more digits is removed, including one that separates two unquoted fields. That's why you should use a parser!
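
A minimal sketch of how that substitution might be applied with `re.sub`. The helper name `strip_thousands_commas` is made up for illustration. The second call shows one of the holes: two neighbouring unquoted fields are merged when the second one starts with three digits.

```python
import re

# Hypothetical helper: drop a comma that follows digits and precedes three digits.
THOUSANDS_COMMA = re.compile(r"(\d+),(?=\d\d\d)")

def strip_thousands_commas(line):
    return THOUSANDS_COMMA.sub(r"\1", line)

line = 'Place,08/09/2010,"15,531","548,122,909",650'
print(strip_thousands_commas(line))
# Place,08/09/2010,"15531","548122909",650

# One hole: the pattern does not look at quotes, so unquoted fields merge too.
print(strip_thousands_commas("a,12,345,b"))
# a,12345,b
```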

    Good luck!

  • 2021-02-05 05:31

    Another way of doing it using regex directly:

    >>> import re
    >>> data = "Place,08/09/2010,\"15,531\",\"2,909\",650"
    >>> res = re.findall(r"(\w+),(\d{2}/\d{2}/\d{4}),\"([\d,]+)\",\"([\d,]+)\",(\d+)", data)
    >>> res
    [('Place', '08/09/2010', '15,531', '2,909', '650')]
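
If the goal is the numbers themselves, the captured groups can be post-processed. This is only a sketch, assuming every line has exactly this five-field layout; the variable names are illustrative, and it uses `re.fullmatch` instead of `re.findall` so that a malformed line is rejected.

```python
import re

data = 'Place,08/09/2010,"15,531","2,909",650'
pattern = r'(\w+),(\d{2}/\d{2}/\d{4}),"([\d,]+)","([\d,]+)",(\d+)'

match = re.fullmatch(pattern, data)
if match is None:
    raise ValueError("line does not match the expected layout")

place, date, first, second, last = match.groups()
# Remove the thousands separators before converting to int.
numbers = [int(value.replace(",", "")) for value in (first, second, last)]
print(place, date, numbers)
# Place 08/09/2010 [15531, 2909, 650]
```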
    
  • 2021-02-05 05:38
    new_string = re.sub(r'"(\d+),(\d+)"', r'\1.\2', original_string)
    

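    This replaces the comma inside each quoted number with a period and drops the surrounding quotes, so you can then just use the string's split method. It only matches quoted numbers that contain a single comma.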
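
A short sketch of that substitution followed by the split, using the string from the question. The last call illustrates a limitation: a quoted value with two commas does not match the pattern and is left as it was.

```python
import re

original_string = 'Place,08/09/2010,"15,531","2,909",650'
new_string = re.sub(r'"(\d+),(\d+)"', r'\1.\2', original_string)
print(new_string)
# Place,08/09/2010,15.531,2.909,650

fields = new_string.split(',')
print(fields)
# ['Place', '08/09/2010', '15.531', '2.909', '650']

# A value with two commas is not matched, so it comes back unchanged.
untouched = re.sub(r'"(\d+),(\d+)"', r'\1.\2', '"548,122,909"')
print(untouched)
# "548,122,909"
```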

  • 2021-02-05 05:41

    You could parse a string of that format using pyparsing:

    import pyparsing as pp
    import datetime as dt
    
    st='Place,08/09/2010,"15,531","2,909",650'
    
    def line_grammar():
        integer=pp.Word(pp.nums).setParseAction(lambda s,l,t: [int(t[0])])
        sep=pp.Suppress('/')
        date=(integer+sep+integer+sep+integer).setParseAction(
                  lambda s,l,t: dt.date(t[2],t[1],t[0]))
        comma=pp.Suppress(',')
        quoted=pp.Regex(r'("|\').*?\1').setParseAction(
                  lambda s,l,t: [int(e) for e in t[0].strip('\'"').split(',')])
        line=pp.Word(pp.alphas)+comma+date+comma+quoted+comma+quoted+comma+integer
        return line
    
    line=line_grammar()
    print(line.parseString(st))
    # ['Place', datetime.date(2010, 9, 8), 15, 531, 2, 909, 650]
    

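    The advantage is that you parse, convert, and validate in a few lines. The numeric fields come back as ints and the date as a datetime.date. Note that each quoted field is split at its commas, so "15,531" yields the two ints 15 and 531 rather than 15531.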

  • 2021-02-05 05:46
    >>> from io import StringIO  # on Python 2: from StringIO import StringIO
    >>> import csv
    >>> r = csv.reader(StringIO('Place,08/09/2010,"15,531","2,909",650'))
    >>> next(r)
    ['Place', '08/09/2010', '15,531', '2,909', '650']
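
Once `csv` has split the fields correctly, stripping the thousands separators needs no regex at all. A minimal Python 3 sketch, assuming every column from the third onward holds an integer; the variable names are illustrative.

```python
import csv
from io import StringIO

reader = csv.reader(StringIO('Place,08/09/2010,"15,531","2,909",650'))
row = next(reader)

place, date = row[0], row[1]
# Assumes columns 2 onward are integers with optional thousands separators.
numbers = [int(field.replace(",", "")) for field in row[2:]]
print(place, date, numbers)
# Place 08/09/2010 [15531, 2909, 650]
```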
    
  • 2021-02-05 05:52
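    # Naive split: the quoted numbers are cut into pieces at their inner commas.
    a = 'Place,08/09/2010,"15,531","2,909",650'.split(',')
    result = []
    i = 0
    while i < len(a):
        if a[i].count('"') != 1:
            # Unquoted field, or a quoted field that arrived in one piece.
            result.append(a[i])
        else:
            # Opening quote without its closing quote: glue the following
            # pieces back on until the closing quote turns up.
            string = a[i]
            i += 1
            while True:
                string += "," + a[i]
                if '"' in a[i]:
                    break
                i += 1
            result.append(string)
        i += 1
    print(result)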
    

    Result:
    ['Place', '08/09/2010', '"15,531"', '"2,909"', '650']
    I'm not a big fan of regular expressions unless you absolutely need them.
