Match digits in a string with certain conditions in Python

情深已故 2021-01-20 18:56

I have a sequence of strings in the form:

s1 = "Schblaum 12324 tunguska 24 234n"
s2 = "jacarta 331 matchika 22 234k"
s3 = "3239 thingolee 80394 234k"


        
2 Answers
  • 2021-01-20 19:18

    Use re.findall

    >>> import re
    >>> s1 = "Schblaum 12324 tunguska 24 234n"
    >>> re.findall(r'^\S+\D*\d+|\S.*', s1)
    ['Schblaum 12324', 'tunguska 24 234n']
    >>> s2 = "jacarta 331 matchika 22 234k"
    >>> s3 = "3239 thingolee 80394 234k"
    >>> re.findall(r'^\S+\D*\d+|\S.*', s2)
    ['jacarta 331', 'matchika 22 234k']
    >>> re.findall(r'^\S+\D*\d+|\S.*', s3)
    ['3239 thingolee 80394', '234k']
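
The pattern is easier to follow when written out piece by piece. As a rough sketch, here is the same regex in `re.VERBOSE` form, wrapped in a helper called `split_after_number` (a hypothetical name, not something from the answer). It assumes the intent is to cut after the first number that follows the first token.

```python
import re

# Same pattern as in the answer, spelled out with re.VERBOSE.
PATTERN = re.compile(r"""
    ^\S+    # the first token, whatever it is (it may itself be a number)
    \D*     # any non-digit characters up to the next number
    \d+     # that number; the first piece ends here
    |       # or, when the scan resumes after the first piece:
    \S.*    # from the next non-space character to the end of the string
""", re.VERBOSE)


def split_after_number(s):
    """Return the pieces found by the pattern (hypothetical helper)."""
    return PATTERN.findall(s)


for s in ("Schblaum 12324 tunguska 24 234n",
          "jacarta 331 matchika 22 234k",
          "3239 thingolee 80394 234k"):
    print(split_after_number(s))
```

If no number follows the first token, the first alternative never matches and the whole string comes back as a single piece.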
    
  • 2021-01-20 19:25

    Even without regex, all you're doing is looking for the first number after the first word and splitting after it. Try:

    s = "Schblaum 12324 tunguska 24 234n"
    words = s.split()
    for idx, word in enumerate(words[1:], start=1):  # skip the first element
        if word.isdigit():
            break  # idx now points at the first all-digit word
    # everything up to and including that word, then the rest
    before = ' '.join(words[:idx + 1])
    after = ' '.join(words[idx + 1:])
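
Since the question mentions a sequence of strings, the loop may be more convenient as a function. The sketch below uses a hypothetical helper name, `split_after_first_number`, and adds one assumption that is not in the answer: when no all-digit word follows the first word, the whole string goes into `before` and `after` is empty.

```python
def split_after_first_number(s):
    """Split s after the first all-digit word that is not the first word.

    Hypothetical helper; returns a (before, after) tuple.
    """
    words = s.split()
    idx = len(words) - 1  # assumed fallback: no number found, keep everything
    for i, word in enumerate(words[1:], start=1):  # skip the first word
        if word.isdigit():
            idx = i
            break
    return ' '.join(words[:idx + 1]), ' '.join(words[idx + 1:])


strings = [
    "Schblaum 12324 tunguska 24 234n",
    "jacarta 331 matchika 22 234k",
    "3239 thingolee 80394 234k",
]
for s in strings:
    print(split_after_first_number(s))
```

Note that `split()` followed by `' '.join(...)` collapses runs of whitespace into single spaces, which may or may not matter for the real data.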
    

    You could also use re.split with a lookbehind to split on whitespace that follows a digit, but the result needs post-processing: when the string starts with a number, it splits after that first number as well.

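    import re

    s3 = "3239 thingolee 80394 234k"
    result = re.split(r"(?<=\d)\s", s3, maxsplit=2)  # split at most twice
    if result[0].isdigit():
        # the string starts with a number, so the first split is spurious
        before = ' '.join(result[:2])
        after = ' '.join(result[2:])
    else:
        before = result[0]
        after = ' '.join(result[1:])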
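
To see why the post-processing is needed, it may help to look at the raw `re.split` output for all three sample strings. The sketch below prints it next to the result of a hypothetical helper, `split_with_lookbehind`, which assumes the extra split only occurs when the first piece is a bare number.

```python
import re


def split_with_lookbehind(s):
    """Hypothetical helper: returns a (before, after) tuple."""
    parts = re.split(r"(?<=\d)\s", s, maxsplit=2)
    # A leading number causes one extra split, so glue that piece back on.
    keep = 2 if parts[0].isdigit() else 1
    return ' '.join(parts[:keep]), ' '.join(parts[keep:])


samples = [
    "Schblaum 12324 tunguska 24 234n",
    "jacarta 331 matchika 22 234k",
    "3239 thingolee 80394 234k",
]
for s in samples:
    raw = re.split(r"(?<=\d)\s", s, maxsplit=2)
    print(raw, '->', split_with_lookbehind(s))
```

For the first two strings the raw split yields three pieces even though the string does not start with a number, so the length of the result alone does not tell the two cases apart.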
    