Python - regex - Splitting string before word

悲&欢浪女 2021-01-22 22:21

I am trying to split a string in Python before a specific word. For example, I would like to split the following string before "path:".

4 Answers
  • 2021-01-22 23:01
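    in_str = "path:bte00250 Alanine, aspartate and glutamate metabolism path:bte00330 Arginine and proline metabolism"
    # Splitting on 'path:' drops the delimiter and leaves an empty first element.
    in_list = in_str.split('path:')
    # Re-join with ',path:' and slice off the leading comma.
    print(",path:".join(in_list)[1:])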
    
  • 2021-01-22 23:10

    You could do ["path:"+s for s in line.split("path:")[1:]] instead of using a regex. (Note that we skip the first element of the split: it is the text before the first "path:", so it has no "path:" prefix.)
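A minimal runnable sketch of this list-comprehension approach, using the sample string from the other answers as `line`. The `.strip()` call is an addition that is not in the answer above; it removes the trailing space left on every piece except the last.

```python
line = ("path:bte00250 Alanine, aspartate and glutamate metabolism "
        "path:bte00330 Arginine and proline metabolism")

# str.split drops the delimiter, so the first element is the empty string
# that precedes the first "path:"; [1:] skips it.
pieces = ["path:" + s.strip() for s in line.split("path:")[1:]]

for piece in pieces:
    print(piece)
```

Without `.strip()`, the first piece would end in a space, because the space before the second "path:" belongs to the preceding chunk.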

  • 2021-01-22 23:11

    Using a regular expression to split your string seems like overkill: the string split() method may be all you need.

    That said, if you really do need a regular expression, use re.split(), which splits a string at every match of a pattern.

    Also, use a regular expression that matches only the separator, not the text you want to keep:

    >>> import re
    >>> line = 'path:bte00250 Alanine, aspartate and glutamate metabolism path:bte00330 Arginine and proline metabolism'
    >>> re.split(' (?=path:)', line)
    ['path:bte00250 Alanine, aspartate and glutamate metabolism', 'path:bte00330 Arginine and proline metabolism']
    

    The (?=...) group is a lookahead assertion: the pattern matches a space (note the space at the start of the pattern) only when that space is followed by the string 'path:'. The lookahead does not consume 'path:', so it stays at the start of the next piece.
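As a sketch of how the lookahead behaves, the pattern can be generalized slightly. The helper name `split_before` and the use of `\s+` (instead of a single literal space) are assumptions added for illustration; `re.escape` ensures the word is matched literally.

```python
import re

def split_before(text, word):
    """Split text at whitespace that is immediately followed by word.

    Hypothetical helper: the lookahead (?=...) checks that word follows
    without consuming it, so word stays at the start of each piece.
    """
    pattern = r"\s+(?=" + re.escape(word) + ")"
    return re.split(pattern, text)

line = ("path:bte00250 Alanine, aspartate and glutamate metabolism "
        "path:bte00330 Arginine and proline metabolism")
print(split_before(line, "path:"))
```

Because only the whitespace is consumed, none of the resulting pieces carries a trailing space, and nothing has to be re-attached afterwards.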

  • 2021-01-22 23:16

    This can be done without regular expressions. Given a string:

    s = "path:bte00250 Alanine, aspartate ... path:bte00330 Arginine and ..."
    

    We can temporarily replace the desired word with a placeholder. The placeholder is a single character that must not occur anywhere else in the string; we then split on it:

    word, placeholder = "path:", "|"
    s = s.replace(word, placeholder).split(placeholder)
    s
    # ['', 'bte00250 Alanine, aspartate ... ', 'bte00330 Arginine and ...']
    

    Now that the string is split, we can rejoin the original word to each sub-string using a list comprehension:

    ["".join([word, i]) for i in s if i]
    # ['path:bte00250 Alanine, aspartate ... ', 'path:bte00330 Arginine and ...']
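The steps above can be gathered into one function. This is a sketch rather than the answerer's code: the name `split_keep_word` is hypothetical, and it splits on `word` directly (`str.split` accepts a multi-character separator), which avoids having to pick a placeholder that never occurs in the text.

```python
def split_keep_word(text, word):
    """Split text before each occurrence of word, keeping word in the pieces.

    Hypothetical helper for illustration only.
    """
    head, *rest = text.split(word)
    # Text before the first occurrence never had the word in front of it,
    # so it is kept unchanged (and skipped when empty).
    pieces = [head] if head else []
    # Every later chunk followed an occurrence of word; put the word back.
    pieces.extend(word + chunk for chunk in rest)
    return pieces

s = "path:bte00250 Alanine, aspartate ... path:bte00330 Arginine and ..."
print(split_keep_word(s, "path:"))
```

Handling the leading chunk separately means text that precedes the first "path:" is returned as its own piece instead of being given a prefix it never had.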
    