Why does this regex result in four items?

粉色の甜心 2021-01-24 03:44

I want to split a string on , ->, =>, or any of those surrounded by several spaces, so that I get two items, she and he.

3 Answers
  •  故里飘歌
    2021-01-24 04:30

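    Each pair of parentheses "()" in your pattern creates a capturing group. A capturing group records one part of a match, while the match itself always refers to the text matched by the complete regex. When you split, the captured text is returned alongside the pieces between the delimiters, which is why you are getting 4 results.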

    Documentation says: "If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list."
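
    To see that behaviour in isolation, here is a minimal sketch. The pattern is an assumption made up for illustration, not the one from the question: it nests two capturing groups, so each delimiter contributes two extra entries to the result.

```python
import re

# Hypothetical pattern for illustration only (not the asker's pattern):
# the outer group captures the arrow plus surrounding spaces,
# the inner group captures just the arrow.
capturing = re.compile(r"(\s*([-=]>)\s*)")

captured_parts = capturing.split("she -> he")
print(captured_parts)  # ['she', ' -> ', '->', 'he']
```

    The two wanted pieces are still there, but the text of both groups sits between them, giving four items.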

    You could try making the groups "non-capturing" as Rawing suggested. Do this by simply prepending "?:" inside the parentheses you do not want to be captured.
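
    As a rough sketch of that suggestion, assuming a simplified pattern that only handles the two arrows (the pattern and variable names are illustrative, not taken from the question):

```python
import re

# Hypothetical pattern for illustration only: (?:...) groups the two
# alternatives without capturing them, so split() returns only the
# text between delimiters.
non_capturing = re.compile(r"\s*(?:->|=>)\s*")

plain_parts = non_capturing.split("she => he")
print(plain_parts)  # ['she', 'he']
```

    The grouping still limits the alternation to the arrow itself, but nothing from the delimiter ends up in the list.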

    I would just leave out the parentheses altogether:

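    import re
    # \s+ rather than \s*: since Python 3.7, split() also splits on empty matches.
    res = re.compile(r"\s*[-=]>\s*|\s+").split(' she  -> he \n')
    res = filter(None, res)  # drop the empty strings left at both ends
    res = list(res)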
    

    Output:

    ['she', 'he']
    
