Split Strings into words with multiple word boundary delimiters

既然无缘 2020-11-21 05:09

I think what I want to do is a fairly common task, but I've found no reference on the web. I have text with punctuation, and I want a list of the words.

"H

30 answers
  •  孤独总比滥情好
    2020-11-21 06:03

    Here is the answer with some explanation.

    st = "Hey, you - what are you doing here!?"
    
    # Replace every non-alphanumeric character with a space, then join.
    new_string = ''.join(x if x.isalnum() else ' ' for x in st)
    # output of new_string
    'Hey  you   what are you doing here  '
    
    # str.split() with no separator splits on runs of whitespace and drops empty strings
    new_list = new_string.split()
    
    # output of new_list
    ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
    
    # Join the words to get a single string without any non-alphanumeric characters
    ' '.join(new_list)
    # output
    'Hey you what are you doing here'
    

    Or, in one line:

    ''.join(x if x.isalnum() else ' ' for x in st).split()
    
    # output
    ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
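
The approach above can be wrapped in a small reusable function. The sketch below is only an illustration: the names `split_words` and `split_words_re` are hypothetical and not part of the original answer, and the regex variant is a swapped-in alternative that assumes "word character, excluding the underscore" is close enough to `str.isalnum()` for your text.

```python
import re


def split_words(text):
    """Replace every non-alphanumeric character with a space, then split on whitespace."""
    cleaned = ''.join(ch if ch.isalnum() else ' ' for ch in text)
    return cleaned.split()


def split_words_re(text):
    """Regex variant: collect runs of word characters, excluding the underscore."""
    return re.findall(r'[^\W_]+', text)


st = "Hey, you - what are you doing here!?"
print(split_words(st))
print(split_words_re(st))
```

Both functions treat the apostrophe as a delimiter, so a contraction such as `don't` comes back as two items, `don` and `t`. If that matters for your text, the set of characters to keep would need to be widened.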
    

    updated answer
