How to split a string into a list?

执念已碎 2020-11-21 04:32

I want my Python function to split a sentence (input) and store each word in a list. My current code splits the sentence, but does not store the words as a list. How do I do

9 answers
  • 2020-11-21 05:05

    If you want every character of a word or sentence in a list, pass the string to list():

    print(list("word"))
    #  ['w', 'o', 'r', 'd']
    
    
    print(list("some sentence"))
    #  ['s', 'o', 'm', 'e', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e']
    
  • 2020-11-21 05:07

    I think you are confused because of a typo.

    Replace print(words) with print(word) inside your loop so that each word is printed on its own line.
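The asker's code is not shown, so the following is only a guess at its shape. The function name `print_words` is hypothetical. It splits the sentence, prints the loop variable `word` (one word per line), and returns the list so the caller can keep it.

```python
def print_words(sentence):
    # Hypothetical reconstruction; the asker's actual code is not shown.
    words = sentence.split()   # str.split() already returns a list
    for word in words:
        print(word)            # the loop variable, not the whole list
    return words

result = print_words("the quick brown fox")
```

Printing `words` inside the loop instead would print the full list once per iteration, which can look as if nothing was split.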

  • 2020-11-21 05:14

    The shlex module has a split() function. Unlike str.split(), it strips the quote characters and treats a quoted phrase as a single word:

    >>> import shlex
    >>> shlex.split("sudo echo 'foo && bar'")
    ['sudo', 'echo', 'foo && bar']
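For comparison, here is a small sketch that runs both splitters on the same string. The variable names are only illustrative.

```python
import shlex

command = "sudo echo 'foo && bar'"

plain = command.split()         # splits on whitespace, keeps the quote characters
quoted = shlex.split(command)   # shell-style parsing, quoted phrase stays together

print(plain)   # ['sudo', 'echo', "'foo", '&&', "bar'"]
print(quoted)  # ['sudo', 'echo', 'foo && bar']
```

Because shlex.split() follows shell quoting rules, it raises ValueError on unbalanced quotes, so an ordinary sentence containing an apostrophe (for example "can't") will fail. It fits command-like input better than prose.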
    
  • 2020-11-21 05:15

    How about this approach? Split the text on whitespace, then strip punctuation from each word. This removes punctuation only from the edges of words, leaving apostrophes inside words such as we're intact.

    >>> text
    "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"
    
    >>> text.split()
    ["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]
    
    >>> import string
    >>> [word.strip(string.punctuation) for word in text.split()]
    ['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
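The same idea can be wrapped in a function. This is a minimal sketch, and the name `split_words` is an assumption, not part of the answer. One detail it adds: a token made up entirely of punctuation (such as a lone "-") strips down to an empty string, so the sketch drops those.

```python
import string

def split_words(text):
    # Hypothetical helper name, chosen for illustration.
    stripped = (word.strip(string.punctuation) for word in text.split())
    # Tokens that were pure punctuation become "", so filter them out.
    return [word for word in stripped if word]

print(split_words("Well - that's odd, isn't it?"))
# ['Well', "that's", 'odd', "isn't", 'it']
```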
    
  • 2020-11-21 05:20

    Depending on what you plan to do with your sentence-as-a-list, you may want to look at the Natural Language Toolkit (NLTK). It deals heavily with text processing and evaluation. You can also use it to solve your problem:

    import nltk
    words = nltk.word_tokenize(raw_sentence)
    

    This has the added benefit of splitting out punctuation.

    Example:

    >>> import nltk
    >>> s = "The fox's foot grazed the sleeping dog, waking it."
    >>> words = nltk.word_tokenize(s)
    >>> words
    ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',', 
    'waking', 'it', '.']
    

    This allows you to filter out any punctuation you don't want and use only words.
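One possible way to do that filtering is sketched below. To keep it runnable without NLTK installed, the token list is copied from the example output above rather than produced by nltk.word_tokenize(). The rule used here (drop tokens made up only of punctuation) is an assumption about what "punctuation you don't want" means.

```python
import string

# Copied from the example output above, so NLTK is not needed to run this.
tokens = ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping',
          'dog', ',', 'waking', 'it', '.']

# Keep a token if at least one of its characters is not punctuation.
words_only = [t for t in tokens
              if any(ch not in string.punctuation for ch in t)]

print(words_only)
# ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', 'waking', 'it']
```

The possessive "'s" survives because it contains a letter. Whether to keep it depends on what you do with the words afterwards.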

    Please note that the other solutions using str.split() are better if you don't plan on doing any complex manipulation of the sentence.


  • 2020-11-21 05:23

    Split the string in text on any run of consecutive whitespace:

    words = text.split()      
    

    Split the string in text on the delimiter ",":

    words = text.split(",")   
    

    The words variable will be a list and contain the words from text split on the delimiter.
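The two forms behave differently around repeated separators. A short sketch with made-up sample strings:

```python
spaced = "  red   green\tblue\n"
by_whitespace = spaced.split()      # no argument: runs of whitespace collapse
print(by_whitespace)                # ['red', 'green', 'blue']

csv_line = "red,green,,blue"
by_comma = csv_line.split(",")      # explicit delimiter: empty fields are kept
print(by_comma)                     # ['red', 'green', '', 'blue']

by_space = "a  b".split(" ")        # a single space as delimiter also keeps empties
print(by_space)                     # ['a', '', 'b']
```

So for splitting a sentence into words, calling split() with no argument is usually what you want. An explicit delimiter suits structured data where empty fields matter.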
