Tokenize words in a list of sentences Python

广开言路 2021-02-04 06:41

I currently have a file that contains a list that looks like this:

example = ['Mary had a little lamb' , 
           'Jack went up the hill' , 
           'Ji


        
7 answers
  •  北荒 (original poster)
    2021-02-04 07:06

    In spaCy, it is as simple as:

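    import spacy
    
    example = ['Mary had a little lamb',
               'Jack went up the hill',
               'Jill followed suit',
               'i woke up suddenly',
               'it was a really bad dream...']
    
    # Load the small English pipeline (the en_core_web_sm model must be installed)
    nlp = spacy.load("en_core_web_sm")
    
    result = []
    
    for line in example:
        sent = nlp(line)  # run the pipeline; iterating the returned Doc yields tokens
        token_result = []
        for token in sent:
            token_result.append(token)  # Token objects; use token.text for plain strings
        result.append(token_result)
    
    print(result)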
    

    The output will be:

    [[Mary, had, a, little, lamb], [Jack, went, up, the, hill], [Jill, followed, suit], [i, woke, up, suddenly], [it, was, a, really, bad, dream, ...]]
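
If spaCy or its model is not installed, a rough approximation is possible with only the standard library. The sketch below is not what spaCy does internally. It is a simple regex split, and the names `TOKEN_PATTERN` and `tokenize_sentences` are made up for illustration. It assumes that a "token" is either a run of word characters or a run of punctuation, so it will split contractions such as "it's" differently from spaCy.

```python
import re

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

# Hypothetical pattern: a run of word characters, OR a run of
# characters that are neither word characters nor whitespace (punctuation).
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]+")


def tokenize_sentences(sentences):
    """Return one list of string tokens per input sentence."""
    return [TOKEN_PATTERN.findall(sentence) for sentence in sentences]


result = tokenize_sentences(example)
print(result)
```

Unlike the spaCy version, each inner list here holds plain Python strings rather than Token objects, so there are no attributes such as part-of-speech tags to work with afterwards.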
    
