Tokenize words in a list of sentences Python


广开言路 2021-02-04 06:41

I currently have a file that contains a list that looks like this:

example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Ji


      
      
        
7 answers

抹茶落季 (original poster) 2021-02-04 07:24

This can also be done with PyTorch's torchtext:

from torchtext.data import get_tokenizer

# 'basic_english' lowercases the text and splits punctuation into separate tokens
tokenizer = get_tokenizer('basic_english')
example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']
tokens = []
for s in example:
    tokens += tokenizer(s)  # extend the flat list with this sentence's tokens
# tokens is now:
# ['mary', 'had', 'a', 'little', 'lamb', 'jack', 'went', 'up', 'the', 'hill', 'jill', 'followed', 'suit', 'i', 'woke', 'up', 'suddenly', 'it', 'was', 'a', 'really', 'bad', 'dream', '.', '.', '.']
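
If installing torchtext just for this is too heavy, a rough stand-in using only the standard library might look like the sketch below. It is not the basic_english tokenizer, only an approximation that lowercases the text and splits each punctuation mark into its own token. The names simple_tokenize, flat_tokens and per_sentence are made up for illustration.

```python
import re
from itertools import chain

# Hypothetical helper, not part of torchtext: match either a run of word
# characters or a single non-space, non-word character (punctuation).
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def simple_tokenize(sentence):
    """Lowercase the sentence and return its tokens as a list of strings."""
    return _TOKEN_RE.findall(sentence.lower())


example = ['Mary had a little lamb',
           'Jack went up the hill',
           'Jill followed suit',
           'i woke up suddenly',
           'it was a really bad dream...']

# One flat list of tokens across all sentences.
flat_tokens = list(chain.from_iterable(simple_tokenize(s) for s in example))

# One token list per sentence, for when sentence boundaries matter.
per_sentence = [simple_tokenize(s) for s in example]
```

For this particular input the flat list matches the torchtext output shown above, but the two will probably diverge on other text (contractions such as "don't", for example), so treat it as a starting point rather than a drop-in replacement.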

    
             
                                                        
            
            
              
                