Tokenizing words into a new column in a pandas dataframe

轻奢々 2021-01-28 00:45

I am trying to go through a list of comments collected in a pandas dataframe, tokenize the words, and put them in a new column in the dataframe, but I am having an e

2 answers
  • 2021-01-28 01:27

    Don't you just want to do this:

    df['words'] = df['complaint'].apply(apwords)
    

    You don't need to define the function addwords at all. If you do keep it, it should be defined as:

    addwords = lambda x: apwords(x)
    
  • 2021-01-28 01:37

    The way you apply the lambda function is correct; it is the way you define addwords that doesn't work.

    When you define apwords, you define a function, not an attribute (method) of the string. Therefore, when you want to apply it, use:

    addwords = lambda x: apwords(x)
    

    And not:

    addwords = lambda x: x.apwords()
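
To make the difference concrete, here is a minimal sketch without pandas. The whitespace split is only a stand-in for the real tokenizer, and the names call_as_function, call_as_method and sample are made up for this illustration.

```python
def apwords(text):
    # Stand-in tokenizer (assumption): plain whitespace split,
    # used here instead of the tokenizer from the question.
    return text.split()

# Works: apwords is a module-level function that receives the string.
call_as_function = lambda x: apwords(x)

# Fails on a plain str: Python looks up 'apwords' on the str object itself.
call_as_method = lambda x: x.apwords()

sample = "the service was slow"
tokens = call_as_function(sample)

try:
    call_as_method(sample)
    method_error = None
except AttributeError as exc:
    # str has no attribute named 'apwords'
    method_error = exc
```

The second lambda fails as soon as it is called on a string, which is why using it with apply breaks on the first row.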
    

    If you want to call apwords as an attribute, you would need to define a class that inherits from str and define apwords as a method of that class.
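
For illustration only, the subclass route could look roughly like the sketch below. TokenizableStr is a hypothetical name, and the whitespace split again stands in for the real tokenizer.

```python
class TokenizableStr(str):
    """Hypothetical str subclass that carries its own apwords method."""

    def apwords(self):
        # Stand-in tokenizer (assumption): whitespace split
        return self.split()

comment = TokenizableStr("billing error on my account")
tokens = comment.apwords()

# With this class, the method-style lambda works,
# but only for values that were wrapped in TokenizableStr first.
addwords = lambda x: x.apwords()
```

Every value in the column would have to be converted to this class before the method call works, since ordinary strings still lack the method.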

    It is far easier to stay with the function:

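    # word_tokenize is assumed to be NLTK's tokenizer
    from nltk.tokenize import word_tokenize

    def apwords(words):
        # Tokenize one comment and return its tokens as a list
        filtered_sentence = []
        words = word_tokenize(words)
        for w in words:
            filtered_sentence.append(w)
        return filtered_sentence

    # apwords is a plain function, so call it on x rather than as x.apwords()
    addwords = lambda x: apwords(x)
    df['words'] = df['complaint'].apply(addwords)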
    