Tokenizing words into a new column in a pandas dataframe

前端未结


I am trying to go through a list of comments collected in a pandas dataframe, tokenize the words, and put them in a new column of the dataframe, but I am having an e

2 Answers

春和景丽

2021-01-28 01:27
Don't you just want to do this?
```
df['words'] = df['complaint'].apply(apwords)
```
You don't need to define the function addwords at all. If you do keep it, it should be defined as:
```
addwords = lambda x: apwords(x)
```
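As a quick check of this point, the sketch below applies the function both directly and through a lambda wrapper and compares the results. The sample rows and the whitespace-splitting `apwords` are assumptions made for illustration; only the column names `complaint` and `words` come from the thread.

```python
import pandas as pd

# Stand-in for the asker's function (assumption): split on whitespace
# instead of calling a real tokenizer.
def apwords(text):
    return text.split()

# Hypothetical sample data; only the column name comes from the thread.
df = pd.DataFrame({"complaint": ["late delivery again", "wrong item sent"]})

# Pass the function directly to apply...
df["words"] = df["complaint"].apply(apwords)

# ...or wrap it in a lambda; both calls produce the same lists.
addwords = lambda x: apwords(x)
df["words_via_lambda"] = df["complaint"].apply(addwords)

print(df["words"].tolist())
```

Since the wrapper only forwards its argument, passing `apwords` itself is the shorter option.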
半阙折子戏

2021-01-28 01:37
The way you apply the lambda function is correct; it is the way you define addwords that doesn't work.

When you define apwords, you define a function, not a method of the string, so calling x.apwords() fails because str has no such attribute. To apply it, use:
```
addwords = lambda x: apwords(x)
```
And not:
```
addwords = lambda x: x.apwords()
```
If you want to call apwords as a method, you would need to define a class that inherits from str and define apwords as a method of that class.
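For comparison, here is a minimal sketch of what that subclass route might look like. The class name `Complaint` is hypothetical, and a whitespace split stands in for a real tokenizer; neither comes from the original question.

```python
# Hypothetical str subclass, shown only to illustrate the method-based route.
class Complaint(str):
    def apwords(self):
        # Whitespace split stands in for a real tokenizer (assumption).
        return self.split()

text = Complaint("wrong item sent")
tokens = text.apwords()  # works: the subclass defines apwords

# A plain str has no such method, which is why x.apwords() fails.
try:
    "wrong item sent".apwords()
except AttributeError as exc:
    error = str(exc)

print(tokens)
print(error)
```

Every value in the column would have to be converted to this class before the method call works, which is the extra effort the plain function avoids.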

It is far easier to stay with the function:
```
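from nltk.tokenize import word_tokenize  # requires NLTK's tokenizer data to be downloaded

def apwords(words):
    # Tokenize one comment and return its tokens as a list.
    filtered_sentence = []
    words = word_tokenize(words)
    for w in words:
        filtered_sentence.append(w)
    return filtered_sentence

# The lambda calls apwords as a plain function, not as a string method.
addwords = lambda x: apwords(x)
df['words'] = df['complaint'].apply(addwords)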
```