How to use autocorrect in Pandas column of sentences

Submitted by 冷暖自知 on 2020-05-09 06:01:05

Question


I have a column of sentences, which I am splitting like so:

df['ColTest'] = df['ColTest'].str.lower().str.split()

What I am trying to do is loop through each word in each sentence and apply autocorrect.spell() to it:

for i in df['ColTest']:
    for j in i:
        df['ColTest'][i][j].replace(at.spell(j))

This throws an error:

AttributeError: 'float' object has no attribute 'replace'


The DataFrame looks like this:

ColTest
This is some test string
that might contain a finger
but this string might contain a toe
and this hass a spel error

There are no numbers in my column. Any ideas, please?


Answer 1:


Using the autocorrect library, you need to iterate through the rows of the DataFrame, then iterate through the words within each row and apply the spell method to every word. Here's a working example:

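# Note: `spell` is the function exposed by older autocorrect releases;
# newer releases provide a `Speller` class instead.
from autocorrect import spell
import pandas as pd

df = pd.DataFrame(["and this hass a spel error"], columns=["colTest"])
# Split each sentence into words, correct each word, and join them back together.
df.colTest.apply(lambda x: " ".join([spell(i) for i in x.split()]))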

Also, as suggested by @jpp in the comments, we can avoid using lambda as follows:

df["colTest"] = [' '.join([spell(i) for i in x.split()]) for x in df['colTest']]

Here's what the input looks like:

                      colTest
0  and this hass a spel error

Output:

0    and this has a spell error
Name: colTest, dtype: object
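
The example above assumes every cell is a string. A 'float' AttributeError on a text column is often a sign of missing values, since pandas stores NaN as a float; whether that is the cause in the question is only an assumption. Here is a minimal sketch that passes missing values through untouched and writes the corrected sentences back to the column. To keep it runnable without extra dependencies, it uses `toy_spell`, a hypothetical stand-in that only knows the two typos from the sample data; `correct_sentence` and `KNOWN_FIXES` are likewise illustrative names, not part of autocorrect.

```python
import pandas as pd

# Hypothetical stand-in for a real spell checker (not part of autocorrect).
# It only knows the two typos that appear in the sample data.
KNOWN_FIXES = {"hass": "has", "spel": "spell"}


def toy_spell(word):
    """Return the known fix for a word, or the word unchanged."""
    return KNOWN_FIXES.get(word, word)


def correct_sentence(sentence, spell=toy_spell):
    """Lower-case a sentence, correct each word, and re-join with spaces."""
    if not isinstance(sentence, str):
        # Missing values (None / NaN) are not strings: return them as they are.
        return sentence
    return " ".join(spell(word) for word in sentence.lower().split())


df = pd.DataFrame({"ColTest": [
    "This is some test string",
    "and this hass a spel error",
    None,  # a missing value
]})

# Assign the result back; apply() returns a new Series and leaves df unchanged.
df["ColTest"] = df["ColTest"].apply(correct_sentence)
print(df["ColTest"].tolist())
```

To use the real library, pass its word-level function as the `spell` argument, for example `df["ColTest"].apply(correct_sentence, spell=spell)` with the `spell` imported in the answer above.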


Source: https://stackoverflow.com/questions/49364664/how-to-use-autocorrect-in-pandas-column-of-sentences
