Text language detection in Python

一整个雨季 2021-01-07 00:32

I am trying to detect the language of text that may contain an unknown number of languages. The following code gives me different languages as the answer. NOTE: I

1 answer
  • 2021-01-07 01:01

    In your loop you're overwriting the entire column by doing this:

    df['Languagereveiw'] = lang
    

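    If you want to do this in a for loop, use items (called iteritems in older pandas; it was removed in pandas 2.0):

    from langdetect import detect  # assumption: detect is langdetect's detect

    for index, text in df['Review'].items():
        lang = detect(text)  # detect the language of this row's text
        df.loc[index, 'Languagereveiw'] = lang  # write only this row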
    

    However, you can drop the loop entirely and just do:

    df['Languagereveiw'] = df['Review'].apply(detect)
    

    apply calls your function once for each value in the column and returns the results as a new column, so every row gets its own language code.
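To see the difference between the two assignments without installing a detector, here is a minimal sketch. It assumes a made-up three-row DataFrame, and toy_detect is a hypothetical keyword-based stand-in for the real detect function, not a real language detector.

```python
import pandas as pd


def toy_detect(text: str) -> str:
    """Hypothetical stand-in for a real detector; keyword lookup only."""
    lowered = text.lower()
    if "bonjour" in lowered:
        return "fr"
    if "hola" in lowered:
        return "es"
    return "en"


# Made-up sample data for illustration.
df = pd.DataFrame(
    {"Review": ["Hello there", "Bonjour tout le monde", "Hola amigos"]}
)

# Buggy pattern: assigning a scalar to a column broadcasts it to every row,
# so each pass of the loop overwrites the whole column.
for text in df["Review"]:
    df["Languagereveiw"] = toy_detect(text)
overwritten = df["Languagereveiw"].tolist()  # every row holds the last result

# Per-row pattern: apply calls the function once per value.
df["Languagereveiw"] = df["Review"].apply(toy_detect)
per_row = df["Languagereveiw"].tolist()
```

After the loop, every row carries the result for the last review; after apply, each row carries its own result.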

    Regarding your second question, about converting a language code to its full name ('en' to 'english'): look at polyglot. It can detect the language and give you both the language code and the full name.
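If you only need the code-to-name step and would rather not add another dependency, a plain dictionary lookup is enough. This is a minimal sketch, not polyglot's API: LANGUAGE_NAMES and language_name are hypothetical names, and the table is a hand-written subset of ISO 639-1 codes that you would extend yourself.

```python
# Hand-written subset of ISO 639-1 codes (assumption: extend as needed).
LANGUAGE_NAMES = {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "zh": "Chinese",
}


def language_name(code: str) -> str:
    """Return the English name for a language code, or the code itself if unknown."""
    # Some detectors return region-suffixed codes such as 'zh-cn';
    # only the part before the hyphen is looked up.
    base = code.lower().split("-")[0]
    return LANGUAGE_NAMES.get(base, code)
```

You could then map a column of codes with df['Languagereveiw'].apply(language_name). Unknown codes are returned unchanged, so gaps in the table are easy to spot.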
