Fast punctuation removal with pandas

予麋鹿 2020-11-22 06:21

This is a self-answered post. Below I outline a common problem in the NLP domain and propose a few performant methods to solve it.

Oftentimes the need arises to remo

3 Answers
  •  终归单人心
    2020-11-22 06:57

    Interestingly, the vectorized Series.str.translate method is still slightly slower than vanilla Python's str.translate():

    def pd_translate(df):
        return df.assign(text=df['text'].str.translate(transtab))
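
For anyone who wants to try the comparison, here is a minimal self-contained sketch. The definition of transtab is not shown in this answer, so the one below (delete every character in string.punctuation) is an assumption, and vanilla_translate and the sample frame are hypothetical names added for illustration.

```python
import string

import pandas as pd

# Assumption: transtab maps each ASCII punctuation character to None,
# so translate() deletes it. The answer above does not show its definition.
transtab = str.maketrans('', '', string.punctuation)


def pd_translate(df):
    # Vectorized pandas accessor, as in the answer above.
    return df.assign(text=df['text'].str.translate(transtab))


def vanilla_translate(df):
    # Hypothetical counterpart: plain str.translate in a list comprehension.
    return df.assign(text=[s.translate(transtab) for s in df['text'].tolist()])


# Small illustrative frame; real benchmarks would use far more rows.
df = pd.DataFrame({'text': ['a.b,c', 'hello!', "it's fine?"]})
out_pd = pd_translate(df)
out_py = vanilla_translate(df)
```

Both functions should return the same cleaned column, so any difference is purely in speed. Timing them with timeit on a larger frame is one way to check the claim on your own data; note that the list-comprehension version assumes the column has no missing values.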
    
