Natural Language Processing: Find obscenities in English?

自闭症患者 2021-02-09 21:15

Given a set of words tagged for part of speech, I want to find those that are obscenities in mainstream English. How might I do this? Should I just make a huge list, and check f

11 answers
  •  生来不讨喜
    2021-02-09 22:02

    Note that any NLP logic like this is vulnerable to "character replacement" attacks.

    For example, I can write "hello" as "he11o", replacing each "l" with the digit "1". The same trick works for obscenities. So while there is no perfect answer, a blacklist of bad words might work. Watch out for false positives: I'd run the blacklist against a large book to see what comes up.
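A minimal sketch of the blacklist idea with a character-replacement guard might look like the following. Everything named here is an assumption made for illustration: the `LEET_MAP` substitution table, the placeholder `BLACKLIST` entries, and the helper names `normalize` and `find_flagged` do not come from the answer. A real list would be far larger, and mapping "1" to "l" is only one possible reading (it could equally stand for "i").

```python
# Hypothetical substitution table: look-alike characters mapped back to letters.
# This mapping is an assumption; "1" could also stand for "i".
LEET_MAP = str.maketrans({
    "1": "l", "0": "o", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

# Placeholder blacklist with mild stand-in words; substitute a real list.
BLACKLIST = {"darn", "heck"}

# Punctuation stripped from token edges. "@" and "$" are deliberately kept,
# because they may be disguised letters.
EDGE_PUNCTUATION = ".,;:!?\"'()[]"


def normalize(token):
    """Lowercase a token and undo common character replacements."""
    return token.lower().translate(LEET_MAP)


def find_flagged(text, blacklist=BLACKLIST):
    """Return the original tokens whose normalized form is blacklisted."""
    flagged = []
    for raw in text.split():
        token = raw.strip(EDGE_PUNCTUATION)
        if token and normalize(token) in blacklist:
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    print(normalize("he11o"))
    print(find_flagged("Well, d4rn it. H3ck!"))
```

Matching whole tokens rather than substrings avoids flagging innocent words that merely contain a blacklisted string. To check for false positives as suggested above, `find_flagged` could be run over the full text of a large book and the hits reviewed by hand.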
