Natural Language Processing: Find obscenities in English?

后端 未结 11 1267
自闭症患者
自闭症患者 2021-02-09 21:15

Given a set of words tagged for part of speech, I want to find those that are obscenities in mainstream English. How might I do this? Should I just make a huge list, and check f

11条回答
  •  执念已碎
    2021-02-09 21:56

    At Melissa Data, when my manager , the director of Massachusetts Research and Development and I refactored a Data Profiler targeted at Relational Databases , we counted profanities by the number of Levinshtein Distance matches where the number of insertions, deletions and substitutions is tunable by the user so as to allow for spelling mistakes, Germanic equivalents of English language, plurals, as well as whitespace and non-whitespace punctuation. We speeded up the running time of the Levinshtein Distance calculation by looking only in the diagonal bands of the n by n matrix.

提交回复
热议问题