How do you implement a good profanity filter?

误落风尘 2020-11-22 04:27

Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs

21 Answers
  •  北海茫月
    2020-11-22 05:09

    Whilst I know that this question is fairly old, it's a commonly occurring one...

    There is both a reason and a distinct need for profanity filters (see the Wikipedia entry), but they often fall short of being 100% accurate for two very distinct reasons: context and accuracy.

    It depends (wholly) on what you're trying to achieve - at its most basic, you're probably trying to cover the "seven dirty words" and then some... Some businesses need to filter only the most basic profanity: basic swear words, URLs or even personal information and so on, but others need to prevent illicit account naming (Xbox Live is an example) or far more...

    User generated content doesn't just contain potential swear words, it can also contain offensive references to:

    • Sexual acts
    • Sexual orientation
    • Religion
    • Ethnicity
    • Etc...

    And potentially, in multiple languages. Shutterstock has developed basic dirty-words lists in 10 languages to date, but it's still basic and very much oriented towards their 'tagging' needs. There are a number of other lists available on the web.

    I agree with the accepted answer that this is not an exact science, and since language is continually evolving it's a moving target - but one where a 90% catch rate is better than 0%. It depends purely on your goals: what you're trying to achieve, the level of support you have, and how important it is to remove profanities of different types.

    In building a filter, you need to consider the following elements and how they relate to your project:

    • Words/phrases
    • Acronyms (FOAD/LMFAO etc)
    • False positives (words, places and names like 'mishit', 'scunthorpe' and 'titsworth')
    • URLs (porn sites are an obvious target)
    • Personal information (email, address, phone etc - if applicable)
    • Language choice (usually English by default)
    • Moderation (how, if at all, you can interact with user generated content and what you can do with it)
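
    The false-positive point above is the classic "Scunthorpe problem": naive substring matching flags innocent words that happen to contain a blocked term. A minimal Python sketch (the two-word block list is purely illustrative) shows how anchoring on word boundaries avoids the obvious cases:

    ```python
    import re

    # Hypothetical mini word list for illustration; real lists are far larger.
    BLOCKED = ["ass", "crap"]

    # Naive substring matching vs. word-boundary-anchored matching.
    naive = re.compile("|".join(map(re.escape, BLOCKED)), re.IGNORECASE)
    bounded = re.compile(
        r"\b(?:" + "|".join(map(re.escape, BLOCKED)) + r")\b",
        re.IGNORECASE,
    )

    print(bool(naive.search("classic passage")))    # True  - false positive
    print(bool(bounded.search("classic passage")))  # False - boundaries fix it
    print(bool(bounded.search("you ass")))          # True  - genuine hit
    ```

    Word boundaries only cover the easy cases; deliberately embedded or concatenated profanity still needs the heavier techniques described below.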

    You can easily build a profanity filter that captures 90%+ of profanities, but you'll never hit 100%. It's just not possible. The closer you want to get to 100%, the harder it becomes... Having built a complex profanity engine in the past that dealt with more than 500K real-time messages per day, I'd offer the following advice:

    A basic filter would involve:

    • Building a list of applicable profanities
    • Developing a method of dealing with derivations of profanities
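
    As a sketch of that basic approach (the word list here is a placeholder, and generating suffix and repeated-letter variants is just one simple way to cover derivations):

    ```python
    import re

    # Hypothetical base list; derivations (plurals, -ed/-ing forms,
    # stretched letters like "craaap") are generated from each base word
    # rather than enumerated by hand.
    BASE_WORDS = ["crap", "darn"]

    def derivation_pattern(word: str) -> str:
        # Allow each letter to repeat, then permit a few common suffixes.
        letters = "".join(ch + "+" for ch in re.escape(word))
        return letters + r"(?:s|ed|ing|er)?"

    FILTER = re.compile(
        r"\b(?:" + "|".join(derivation_pattern(w) for w in BASE_WORDS) + r")\b",
        re.IGNORECASE,
    )

    def censor(text: str) -> str:
        # Replace each match with asterisks of the same length.
        return FILTER.sub(lambda m: "*" * len(m.group()), text)

    print(censor("What a craaap day, darned weather"))
    ```

    Even this small step up from a flat word list catches stretched spellings and simple inflections that a literal match would miss.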

    A moderately complex filter would involve (in addition to a basic filter):

    • Using complex pattern matching to deal with extended derivations (using advanced regex)
    • Dealing with Leetspeak (l33t)
    • Dealing with false positives
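
    The leetspeak and false-positive steps can be sketched together: normalise common character substitutions before matching, and keep an explicit allow-list for known-safe words that would otherwise trip the filter. Both word lists below are stand-ins for illustration:

    ```python
    import re

    # Illustrative leet substitution table; real tables are much larger.
    LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                          "5": "s", "7": "t", "@": "a", "$": "s"})

    BLOCKED = {"crap"}
    # Known-safe words that would otherwise match (false positives).
    ALLOWED = {"scrap"}

    def normalize(token: str) -> str:
        # Undo common leet substitutions and lowercase.
        return token.translate(LEET).lower()

    def is_clean(text: str) -> bool:
        for token in re.findall(r"[\w@$]+", text):
            norm = normalize(token)
            if norm in ALLOWED:
                continue  # whitelisted false positive
            if any(bad in norm for bad in BLOCKED):
                return False
        return True

    print(is_clean("what cr4p"))    # False - leet form normalised and caught
    print(is_clean("scrap metal"))  # True  - whitelisted despite substring hit
    ```

    Normalising before matching keeps the block list small: you list "crap" once instead of every l33t spelling of it.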

    A complex filter would involve a number of the following (in addition to a moderate filter):

    • Whitelists and blacklists
    • Naive Bayesian inference filtering of phrases/terms
    • Soundex functions (where a word sounds like another)
    • Levenshtein distance
    • Stemming
    • Human moderators to help guide a filtering engine to learn by example or where matches aren't accurate enough without guidance (a self/continually-improving system)
    • Perhaps some form of AI engine
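
    Of those techniques, Levenshtein distance is the easiest to illustrate: score each token by its edit distance to the blocked words and flag anything within a small threshold. The word list and threshold below are illustrative assumptions, not a tuned configuration:

    ```python
    def levenshtein(a: str, b: str) -> int:
        # Classic dynamic-programming edit distance (insert/delete/substitute).
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,          # deletion
                               cur[j - 1] + 1,       # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]

    # Hypothetical block list; flag tokens within one edit of a blocked word.
    BLOCKED = ["crap"]

    def fuzzy_hit(token: str, max_dist: int = 1) -> bool:
        return any(levenshtein(token.lower(), bad) <= max_dist
                   for bad in BLOCKED)

    print(fuzzy_hit("crqp"))   # True  - one substitution away
    print(fuzzy_hit("crisp"))  # False - two edits, below the threshold
    ```

    Soundex works similarly but compares phonetic codes instead of edit counts; in practice you combine several of these signals, since each one alone either over- or under-matches.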
