How do you implement a good profanity filter?

误落风尘 2020-11-22 04:27

Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs

21 answers

别那么骄傲 (OP)

2020-11-22 05:08
Also late to the game, but I stumbled across this while doing some research. As others have mentioned, a fully automated filter is close to impossible. But if your design can involve human review in some cases (not all the time) to decide whether something is profane, you may consider ML. https://docs.microsoft.com/en-us/azure/cognitive-services/content-moderator/text-moderation-api#profanity is my current choice, for several reasons:
- It supports many locales
- They keep updating the database, so I don't have to keep up with the latest slang or languages (a maintenance issue)
- When the probability is high (e.g. 90% or more), you can just reject the input programmatically
- You can see which category caused a flag that may or may not be profanity, and have somebody review it to teach the service that the text is or isn't profane.
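The last two points amount to a three-way triage on the scores the service returns. Here is a minimal sketch of that idea. `ModerationResult` is a hypothetical, simplified stand-in for a moderation response (not the real API schema), and the 0.90 cutoff is just the "90% or more" figure from the list above:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Verdict(Enum):
    ACCEPT = "accept"
    REVIEW = "review"   # hand over to a human
    REJECT = "reject"   # deny automatically


@dataclass
class ModerationResult:
    """Hypothetical, simplified moderation response (not the real API schema)."""
    category_scores: Dict[str, float]   # category name -> probability in [0, 1]
    review_recommended: bool = False    # assumed flag meaning "a human should look"


def triage(result: ModerationResult, reject_at: float = 0.90) -> Verdict:
    """Reject on a very high score, queue for review when flagged, else accept."""
    top_score = max(result.category_scores.values(), default=0.0)
    if top_score >= reject_at:
        return Verdict.REJECT
    if result.review_recommended:
        return Verdict.REVIEW
    return Verdict.ACCEPT
```

The cutoff is a design choice: lowering it rejects more borderline names automatically, raising it sends more of them to human review.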
My use case is a public-facing commercial service (OK, video games) where other users may see the username, so the design requires every username to go through a profanity filter that rejects offensive ones. The sad part is that the classic "clbuttic" issue will most likely occur, since usernames are usually a single word (up to N characters) or sometimes multiple words concatenated... Again, Microsoft's cognitive service will not flag "Assist" as Text.HasProfanity=true, but it may report a high probability for one of the categories.
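To make the "clbuttic" issue concrete, here is a small sketch of a naive word-list filter (this is not how the Microsoft service works; the one-word blocklist and the function names are made up for illustration). Substring matching mangles innocent words, while whole-word matching misses concatenated usernames:

```python
import re

BLOCKLIST = ["ass"]  # illustrative one-word list, not a real blocklist


def censor_substring(text: str, replacement: str = "butt") -> str:
    """Naive filter: replaces the blocked word wherever it appears, even inside other words."""
    for word in BLOCKLIST:
        text = re.sub(re.escape(word), replacement, text, flags=re.IGNORECASE)
    return text


def contains_whole_word(text: str) -> bool:
    """Stricter filter: matches the blocked word only between word boundaries."""
    return any(
        re.search(rf"\b{re.escape(word)}\b", text, flags=re.IGNORECASE)
        for word in BLOCKLIST
    )


print(censor_substring("classic"))          # clbuttic
print(censor_substring("Assist"))           # buttist
print(contains_whole_word("Assist"))        # False: innocent word passes
print(contains_whole_word("bigassdragon"))  # False: concatenated username slips through
```

Neither variant is satisfactory for single-token usernames, which is why a probability plus human review is attractive here.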

As the OP asks, what about "a$$"? When I passed it through the filter, it determined that the text is not profane but that there is a high probability it is, so it flagged the text as recommended for (human) review.

When the probability is high, I can do one of two things. If we don't want to integrate human review, I can return "I'm sorry, that name is already taken" (even if it isn't), which is less offensive to anti-censorship people. Otherwise I can return "Your username has been sent to the live operations department. You may wait for it to be reviewed and approved, or choose another username". Or whatever...
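As a rough sketch, the two responses could be wired up like this. The function name, its parameters, and the "accepted" message are assumptions for illustration; only the two rejection messages come from the paragraph above:

```python
def username_response(high_probability: bool, human_review_enabled: bool) -> str:
    """Pick the message shown to the user after the filter has scored a username."""
    if not high_probability:
        return "Username accepted."  # hypothetical success message
    if human_review_enabled:
        # Be upfront and queue the name for the live operations team.
        return (
            "Your username has been sent to the live operations department. "
            "You may wait for it to be reviewed and approved, or choose another username."
        )
    # No human review available: decline without mentioning the filter.
    return "I'm sorry, that name is already taken."
```

The "already taken" branch trades honesty for fewer arguments with users, so whether it is acceptable depends on the product.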

By the way, the cost of this service is quite low for my purpose (how often does a username get changed?). But again, the OP's design may demand more intensive querying, in which case paying for or subscribing to ML services may not be ideal, or it may not allow for human review. It all depends on the design... But if the design does fit the bill, perhaps this can be the OP's solution.

If anyone is interested, I can list the cons in a comment later.