How do you implement a good profanity filter?

Backend · open · 21 answers · 2330 views
误落风尘 2020-11-22 04:27

Many of us need to deal with user input, search queries, and situations where the input text can potentially contain profanity or undesirable language. Oftentimes this needs…

21 answers
  • 2020-11-22 05:05

    I'm a little late to the party, but I have a solution that might work for some who read this. It's in JavaScript instead of PHP, but there's a valid reason for that.

    Full disclosure, I wrote this plugin...

    Anyways.

    The approach I've gone with is to allow a user to "Opt-In" to their profanity filtering. Basically profanity will be allowed by default, but if my users don't want to read it, they don't have to. This also helps with the "l33t sp3@k" issue.

    The concept is a simple jQuery plugin that gets injected by the server if the client's account has profanity filtering enabled. From there, it's just a couple of simple lines that blot out the swears.

    Here's the demo page
    https://chaseflorell.github.io/jQuery.ProfanityFilter/demo/

    <div id="foo">
        ass will fail but password will not
    </div>
    
    <script>
        // code:
        $('#foo').profanityFilter({
            customSwears: ['ass']
        });
    </script>
    

    result

    *** will fail but password will not

  • 2020-11-22 05:06

    Don't.

    Because:

    • Clbuttic (naive substring replacement that mangles innocent words like "classic")
    • Profanity is not OMG EVIL
    • Profanity cannot be effectively defined
    • Most people quite probably don't appreciate being "protected" from profanity

    Edit: While I agree with the commenter who said "censorship is wrong", that is not the nature of this answer.

  • 2020-11-22 05:08

    Also late to the game, but I was doing some research and stumbled across this. As others have mentioned, it's nearly impossible to get right if fully automated, but if your design/requirements can accommodate human review in some cases (though not all the time), you may consider ML. https://docs.microsoft.com/en-us/azure/cognitive-services/content-moderator/text-moderation-api#profanity is my current choice, for several reasons:

    • Supports many localizations
    • They keep the database updated, so I don't have to keep up with the latest slang or languages (a maintenance issue)
    • When there is a high probability (i.e. 90% or more) you can just deny it programmatically
    • You can observe which category caused a flag that may or may not be profanity, and have somebody review it to teach the system whether it is or isn't profane

    In my case, the need was/is for a public-facing commercial service (OK, video games) where other users will see the username, and the design requires every username to pass through a profanity filter before it is accepted. The sad part is that the classic "clbuttic" issue will most likely occur, since usernames are usually a single word (up to N characters), sometimes multiple words concatenated... Again, Microsoft's cognitive service will not flag "Assist" with Text.HasProfanity=true, but it may report a high probability in one of the categories.

    As the OP asks, what about "a$$"? Here's the result when I passed it through the filter: as you can see, it determined the text is not profane, but with a high probability that it is, so it flags the entry as recommended for review (human interaction).

    When the probability is high, I can either return "I'm sorry, that name is already taken" (even if it isn't), which reads less offensively to anti-censorship users, if we don't want to integrate human review; or return "Your username has been sent to the live operations department; you may wait for it to be reviewed and approved, or choose another username." Or whatever...
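    The routing described above can be sketched roughly as follows. This is a hypothetical illustration: `routeUsername` and the thresholds are my own invention, with `score` standing in for the probability returned by a moderation service; none of this is the service's actual API.

    ```javascript
    // Hypothetical sketch: route a moderation probability score to one of
    // three outcomes. Thresholds (0.9, 0.5) are illustrative, not official.
    function routeUsername(score) {
      if (score >= 0.9) {
        // High confidence it's profane: reject with a neutral-sounding message.
        return "I'm sorry, that name is already taken.";
      }
      if (score >= 0.5) {
        // Uncertain: queue the name for human review.
        return "Your username has been sent for review; please wait or choose another.";
      }
      return "OK"; // Low probability: accept the name.
    }
    ```

    The neutral rejection message is a deliberate design choice: it avoids telling the user their name was flagged at all.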

    By the way, the cost of this service is quite low for my purpose (how often do usernames get changed?), but for the OP the design may demand more intensive queries, and paying/subscribing for ML services may not be ideal, or human review/interaction may not be possible. It all depends on the design... but if the design does fit the bill, perhaps this can be the OP's solution.

    If interested, I can list the cons in the comment in the future.

  • 2020-11-22 05:09

    Whilst I know that this question is fairly old, it's a commonly occurring one...

    There is both a reason and a distinct need for profanity filters (see the Wikipedia entry here), but they often fall short of 100% accuracy for two distinct reasons: context and accuracy.

    It depends (wholly) on what you're trying to achieve. At its most basic, you're probably trying to cover the "seven dirty words" and then some... Some businesses need to filter only the most basic profanity: basic swear words, URLs or even personal information and so on, but others need to prevent illicit account naming (Xbox Live is an example) or far more...

    User generated content doesn't just contain potential swear words, it can also contain offensive references to:

    • Sexual acts
    • Sexual orientation
    • Religion
    • Ethnicity
    • Etc...

    And potentially, in multiple languages. Shutterstock has developed basic dirty-words lists in 10 languages to date, but it's still basic and very much oriented towards their 'tagging' needs. There are a number of other lists available on the web.

    I agree with the accepted answer that this is not an exact science, and since language is continually evolving it's a moving target, but one where a 90% catch rate is better than 0%. It depends purely on your goals: what you're trying to achieve, the level of support you have and how important it is to remove profanities of different types.

    In building a filter, you need to consider the following elements and how they relate to your project:

    • Words/phrases
    • Acronyms (FOAD/LMFAO etc)
    • False positives (words, places and names like 'mishit', 'scunthorpe' and 'titsworth')
    • URLs (porn sites are an obvious target)
    • Personal information (email, address, phone etc - if applicable)
    • Language choice (usually English by default)
    • Moderation (how, if at all, you can interact with user generated content and what you can do with it)

    You can easily build a profanity filter that captures 90%+ of profanities, but you'll never hit 100%. It's just not possible. The closer you want to get to 100%, the harder it becomes... Having built a complex profanity engine in the past that dealt with more than 500K realtime messages per day, I'd offer the following advice:

    A basic filter would involve:

    • Building a list of applicable profanities
    • Developing a method of dealing with derivations of profanities

    A moderately complex filter would involve (in addition to a basic filter):

    • Using complex pattern matching to deal with extended derivations (using advanced regex)
    • Dealing with Leetspeak (l33t)
    • Dealing with false positives

    A complex filter would involve a number of the following (in addition to a moderate filter):

    • Whitelists and blacklists
    • Naive bayesian inference filtering of phrases/terms
    • Soundex functions (where a word sounds like another)
    • Levenshtein distance
    • Stemming
    • Human moderators to help guide a filtering engine to learn by example or where matches aren't accurate enough without guidance (a self/continually-improving system)
    • Perhaps some form of AI engine
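    The "basic filter" steps above (a word list plus handling of derivations) can be sketched as below. This is a minimal illustration, not production code: the word lists, the leet-character map and the function names are all my own, and the matching is deliberately whole-word to sidestep the false-positive problem discussed elsewhere in this thread.

    ```javascript
    // Minimal sketch of a basic filter: normalize common leetspeak
    // substitutions, then compare whole words against a profanity list,
    // with a whitelist for known false positives. Illustrative lists only.
    const SWEARS = ['ass', 'fuck'];
    const WHITELIST = ['password', 'assist'];
    const LEET = { '@': 'a', '4': 'a', '3': 'e', '1': 'i', '0': 'o', '$': 's', '5': 's', '7': 't' };

    function normalize(word) {
      // Lowercase and map leet characters back to letters, e.g. "a$$" -> "ass".
      return word.toLowerCase().replace(/[@4310$57]/g, ch => LEET[ch]);
    }

    function isProfane(text) {
      return text.split(/\s+/).some(raw => {
        // Strip punctuation but keep leet symbols, then normalize.
        const word = normalize(raw.replace(/[^\w@$]/g, ''));
        if (WHITELIST.includes(word)) return false; // known-safe word
        return SWEARS.includes(word);               // whole-word match only
      });
    }
    ```

    Whole-word matching avoids flagging "password" for containing "ass", at the cost of missing profanity embedded inside longer strings; that trade-off is exactly where the "moderately complex" tier (advanced pattern matching, false-positive handling) takes over.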
  • 2020-11-22 05:10

    If you can do something like Digg/Stack Overflow, where users can downvote/flag obscene content... do so.

    Then all you need to do is review the "naughty" users, and block them if they break the rules.

  • 2020-11-22 05:10

    Profanity filters are a bad idea. The reason is that you can't catch every swear word; if you try, you get false positives.

    Catching Words

    Let's just say you want to catch the F-Word. Easy, right? Well let's see.

    You can loop through a string to find "fuck". Unfortunately, people trick filters nowadays: a naive filter won't pick up "fuk".

    One can try to check for multiple spellings and variants of the word, but that will hurt performance. To catch the F-word, you need to look for "fuc", "Fuc", "fuk", "Fuk", "F***", etc., and the list goes on and on.
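    To make the point concrete, here is a toy pattern (my own, purely illustrative) that tries to cover several variants of one word at once. Even with character classes and optional whitespace it still misses simple misspellings, which is the answer's argument in miniature:

    ```javascript
    // A "clever" pattern covering several variants of one word.
    // It catches the spaced-out form, yet still misses the plain
    // misspelling "fuk" because nothing in [c(] matches its absence.
    const fWord = /f+\s*[u*.]+\s*[c(]+\s*k+/i;

    fWord.test('fuck');     // true
    fWord.test('F u C k');  // true
    fWord.test('fuk');      // false -- variant still slips through
    ```

    Each new variant forces the pattern to grow, and each growth step risks new false positives.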

    Avoiding Innocence

    Okay, so how about making it case-insensitive and ignoring spaces, so it catches "F u C k"? That might sound like a good idea, but someone can just bypass the filter with "F.U.C.K."

    So you ignore punctuation too.

    Now that is a real problem, since a sentence like "Hello, there!" will trigger on "hell", and "Whassup?" triggers on "ass".

    And there are a bunch of words that you have to exclude from the filter, such as "Constitution", because it contains "tit".
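    The false positives described above are easy to reproduce. The helper below is my own sketch of the naive approach being criticized: strip case and punctuation, then do a substring match — exactly the strategy that flags innocent text:

    ```javascript
    // The naive strategy under criticism: lowercase, strip everything
    // but letters, then substring-match. Every example below is an
    // innocent phrase that it wrongly flags.
    function naiveMatch(text, badWord) {
      return text.toLowerCase().replace(/[^a-z]/g, '').includes(badWord);
    }

    naiveMatch('Hello, there!', 'hell');  // true -- false positive
    naiveMatch('Whassup?', 'ass');        // true -- false positive
    naiveMatch('Constitution', 'tit');    // true -- false positive
    ```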

    People can also use substitute words, such as "Frack". Do you block that too? What about "pen is" for "penis"? Your program doesn't have the artificial intelligence to know whether a string is good or bad.

    Don't use profanity filters. They're hard to develop, and they're as slow as a crawl.
