How to test if a string contains gibberish in PHP?

萌比男神i 2021-01-02 10:38

I am making a registration form for a website and because I'm sure everyone is going to enter some gibberish in the Secret Answer's input (I do that myself), I would like t

2 Answers
  •  别那么骄傲
    2021-01-02 11:28

    Though the following technique would not be very accurate, in my opinion, you could gather a set of English rules and compare the input against them.

    The only way I see around this (and it is by no means a perfect solution) is a system that raises certain flags when something is suspicious.

    English (like every language) has certain particularities, and if you see that they are not met, this could be an indication of gibberish.

    I "would" make a system that adds points each time a criterion is met, and once the total passes a certain number of points, the user would get a warning.

    Some examples:

    Consecutive consonants:

    • 3 consonants in a row => 5 points (many exceptions like "chrome")
    • 4 consonants in a row => 15 points
    • 5 consonants in a row => 30 points
    • 6 consonants in a row => 60 points (can't think of many words)

    (The points system is of course an example!)

    The same would apply with consecutive vowels.
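    A minimal sketch of the run-based scoring described above. The function name runScore, the two constants, and the choice to treat "y" as a consonant are assumptions made for illustration; the point values are copied from the example table and are not tuned.

```php
<?php
// Character classes for ASCII English letters (hypothetical helpers).
const CONSONANTS = 'b-df-hj-np-tv-z'; // "y" is treated as a consonant here
const VOWELS     = 'aeiou';

/**
 * Add points for every run of 3 or more consecutive characters
 * from the given class (consonants or vowels).
 */
function runScore(string $text, string $charClass): int
{
    // Points by run length; runs of 6 or more all get the top value.
    $points = [3 => 5, 4 => 15, 5 => 30, 6 => 60];

    // Find every maximal run of 3+ characters, case-insensitively.
    preg_match_all('/[' . $charClass . ']{3,}/i', $text, $matches);

    $score = 0;
    foreach ($matches[0] as $run) {
        $score += $points[min(strlen($run), 6)];
    }
    return $score;
}
```

    With these values, runScore('chrome', CONSONANTS) gives 5 for the "chr" run, which shows why a low score alone should not trigger anything.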

    Special characters:

    This should add some points if they appear and are not native to the site's language.

    (|#¢∞¬÷“≠) etc

    As they could be typos because of gibberish.
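    One possible way to count such characters, assuming an English-language site where ASCII letters, digits, whitespace and a few common punctuation marks are considered "native". The function name specialCharScore, the allowed set and the 10 points per character are all assumptions.

```php
<?php
/**
 * Add points for every character outside the allowed set.
 */
function specialCharScore(string $text, int $pointsPerChar = 10): int
{
    // Anything that is not an ASCII letter, digit, whitespace or common
    // punctuation counts as special. The /u flag makes a multibyte
    // character such as "¢" count once instead of once per byte.
    $count = preg_match_all('/[^A-Za-z0-9\s.,\'!?-]/u', $text);

    // preg_match_all() returns false on error, e.g. invalid UTF-8.
    // Treating that as a single suspicious hit is an arbitrary choice.
    if ($count === false) {
        return $pointsPerChar;
    }
    return $count * $pointsPerChar;
}
```

    A site in another language would need a different allowed set, for example one based on Unicode letter properties.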

    Length:

    After a certain length, every letter should start adding points. A 28 character word is not as probable as a 7 letter one.
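    The length rule could look something like this. The threshold of 12 letters, the 2 points per extra letter and the name lengthScore are assumptions; strlen() counts bytes, so this sketch is only accurate for ASCII input.

```php
<?php
/**
 * Add points for every letter beyond a maximum word length.
 */
function lengthScore(
    string $text,
    int $maxWordLength = 12,
    int $pointsPerExtraLetter = 2
): int {
    $score = 0;

    // Split on whitespace; PREG_SPLIT_NO_EMPTY drops empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($words as $word) {
        $extra = strlen($word) - $maxWordLength;
        if ($extra > 0) {
            $score += $extra * $pointsPerExtraLetter;
        }
    }
    return $score;
}
```

    Under these assumed values a 7-letter word scores nothing, while a 28-character word scores 32.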

    These are the ones off the top of my head. Of course, this is not an exact (or even good) science.

    Also you could try some "Common gibberish":

    You could search for combinations like:

    qwerty asdfg zxcvb uiop or whatever.

    Of course this last one is completely random and will probably cover very few cases, but you could do as much as you want.
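    A rough sketch of the keyboard-pattern check, using only the four example sequences mentioned above. The name keyboardPatternScore and the 20 points per hit are assumptions, and the list would need extending for other keyboard layouts.

```php
<?php
/**
 * Add points for every occurrence of a known keyboard sequence.
 */
function keyboardPatternScore(string $text, int $pointsPerHit = 20): int
{
    // Sequences taken from the examples in the answer.
    $patterns = ['qwerty', 'asdfg', 'zxcvb', 'uiop'];

    // Compare in lower case so "QWERTY" is caught as well.
    $haystack = strtolower($text);

    $score = 0;
    foreach ($patterns as $pattern) {
        // substr_count() counts non-overlapping occurrences.
        $score += substr_count($haystack, $pattern) * $pointsPerHit;
    }
    return $score;
}
```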

    So this is the best I could come up with for now. I'm sure there are a lot more English grammar rules and particularities that you could use to your advantage, but this will only give you a certain probability, so I would by no means make your rules compulsory. At most, show something like a warning: "If you put a completely random answer you might not remember it later".

    This is a very complex subject but a very interesting question IMO. My answer of course only covers a very small part of it, enough to catch some answers like "ksfjdngssjk", although it would be very easy to set up a system like this with PHP and try it out.

    Good luck!!
