How to “Validate” Human Names in CakePHP?

淺唱寂寞╮ submitted on 2019-11-26 23:07:19

I agree with the other comments that validating a name is probably a bad idea.

For virtually everything you can think of to validate, there will be someone with a name that breaks your rule. If you're happy with the idea that you're going to be blocking real people from entering their names, then you can validate it as much as you like. But the more validation rules you put in, the more likely you are to find a real person who can't sign in.

Here's a link to a page which describes some of the obvious (and not so obvious) things which people try to validate, which can trip them up:

http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

If you want to allow anybody onto your site, then the best you can really hope for is to force a maximum field length to fit the space you've allocated in your database. Even then you're going to annoy someone.
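As a minimal sketch of that "length only" check: the helper name nameFitsColumn and the 255-character limit are assumptions for illustration (use whatever your column actually holds), and mb_strlen() needs PHP's mbstring extension.

```php
<?php
// Hypothetical helper: the only check applied to the name is a length limit.
// $maxLength is an assumption; set it to the size of your database column.
function nameFitsColumn($name, $maxLength = 255)
{
    // Count characters rather than bytes, so multi-byte names are not penalised.
    return mb_strlen($name, 'UTF-8') <= $maxLength;
}
```

In a CakePHP model, the built-in maxLength rule, e.g. 'rule' => array('maxLength', 255), should give a similar result without a custom function.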

There is no way to "validate" a name. How would you turn away someone who really is called:

Robert'); DROP TABLE Students; --

http://xkcd.com/327/


EDIT: What I really mean is that people in some countries write their names in other languages (say Japanese, Chinese, or Korean), and a name may even contain symbols. How would you feel if a site told you your name was "INVALID" when you had entered your real name?

Don't make any assumptions about how a name may be spelled. Accept any input (yes, any), and do proper escaping when displaying it, so you don't get XSS vulnerabilities.

I'd suggest you do this escaping in the model on afterFind(), so you don't forget it somewhere. Keep the original data in a separate field of the model, like ['unescaped_name'], if you need to access the plain data.
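A rough sketch of that idea, written as a plain function so it runs outside CakePHP. The function name escapeNames, the 'User' alias and the assumption that results arrive as a list of rows keyed by model alias are all illustrative; in a real model, the loop would sit inside afterFind().

```php
<?php
// Hypothetical helper mirroring what a model's afterFind() callback could do.
// Assumes $results is a list of rows keyed by model alias, e.g. $row['User']['name'].
function escapeNames(array $results, $alias = 'User')
{
    foreach ($results as $i => $row) {
        if (!isset($row[$alias]['name'])) {
            continue; // nothing to escape in this row
        }
        $raw = $row[$alias]['name'];
        // Keep the plain value available for code that needs the original data.
        $results[$i][$alias]['unescaped_name'] = $raw;
        // Escape for HTML output; ENT_QUOTES also converts single and double quotes.
        $results[$i][$alias]['name'] = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
    }
    return $results;
}
```

With this in place, views can print ['name'] directly, and only code that deliberately asks for ['unescaped_name'] sees the raw value.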

deceze

Custom Regular Expression Validation

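var $validate = array(
    'name' => array(
        // Blacklist: the name is rejected if it contains any character listed in the class.
        // The "..." part is a placeholder for the rest of your own blacklist.
        'rule' => '/^[^%#\/*@!...other characters you don\'t want...]+$/',
        'message' => 'The name contains characters that are not allowed'
    )
);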

This is too naïve an approach though, as you would have to blacklist almost the entire range of Unicode characters. You can pretty much only do whitelisting of basic Latin characters plus common quirks like spaces and apostrophes. Any more than that and you'll fight an uphill battle you can't win. You may be able to create a reasonably good algorithm over time, but it will never be 100% foolproof. So either restrict your users to basic Latin names (and hope not to alienate your audience) or skip the validation entirely*.

* Or invest a few years into developing an algorithm covering <100% of human names, working 99.9% of the time.
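For comparison, a whitelist of basic Latin characters could look roughly like the sketch below. The function name isBasicLatinName and the exact set of allowed characters (letters, spaces, apostrophes, hyphens) are assumptions, and the pattern deliberately rejects many real names, which is the trade-off described above.

```php
<?php
// Hypothetical whitelist: basic Latin letters plus spaces, apostrophes and hyphens.
// The name must start with a letter. Real names such as "José" fail this check.
function isBasicLatinName($name)
{
    // The D modifier makes $ match only at the very end of the string.
    return preg_match('/^[A-Za-z][A-Za-z \'-]*$/D', $name) === 1;
}
```

The same pattern could be used as the 'rule' value in $validate in place of the blacklist regex.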
