How do you know what encoding the user is inputting into the browser?

南旧 2021-01-26 11:52

I read Joel's article about character sets, so I'm taking his advice to use UTF-8 on my web page and in my database. What I can't understand is what to do with user input.

3 answers
  •  南笙 (OP)
    2021-01-26 12:15

    Don't try to detect the encoding; convert all user-entered text to UTF-8 in your application. You can do everything possible on your side: configure your web server to send UTF-8 pages and UTF-8 headers, configure your application to handle all text as UTF-8, tweak your filesystem (if necessary) to handle text files as UTF-8, and configure your database. But you simply have no real control over the user's end. You can suggest the proper character encoding in your HTML forms, for example with the accept-charset attribute on the form element, but it's not really enforceable on the user end.
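As a rough sketch of the server-side part, here is how a page could declare UTF-8 in both the HTTP header and the form itself. The names FORM_PAGE and build_response, the /submit action, and the comment field are all made up for illustration; only the Content-Type charset parameter and the accept-charset attribute are standard.

```python
# Hypothetical page: the form asks the browser to submit its data as UTF-8.
FORM_PAGE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Comment</title></head>
<body>
  <form method="post" action="/submit" accept-charset="UTF-8">
    <input type="text" name="comment">
    <input type="submit" value="Send">
  </form>
</body>
</html>
"""


def build_response():
    """Return (headers, body) for the form page, both declaring UTF-8."""
    body = FORM_PAGE.encode("utf-8")
    headers = [
        # The charset parameter tells the browser how to decode the page.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    return headers, body
```

This is only a suggestion to the browser. A client can still send bytes in any encoding, which is why the submitted data has to be checked on arrival.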

    Unless detecting the encoding of the user input is the whole purpose of your application, it's a fool's errand to try. Assume the encoding is wrong and convert it to UTF-8 in your app, just as you should assume your user input is malicious and clean it up before you attempt to insert it into your database.
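One way to read "assume it is wrong and convert" is sketched below: accept the bytes if they are already valid UTF-8, otherwise decode them with a fallback encoding and re-encode. The function name to_utf8 and the cp1252 fallback are assumptions made for this example, not something the answer prescribes; pick whatever fallback matches your audience.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw re-encoded as valid UTF-8 bytes.

    Assumption: input that is not valid UTF-8 is treated as `fallback`.
    """
    try:
        # Strict decoding raises if the bytes are not well-formed UTF-8.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback cannot map become U+FFFD instead of raising.
        text = raw.decode(fallback, errors="replace")
    return text.encode("utf-8")
```

Whatever comes out is guaranteed to be well-formed UTF-8, so the rest of the application and the database only ever see one encoding. Text that was really in some third encoding will still come out as the wrong characters, but it will no longer break anything downstream.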

    ASCII characters are encoded identically in UTF-8, so in most languages that have UTF-8 properly implemented they will survive conversion unchanged; don't worry about that either.
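A quick way to convince yourself of the ASCII point, as a small sketch (the names ALL_ASCII and survives_utf8 are made up for the example):

```python
# Every ASCII code point, 0 through 127, as raw bytes.
ALL_ASCII = bytes(range(128))


def survives_utf8(text: str) -> bool:
    """True if text encodes to the same bytes in ASCII and in UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a non-ASCII character.
        return False


# A round trip through UTF-8 leaves pure ASCII bytes untouched.
assert ALL_ASCII.decode("utf-8").encode("utf-8") == ALL_ASCII
```

Only characters outside the ASCII range take a different byte form in UTF-8, so those are the only ones a conversion can change.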
