URL Structure: Lower case VS Upper case

瘦欲@ 提交于 2019-11-28 23:16:20

The domain part is not case sensitive. GoOgLe.CoM works. You can add uppercase as you like, but normally there's not a reason to do so and, as stated in the comments below, may hurt your SEO ranking.

The path part is or is not case sensitive, depending on the server environment and server. Typically Windows machines are case insensitive, while Linux machines are case sensitive. This means that you should stick to lowercase or you risk introducing a bug that's really hard to hunt down (mismatched case that doesn't matter on the dev server).

The query string part is available to the server as it is. You can readily use mixed-case as you like, or discard the case (toLowerCase(...)). This also means that using a base64-encoded keys will work. You can't expect the users to type that correctly, though.

The hash part (called "fragment identifier") is only available to the client code, not to the server. Javascript may distinguish between the cases as it likes, and so does the browser. url#a will scroll to the element with the ID a, but url#A won't.

I'm going to have to disagree with all established wisdom on this, so I'll probably get downvoted, but:

If you redirect all mixed case urls to your properly cased url, it solves all the problems mentioned. Therefore it seems this argument is coming from tradition and preference. The point of a URL is to have a user-friendly representation of a page, and if your url is friendlier with upper case, why not use it? Compare:

moviesforyoutowatch.com/batman-vii-the-dark-knight-whatevers MoviesForYouToWatch.com/Batman-VII-The-Dark-Knight-Whatevers

I find the mixed case version superior for the purpose. If there's a technical reason that can't be solved with a lower-case compare and redirect, please share it.

I know you asked for technical reasons but it's also worth considering this from a UX perspective.

Say you have a URL with upper case characters and, for arguments sake, this has been distributed on printed media. When a user comes to enter that URL into their browser they may well be compelled to match that case (or be forced to match the specified case if your web server is case sensitive) ultimately you are giving them more work to do as they have to consider case as well. After all, they don't know if your server is case sensitive or not and they may have experienced 404s from case sensitive web servers in the past.

If your server is case sensitive and you are using mixed case URLs you are giving more scope for the user to mistype the URL. Furthermore, say you have the URL www.example.com/Contact. It's easy to confuse an upper and lower case "c" (especially if it is copied in hand writing) if the user overlooks this and uses the wrong case they may never reach your content.

With all this in mind consider www.example.com/News/Articles/FreeIceCreamForAll. On keyboard that's not too difficult but consider this on a mobile device, it would be very fiddly to input.

The reverse is also true should a user want to write down a URL from the address bar. They may feel they need to match the case, ultimately giving them more work to do and increasing the likelyhood of errors.

To conclude; keep URLs lower case.

Clive Williams

REGARDING SECURITY ASPECTS OF THIS ISSUE:

There is actually a good security reason to use a mix of uppercase and lowercase.

It has the effect of confusing and blocking attackers !

In human conversation humans get easily confused with uppercase and lowercase use.

Humans can't "speak" the word of the "identifiers or passwords or url's" with clarity if they contain uppercase and lowercase.

This helps with security on data or passwords on site sub-parts that are provided as part of a locked-in or secure sub-part of an "automated access" part of sites or their data.

It's similar to NOT USING JSON.

JSON is "human-readable text" and so JSON is simply giving all the attackers (Including Governments, Google .. who steal your ideas and data) ... almost everything they need to know about the data ... it's much more secure to confuse them by using private bespoke very-fast "binary protocols" - that use your own "unknowable data structures" ... but just watch out, because it is actually possible to confuse yourself or your own development team.

All your security layers and protocols have to be "well managed" to avoid confusion.

There is therefore an extra level of site and data security from human attackers (and some robots) to be had by simply using totally unconventional systems (i.e. why on earth would anybody want to use a "standard security protocol" when by some simple heavyweight prior computing they can all be easily broken).

Just "salt and hash" everything - plus also add some extra extra bespoke security of your own - it's just commonsense !

Conclusion: All the above answers are very clear and correct - but you can also happily leverage that very same knowledge to confuse potential attackers.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!