Question
If I use a character in a URL that is not allowed, for example a space:
<a href="pa ge.php">link</a>
and click this link, the browser address bar shows mysite.com/pa%20ge
That is fine. But if I now use Georgian (or, for example, Russian) letters:
<a href="აბცდ.php">link</a>
the browser address bar shows mysite/აბცდ.php
That is, these non-Latin letters are not changed; they appear in the URL in their original form.
Question: why? Are non-Latin letters also allowed in a URL?
Answer 1:
No, a URL can only contain (a subset of) ASCII.
The browser converts "აბცდ" into percent-encoded UTF-8 bytes for the actual URL that is sent to the server. In fact, you should be embedding it as a percent-encoded string in your document to begin with; the browser is just covering that mistake for you.
What the browser shows in the address bar is something different. Modern browsers try to be as user friendly as possible and decode some percent-encoded characters so that the address bar shows human-readable text. For anti-spoofing reasons, only some are decoded, not all. Georgian happens to be pretty safe, since its letters are hard to mistake for similar-looking characters from other scripts.
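As a rough illustration of the encoding step described above, here is a minimal sketch using `urllib.parse.quote` from Python's standard library. Python is used only for illustration (the pages in the question are PHP), and the variable names are made up for this example; the filenames are the ones from the question.

```python
from urllib.parse import quote

# Filenames taken from the question. quote() replaces every character
# outside the unreserved ASCII set with the percent-encoded form of its
# UTF-8 bytes, which is roughly what the browser does before sending
# the request.
space_name = quote("pa ge.php")
georgian_name = quote("აბცდ.php")

print(space_name)     # pa%20ge.php
print(georgian_name)  # %E1%83%90%E1%83%91%E1%83%AA%E1%83%93.php
```

The encoded string is the value you would put into the `href` attribute if you wanted the document itself to contain a valid URL.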
Answer 2:
Those characters are internally percent encoded as well, but the browser displays them in their original format as a courtesy to the user. When you copy & paste the URL, you will see the percent encoding is in place:
http://domain.com/mysite.აბცდ.php
becomes
http://domain.com/mysite.%E1%83%90%E1%83%91%E1%83%AA%E1%83%93.php
See this answer for background information.
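To see the relationship between the two forms of the URL, a small Python sketch (again only an illustration, with hypothetical variable names) can decode the percent-encoded version back into readable text, which is approximately what the address bar does for display:

```python
from urllib.parse import unquote

# The percent-encoded URL from this answer.
encoded = "http://domain.com/mysite.%E1%83%90%E1%83%91%E1%83%AA%E1%83%93.php"

# unquote() turns each %XX sequence back into a byte and decodes the
# result as UTF-8.
decoded = unquote(encoded)
print(decoded)  # http://domain.com/mysite.აბცდ.php

# Each Georgian letter takes three bytes in UTF-8, hence three %XX
# groups per letter in the encoded URL.
first_letter_bytes = "ა".encode("utf-8")
print(first_letter_bytes.hex())  # e18390
```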
Source: https://stackoverflow.com/questions/13540092/non-latin-symbols-in-url-php