What are the legal and illegal characters in a URL/link?

盖世英雄少女心 2021-01-14 07:02

What happens if there is an illegal character? Does the URL fix itself by encoding the illegal characters into something else?

2 answers
  • 2021-01-14 07:46

    As explained here

    The characters allowed in a URI are ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;= (plus % when it introduces a percent-encoding). Any other character needs to be percent-encoded (%hh). Each part of the URI has further restrictions on which characters must be represented by a percent-encoded sequence.
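
As a rough illustration of the character list above, here is a minimal Python sketch that reports which characters in a string could not appear literally in a URI and percent-encodes a single piece of text. The helper names `illegal_chars` and `encode_segment` are made up for this example; only `urllib.parse.quote` comes from the standard library.

```python
from urllib.parse import quote

# The characters listed above: unreserved plus reserved characters.
URI_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "-._~:/?#[]@!$&'()*+,;="
)

def illegal_chars(uri: str) -> list:
    """Return the characters that may not appear literally in a URI.

    '%' is skipped here because it is legal when it starts a %hh escape;
    this sketch does not verify that two hex digits actually follow it.
    """
    return sorted({c for c in uri if c not in URI_CHARS and c != "%"})

def encode_segment(text: str) -> str:
    """Percent-encode everything except unreserved characters (UTF-8 bytes)."""
    return quote(text, safe="")

print(illegal_chars("http://example.com/a b<c>"))  # [' ', '<', '>']
print(encode_segment("a b/ü"))                     # a%20b%2F%C3%BC
```

Note that `encode_segment` also encodes reserved characters such as `/`, which is what you usually want for one path segment or one query value, but not for a whole URI.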

  • 2021-01-14 07:52

    Allowed characters

    RFC 3986 defines which characters are allowed in which URI components.

    RFCs for specific URI schemes might further restrict this.

    If you are interested in HTTP/HTTPS URIs: they are defined in RFC 7230 (since obsoleted by RFC 9110, which now defines the http and https schemes). AFAIK they don’t have further restrictions regarding allowed characters, so you could stick to the definitions in RFC 3986.
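
To make the "which characters in which component" point concrete, here is a small Python sketch that encodes path segments and query values separately before assembling an HTTPS URI. The function `build_uri` and its parameters are hypothetical names for this example; it leans on `urllib.parse` from the standard library and encodes more conservatively than RFC 3986 strictly requires.

```python
from urllib.parse import quote, urlencode, urlunsplit

def build_uri(host: str, path_segments: list, params: dict) -> str:
    """Assemble an https URI, percent-encoding each component on its own."""
    # Inside a path segment, '/' and '?' would be read as delimiters,
    # so they are encoded along with everything else that is not unreserved.
    path = "/" + "/".join(quote(seg, safe="") for seg in path_segments)
    # Inside a query value, '&' and '=' would be read as delimiters.
    query = urlencode(params, quote_via=quote)
    return urlunsplit(("https", host, path, query, ""))

print(build_uri("example.com", ["docs", "a/b?c"], {"q": "x y&z"}))
# https://example.com/docs/a%2Fb%3Fc?q=x%20y%26z
```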

    What happens if illegal characters are used?

    Depends on many factors … could be anything from "nothing happens" to "doesn’t work anymore".

    Does the URL fix itself by encoding the illegal characters into something else?

    A URI can’t fix itself, it’s just a string.

    Clients working with this URI (browser, server, email client, etc.) may try to fix a URI (or work with invalid URIs) according to their own rules.
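
As one example of what such client-side repair might look like, here is a lenient Python sketch that percent-encodes only the characters that are not allowed in a URI and leaves delimiters and existing escapes alone. This is just one possible rule set, not what any particular browser or server does; `repair_uri` is a hypothetical name.

```python
from urllib.parse import quote

# Reserved characters plus '%', so delimiters and existing %hh escapes
# pass through untouched. Unreserved characters are never encoded by quote().
KEEP = ":/?#[]@!$&'()*+,;=%"

def repair_uri(uri: str) -> str:
    """Percent-encode characters that are not allowed in a URI at all."""
    return quote(uri, safe=KEEP)

print(repair_uri("http://example.com/a b%20c<d>"))
# http://example.com/a%20b%20c%3Cd%3E
```

A stray `%` that is not followed by two hex digits passes through unchanged, so the result is not guaranteed to be valid. A different client could just as reasonably reject such input outright.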

    URI vs. link

    Also note that there’s a difference between a URI and linking to (or storing etc.) this URI in a document.
    The host language (e.g., HTML) might have rules about what to encode. This does not change the URI, only the way the URI is stored/specified in this document.

    For example, the valid URI http://example.com/a&b would have to be linked like this in HTML documents:

    <a href="http://example.com/a&amp;b">Link</a>
    

    But the URI is still http://example.com/a&b, not http://example.com/a&amp;b.
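
The same distinction can be sketched in Python with the standard `html` module: escaping applies only to the representation inside the HTML document, and unescaping it gives back the original URI. The variable names are chosen for this example only.

```python
from html import escape, unescape

uri = "http://example.com/a&b"

# Escape for use inside an HTML attribute; this changes the markup, not the URI.
attr_value = escape(uri, quote=True)
markup = f'<a href="{attr_value}">Link</a>'

print(markup)  # <a href="http://example.com/a&amp;b">Link</a>

# An HTML parser undoes the escaping before the URI is used.
assert unescape(attr_value) == uri
```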
