The origin of why '%20' is used as a space in URLs

悲&欢浪女 2020-12-09 07:44

I am interested in knowing why '%20' is used as a space in URLs: in particular, why %20 was chosen and why we need an encoding at all.

3 answers
  • 2020-12-09 08:01

    It's called percent-encoding. Some characters can't appear literally in a URI (for example #, which marks the start of the URL fragment), so they are represented by characters that can (# becomes %23).

    Here's an excerpt from the Wikipedia article on percent-encoding:

    When a character from the reserved set (a "reserved character") has special meaning (a "reserved purpose") in a certain context, and a URI scheme says that it is necessary to use that character for some other purpose, then the character must be percent-encoded. Percent-encoding a reserved character involves converting the character to its corresponding byte value in ASCII and then representing that value as a pair of hexadecimal digits. The digits, preceded by a percent sign ("%") which is used as an escape character, are then used in the URI in place of the reserved character. (For a non-ASCII character, it is typically converted to its byte sequence in UTF-8, and then each byte value is represented as above.)

    The space character's character code is 32:

    > ' '.charCodeAt(0)
    32
    

    Which is 20 in base-16:

    > ' '.charCodeAt(0).toString(16)
    "20"
    

    Tack a percent sign in front of it and you get %20.
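
    The steps above can be put together in a few lines. This is only a minimal sketch: `percentEncodeAscii` is a hypothetical helper name, not something from the answer, and it assumes a single ASCII character as input. The built-in `encodeURIComponent` is shown alongside for comparison.

    ```javascript
    // Hypothetical helper (an illustration, not a library function):
    // percent-encode one ASCII character by hand.
    function percentEncodeAscii(ch) {
      const code = ch.charCodeAt(0); // ' ' -> 32
      if (code > 0x7f) {
        // Assumption of this sketch: non-ASCII input is out of scope.
        throw new RangeError('ASCII characters only');
      }
      // Convert to base 16 and pad to two digits: 32 -> "20"
      const hex = code.toString(16).toUpperCase().padStart(2, '0');
      return '%' + hex;
    }

    console.log(percentEncodeAscii(' ')); // %20
    console.log(percentEncodeAscii('#')); // %23
    console.log(encodeURIComponent(' ')); // %20 (built-in, same result)
    ```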

  • 2020-12-09 08:01

    It uses percent-encoding. See the "Percent-Encoding" section (2.1) of RFC 3986, Uniform Resource Identifier (URI): Generic Syntax:

    A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component. A percent-encoded octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing that octet's numeric value. For example, "%20" is the percent-encoding for the binary octet "00100000" (ABNF: %x20), which in US-ASCII corresponds to the space character (SP).
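
    The octet-to-triplet rule in the quoted passage can be sketched as follows. `percentEncodeOctets` is a hypothetical name chosen for illustration, and the sketch assumes the text is first turned into UTF-8 octets with the standard `TextEncoder`. It encodes every octet, whereas a real encoder would leave allowed characters untouched.

    ```javascript
    // Hypothetical helper: encode EVERY octet of a string as "%" + two hex digits.
    // Assumption: the string is converted to octets using UTF-8.
    function percentEncodeOctets(text) {
      const octets = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
      return Array.from(octets, (octet) =>
        '%' + octet.toString(16).toUpperCase().padStart(2, '0')
      ).join('');
    }

    console.log(percentEncodeOctets(' '));      // %20 (octet 00100000)
    console.log(percentEncodeOctets('\u00e9')); // %C3%A9 (two UTF-8 octets for "é")
    ```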

  • 2020-12-09 08:06

    Because URLs have strict syntactic rules: / is a special path separator character, spaces are not allowed, and every character must come from a certain subset of ASCII. To embed arbitrary characters in a URL despite these restrictions, their bytes can be percent-encoded. The byte 0x20 represents a space in ASCII (and in most other encodings), hence %20 is the URL-encoded version of it.
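
    As a rough illustration of embedding arbitrary characters, the built-in `encodeURIComponent` and `decodeURIComponent` can be used. The file name and the `example.com` address below are made up for the example.

    ```javascript
    // Hypothetical data: a file name containing a space and a '#'.
    const fileName = 'my report #1.txt';

    // Encode the name before placing it in the path.
    const encoded = encodeURIComponent(fileName);
    console.log(encoded); // my%20report%20%231.txt

    // example.com is a placeholder host, not a real endpoint.
    const safeUrl = new URL('https://example.com/files/' + encoded);
    console.log(safeUrl.hash); // "" (the '#' is data, not a fragment marker)

    // Without encoding, the '#' starts a fragment and the name is cut short.
    const rawUrl = new URL('https://example.com/files/' + fileName);
    console.log(rawUrl.hash); // #1.txt

    // Decoding restores the original characters.
    console.log(decodeURIComponent(encoded) === fileName); // true
    ```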
