Valid characters for directory part of a URL (for short links)

悲&欢浪女 2020-11-29 00:03

Are there any other characters except A-Za-z0-9 that can be used to shorten links without getting into trouble? :)

I was thinking about +,;- or something.

Is

2 Answers
  • 2020-11-29 00:39

    According to RFC 3986 the valid characters for the path component are:

    a-z A-Z 0-9 . - _ ~ ! $ & ' ( ) * + , ; = : @
    

    as well as percent-encoded characters and of course, the slash /.

    Keep in mind, though, that many applications (not necessarily browsers) that attempt to parse URIs to make them clickable, for example, may support a much smaller set of characters. This is akin to parsing e-mail addresses where most attempts also don't catch all addresses allowed by the standard.
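As a rough illustration of the character list above, the following Python sketch percent-encodes everything in a single path segment except the characters RFC 3986 allows literally. The helper name `encode_segment` is made up for this example. It relies on `urllib.parse.quote`, which always leaves letters, digits and `_.-~` untouched (`~` since Python 3.7), so only the sub-delimiters plus `:` and `@` need to be passed as `safe`.

```python
from urllib.parse import quote

# Sub-delimiters plus ":" and "@". Letters, digits and "_.-~" are always
# left alone by quote(), so they do not need to be listed here.
# "/" is deliberately absent: inside a single segment it must be encoded.
SAFE_IN_SEGMENT = "!$&'()*+,;=:@"

def encode_segment(segment: str) -> str:
    """Percent-encode every character that may not appear literally
    in one path segment (hypothetical helper for illustration)."""
    return quote(segment, safe=SAFE_IN_SEGMENT)

print(encode_segment("x+y;z=1"))  # allowed characters pass through
print(encode_segment("a b/c"))    # space and slash get percent-encoded
```

Whether a given link parser or chat client then treats such a URL as clickable is a separate question, as noted above.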

  • 2020-11-29 00:48

    A path segment (each part of a path between / separators) in an absolute URI path consists of zero or more pchar characters, where pchar is defined as follows:

      pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
      pct-encoded = "%" HEXDIG HEXDIG
      unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
      sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
    

    So it’s basically A-Z, a-z, 0-9, -, ., _, ~, !, $, &, ', (, ), *, +, ,, ;, =, :, @, as well as %, which must be followed by two hexadecimal digits. Any other character or byte needs to be percent-encoded.
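The grammar above maps fairly directly onto a regular expression. Here is a minimal Python sketch, assuming you only want to check one segment at a time. The names `LITERAL_PCHARS`, `SEGMENT_RE` and `is_valid_segment` are invented for this example.

```python
import re
import string

# Characters that may appear literally in a path segment (RFC 3986 pchar).
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="
LITERAL_PCHARS = UNRESERVED + SUB_DELIMS + ":@"

# A segment is zero or more pchars: either one literal character from the
# set above, or "%" followed by exactly two hexadecimal digits.
SEGMENT_RE = re.compile(
    "(?:[" + re.escape(LITERAL_PCHARS) + "]|%[0-9A-Fa-f]{2})*"
)

def is_valid_segment(segment: str) -> bool:
    """Return True if `segment` matches the pchar grammar quoted above."""
    return SEGMENT_RE.fullmatch(segment) is not None

print(is_valid_segment("a+b;c"))  # only allowed literal characters
print(is_valid_segment("a b"))    # contains an unencoded space
print(is_valid_segment("a%2"))    # "%" not followed by two hex digits
```

The empty string is accepted too, since the grammar allows zero characters. For a short link you would presumably also want to reject that.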

    Although that makes 79 characters in total that can be used literally in a path segment, some user agents percent-encode some of them anyway (e.g. %7E instead of ~). That’s why many services use just the 62 alphanumeric characters (i.e. A-Z, a-z, 0-9) or the Base 64 Encoding with URL and Filename Safe Alphabet from RFC 4648 (i.e. A-Z, a-z, 0-9, -, _).
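For short links, the two conservative alphabets mentioned above might be used like this. This is only a sketch: it assumes the link is derived from a non-negative integer ID, the base-62 digit order (digits, then uppercase, then lowercase) is an arbitrary choice, and all function names are hypothetical.

```python
import base64
import string

# Assumed digit order for base 62; any fixed order of the 62 characters works.
BASE62_ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def encode_base62(number: int) -> str:
    """Encode a non-negative integer using only A-Z, a-z, 0-9."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62_ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62_ALPHABET[remainder])
    return "".join(reversed(digits))

def decode_base62(text: str) -> int:
    """Inverse of encode_base62."""
    number = 0
    for char in text:
        number = number * 62 + BASE62_ALPHABET.index(char)
    return number

def encode_base64url(number: int) -> str:
    """Encode a non-negative integer with the URL-safe Base64 alphabet,
    dropping the "=" padding so only A-Z, a-z, 0-9, "-" and "_" remain."""
    raw = number.to_bytes(max(1, (number.bit_length() + 7) // 8), "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

print(encode_base62(62))      # smallest two-character code
print(encode_base64url(255))  # uses "_" from the URL-safe alphabet
```

Base 62 avoids punctuation entirely, which sidesteps the link-detection problems mentioned in the other answer. The URL-safe Base64 variant gives slightly shorter codes for large IDs at the cost of allowing `-` and `_`.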
