Characters allowed in a URL

没有蜡笔的小新 2020-11-22 05:33

Does anyone know the full list of characters that can be used within a GET without being encoded? At the moment I am using A-Z a-z and 0-9... but I am looking to find out th

9 answers
  •  情话喂你
    2020-11-22 06:13

    From the RFC 1738 specification:

    Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.
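As a rough illustration of the RFC 1738 rule quoted above, the sketch below lists the characters of a string that fall outside "alphanumerics plus `$-_.+!*'(),`". The names `RFC1738_SAFE` and `chars_outside_rfc1738_safe` are made up for this example. It deliberately ignores the "reserved characters used for their reserved purposes" clause, so a reserved character such as `&` is reported even where it would be legal as a delimiter.

```python
import string

# Special characters RFC 1738 allows unencoded, in addition to alphanumerics.
RFC1738_SPECIALS = "$-_.+!*'(),"

# Hypothetical helper set: alphanumerics plus the specials above.
RFC1738_SAFE = set(string.ascii_letters + string.digits + RFC1738_SPECIALS)


def chars_outside_rfc1738_safe(text: str) -> list[str]:
    """Return, in order, the characters of text that are not in the safe set.

    Reserved characters (e.g. '&', '/', '?') are reported too, because whether
    they may stay unencoded depends on where they appear in the URL.
    """
    return [ch for ch in text if ch not in RFC1738_SAFE]
```

An empty result means every character is in the safe set; anything returned needs either encoding or a closer look at its role in the URL.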

    EDIT: As @Jukka K. Korpela correctly points out, this RFC was updated by RFC 3986, which expanded and clarified the characters valid for host. Unfortunately the grammar is not easily copied and pasted, but I'll do my best.

    In first-match order:

    host        = IP-literal / IPv4address / reg-name
    
    IP-literal  = "[" ( IPv6address / IPvFuture  ) "]"
    
    IPvFuture   = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
    
    IPv6address =         6( h16 ":" ) ls32
                      /                       "::" 5( h16 ":" ) ls32
                      / [               h16 ] "::" 4( h16 ":" ) ls32
                      / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                      / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                      / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                      / [ *4( h16 ":" ) h16 ] "::"              ls32
                      / [ *5( h16 ":" ) h16 ] "::"              h16
                      / [ *6( h16 ":" ) h16 ] "::"
    
    ls32        = ( h16 ":" h16 ) / IPv4address
                      ; least-significant 32 bits of address
    
    h16         = 1*4HEXDIG 
                   ; 16 bits of address represented in hexadecimal
    
    IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
    
    dec-octet   = DIGIT                 ; 0-9
                  / %x31-39 DIGIT         ; 10-99
                  / "1" 2DIGIT            ; 100-199
                  / "2" %x30-34 DIGIT     ; 200-249
                  / "25" %x30-35          ; 250-255
    
    reg-name    = *( unreserved / pct-encoded / sub-delims )
    
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"     ; this seems like a practical shortcut, most closely resembling the original answer
    
    reserved    = gen-delims / sub-delims
    
    gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    
    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
    
    pct-encoded = "%" HEXDIG HEXDIG
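To make the `unreserved` and `pct-encoded` rules above concrete, here is a minimal sketch that leaves unreserved characters alone and percent-encodes every other byte of the UTF-8 form. The names `UNRESERVED` and `percent_encode` are chosen for this example and are not from the RFC. Encoding everything outside `unreserved` is a conservative assumption suited to a single query value, not a rule for whole URLs.

```python
import string

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"   (RFC 3986)
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def percent_encode(value: str) -> str:
    """Percent-encode every byte of value that is not an unreserved character.

    The string is encoded as UTF-8 first, so a non-ASCII character becomes
    several "%XX" triplets (pct-encoded = "%" HEXDIG HEXDIG).
    """
    out = []
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)               # safe to emit as-is
        else:
            out.append(f"%{byte:02X}")   # two uppercase hex digits
    return "".join(out)
```

For example, `percent_encode("a b~c/d")` gives `a%20b~c%2Fd`. On Python 3.7 and later, `urllib.parse.quote(value, safe="")` from the standard library should produce the same output.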
    
