Why do some query strings work even if parameters are not URL-encoded?

盖世英雄少女心 2020-12-03 16:54

Here's an example:

https://drive.google.com/viewerng/viewer?embedded=true&url=http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_

3 Answers
  • 2020-12-03 17:05

    The reserved characters of a URI are mostly used as delimiters. That doesn't mean they may not be used; it only means they have a special purpose, and if you want to use one as plain data in a place where it would otherwise act as a delimiter, you have to percent-encode it.

    The query component starts with the first ? and ends with the first # (if any, otherwise with the end of the URI). For the query component itself, there are no reserved characters defined.

    The URI standard RFC 3986 defines that the query component can contain these characters:

    • a-z, A-Z
    • 0-9
    • / ? : @ ! $ & ' ( ) * + , ; = - . _ ~
    • percent-encoded characters

    It even explicitly mentions:

    The characters slash ("/") and question mark ("?") may represent data within the query component.


    The query component of your example URI is this:

    embedded=true&url=http://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf
    

    Apart from letters, it contains =, &, :, /, ., ?, _, all of which are allowed in the query.
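To check this mechanically, here is a minimal sketch that tests a string against the query grammar of RFC 3986 (`query = *( pchar / "/" / "?" )`). The regular expression is a hand transcription of that rule, and the name `is_valid_query` is made up for this illustration; it is not part of any library.

```python
import re

# Hand-transcribed from RFC 3986:
#   query = *( pchar / "/" / "?" )
#   pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
QUERY_RE = re.compile(
    r"(?:[A-Za-z0-9\-._~]"   # unreserved
    r"|%[0-9A-Fa-f]{2}"      # pct-encoded
    r"|[!$&'()*+,;=]"        # sub-delims
    r"|[:@/?])*"             # the extra characters pchar and query allow
)

def is_valid_query(query: str) -> bool:
    """Hypothetical helper: True if every character may appear in a query."""
    return QUERY_RE.fullmatch(query) is not None

example = ("embedded=true&url=http://journals.plos.org/plosone/s/file"
           "?id=wjVg/PLOSOne_formatting_sample_main_body.pdf")

print(is_valid_query(example))    # True
print(is_valid_query("q=a b"))    # False: a raw space is not allowed
print(is_valid_query("q=a%20b"))  # True: the space is percent-encoded
print(is_valid_query("q=a#b"))    # False: "#" would start the fragment
```

This only checks which characters are permitted; it says nothing about how a server will interpret them.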

    Note that the name=value format (separated by &) in the query component is just a convention, not something defined in the specification.

  • 2020-12-03 17:06

    This is from the relevant RFC, 1738:

    https://www.ietf.org/rfc/rfc1738.txt

    3.3. HTTP

    The HTTP URL scheme is used to designate Internet resources
    accessible using HTTP (HyperText Transfer Protocol).

    The HTTP protocol is specified elsewhere. This specification only
    describes the syntax of HTTP URLs.

    An HTTP URL takes the form:

      http://<host>:<port>/<path>?<searchpart>
    

    where <host> and <port> are as described in Section 3.1. If :<port>
    is omitted, the port defaults to 80. No user name or password is
    allowed. <path> is an HTTP selector, and <searchpart> is a query
    string. The <path> is optional, as is the <searchpart> and its
    preceding "?". If neither <path> nor <searchpart> is present, the "/"
    may also be omitted.

    Within the <path> and <searchpart> components, "/", ";", "?" are
    reserved. The "/" character may be used within HTTP to designate a
    hierarchical structure.

    The special characters in "http://" only have that meaning in the "protocol" (scheme) part at the start of the URL. Most browsers let you leave that part out and implicitly assume "http://".

    The first "?" separates the "path" from the "searchpart". Each "&" separates different arguments in the "searchpart".

    Your browser should differentiate between ?embedded=true and &url=http://www.pdf995.com/samples/pdf.pdf.
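As a rough illustration of that splitting, the sketch below runs Python's standard `urllib.parse` over the URL from the question (using the full query string quoted in the first answer). It shows how one particular parser behaves, which is not necessarily what a given browser or server does.

```python
from urllib.parse import urlsplit, parse_qs

url = ("https://drive.google.com/viewerng/viewer"
       "?embedded=true&url=http://journals.plos.org/plosone/s/file"
       "?id=wjVg/PLOSOne_formatting_sample_main_body.pdf")

# Only the first "?" separates the path from the query.
parts = urlsplit(url)
print(parts.path)   # /viewerng/viewer
print(parts.query)  # everything after the first "?"

# Each "&" separates one name=value pair from the next;
# each pair is split at its first "=" only.
params = parse_qs(parts.query)
print(params["embedded"])  # ['true']
print(params["url"])       # the inner URL, second "?" included
```

The second "?" and the slashes end up inside the value of `url`, because no "&" follows them to start a new pair.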

    Hope that helps.

  • 2020-12-03 17:16

    Because some characters have special meanings in a URL: a question mark (?) separates the path from the query, and an ampersand (&) separates key-value pairs. If we used such a character inside a value in a query string, the parser could not tell data from delimiter, so we encode it to make sure the data is not ambiguous. None of the characters in your example are ambiguous, because each one appears in a place where it is valid according to the HTTP URL scheme.
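A small sketch of where the ambiguity does appear, using Python's standard `urllib.parse`. The inner URL `http://example.com/file?id=1&page=2` is a made-up example, not taken from the question; the point is that it contains an "&" of its own.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical inner URL that itself contains "&".
inner = "http://example.com/file?id=1&page=2"

# Unencoded: the parser cannot know that "&page=2" belongs to the inner URL.
raw_query = "embedded=true&url=" + inner
print(parse_qs(raw_query))
# {'embedded': ['true'], 'url': ['http://example.com/file?id=1'], 'page': ['2']}

# Encoded: urlencode percent-encodes the value, so "&" becomes "%26".
safe_query = urlencode({"embedded": "true", "url": inner})
print(safe_query)
# embedded=true&url=http%3A%2F%2Fexample.com%2Ffile%3Fid%3D1%26page%3D2

print(parse_qs(safe_query)["url"])
# ['http://example.com/file?id=1&page=2']
```

The URL in the question has no "&" or "#" inside the nested address, which is why it survives without encoding.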
