Java URL encoding of query string parameters

清歌不尽 2020-11-21 05:27

Say I have a URL

http://example.com/query?q=

and I have a query entered by the user such as:

random word £500 bank

12 Answers
  •  时光取名叫无心
    2020-11-21 06:08

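    // Requires: import java.net.IDN; import java.net.URI; import java.net.URL;
    URL url = new URL("http://example.com/query?q=random word £500 bank $");
    // The multi-argument URI constructor quotes illegal characters in each component.
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), IDN.toASCII(url.getHost()), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
    String correctEncodedURL = uri.toASCIIString();
    System.out.println(correctEncodedURL);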
    

    Prints

    http://example.com/query?q=random%20word%20%C2%A3500%20bank%20$
    

    What is happening here?

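    1. Split the URL into its structural parts. java.net.URL does this.

    2. Encode each structural part properly. The multi-argument java.net.URI constructor quotes illegal characters separately for each component.

    3. Use IDN.toASCII(putDomainNameHere) to Punycode-encode the host name.

    4. Use java.net.URI.toASCIIString() to percent-encode the remaining non-ASCII characters; it normalizes them to NFC and encodes them as UTF-8 (NFKC would be better). For more information, see "How to encode properly this URL".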
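The four steps above can be wrapped in one small helper. This is only a minimal sketch: the class name UrlEncodingSketch and the method name encodeUrl are hypothetical, and it assumes the input is an absolute URL that has not been percent-encoded yet.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlEncodingSketch {

    // Hypothetical helper: encodes a raw, not yet percent-encoded absolute URL.
    public static String encodeUrl(String raw) {
        try {
            // Step 1: split the URL into its structural parts.
            URL url = new URL(raw);
            // Steps 2 and 3: the multi-argument constructor quotes each part,
            // and IDN.toASCII converts the host name to Punycode.
            URI uri = new URI(url.getProtocol(), url.getUserInfo(),
                    IDN.toASCII(url.getHost()), url.getPort(),
                    url.getPath(), url.getQuery(), url.getRef());
            // Step 4: percent-encode the remaining non-ASCII characters.
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            // Assumption: callers prefer an unchecked exception for bad input.
            throw new IllegalArgumentException("Cannot encode URL: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // \u00A3 is the pound sign, escaped to avoid source-encoding problems.
        System.out.println(encodeUrl("http://example.com/query?q=random word \u00A3500 bank $"));
        // prints http://example.com/query?q=random%20word%20%C2%A3500%20bank%20$
    }
}
```

Wrapping the checked exceptions is a design choice made for this sketch, not part of the original answer; rethrowing them unchanged would work just as well.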

    In some cases it is advisable to check whether the URL is already encoded, because the multi-argument URI constructors always quote '%', so an existing %20 would turn into %2520. Also replace spaces encoded as '+' with '%20'.
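One way to approach both checks is sketched below. The class and method names (EncodedCheckSketch, looksEncoded, plusToPercent20) are hypothetical, and the "already encoded" test is only a heuristic: it treats any '%' followed by two hex digits as an escape, which can misfire on raw text that happens to contain such a sequence. Likewise, the '+' replacement assumes every '+' in the query stands for a space.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Pattern;

class EncodedCheckSketch {

    // A percent sign followed by two hex digits, e.g. %20 or %C2.
    private static final Pattern ESCAPE = Pattern.compile("%[0-9A-Fa-f]{2}");

    // Heuristic: true if the string contains at least one percent escape.
    public static boolean looksEncoded(String s) {
        return ESCAPE.matcher(s).find();
    }

    // Assumes every '+' in the query is a form-style encoded space.
    public static String plusToPercent20(String encodedQuery) {
        return encodedQuery.replace("+", "%20");
    }

    public static void main(String[] args) throws URISyntaxException {
        // The multi-argument constructor quotes '%' again, so %20 is double-encoded.
        URI twice = new URI("http", null, "example.com", -1, "/query", "q=a%20b", null);
        System.out.println(twice.toASCIIString()); // http://example.com/query?q=a%2520b

        System.out.println(looksEncoded("q=a%20b"));          // true
        System.out.println(plusToPercent20("q=random+word")); // q=random%20word
    }
}
```

If looksEncoded returns true, the input could instead be parsed with the single-argument new URI(String) constructor, which keeps existing escapes as they are.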

    Here are some more examples that are also handled correctly:

    {
          "in" : "http://نامه‌ای.com/",
         "out" : "http://xn--mgba3gch31f.com/"
    },{
         "in" : "http://www.example.com/‥/foo",
         "out" : "http://www.example.com/%E2%80%A5/foo"
    },{
         "in" : "http://search.barnesandnoble.com/booksearch/first book.pdf", 
         "out" : "http://search.barnesandnoble.com/booksearch/first%20book.pdf"
    }, {
         "in" : "http://example.com/query?q=random word £500 bank $", 
         "out" : "http://example.com/query?q=random%20word%20%C2%A3500%20bank%20$"
    }
    

    The solution passes around 100 of the test cases provided by Web Platform Tests.
