HTTP URL Address Encoding in Java

醉酒成梦 2020-11-22 01:35

My Java standalone application gets a URL (which points to a file) from the user and I need to hit it and download it. The problem I am facing is that I am not able to encod

26 answers
  • 2020-11-22 02:08

    I had the same problem and solved it by using:

    android.net.Uri.encode(urlString, ":/");
    

    It encodes the string but skips ":" and "/".

  • 2020-11-22 02:08

    String url = "http://search.barnesandnoble.com/booksearch/";

    I assume this part is constant and only the file name changes dynamically, so get the file name:

    String fileName; // assign the file name here

    String urlEnc = url + fileName.replace(" ", "%20");
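
    Replacing only spaces leaves other characters that are illegal in a path (a literal "%", for example) untouched. As a hedged alternative, the multi-argument java.net.URI constructors quote illegal path characters themselves. The sketch below assumes the host and directory are fixed, as in this answer; the class name FileUrlSketch, the helper buildFileUrl, and the sample file names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class FileUrlSketch {
    // Hypothetical helper: builds http://<host><dir><fileName> with the path percent-encoded.
    static String buildFileUrl(String host, String dir, String fileName) {
        try {
            // URI(scheme, host, path, fragment) quotes characters that are illegal in a path.
            return new URI("http", host, dir + fileName, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build a URL for: " + fileName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildFileUrl("search.barnesandnoble.com", "/booksearch/", "first book.pdf"));
        // prints http://search.barnesandnoble.com/booksearch/first%20book.pdf
    }
}
```

    Note that a "/" inside the file name is still treated as a path separator here, and a file name that is already percent-encoded would be encoded a second time.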

  • 2020-11-22 02:10

    Please be warned that most of the answers above are INCORRECT.

    The URLEncoder class, despite its name, is NOT what is needed here. It's unfortunate that Sun named this class so misleadingly. URLEncoder is meant for encoding data passed as parameters, not for encoding the URL itself.

    In other words, "http://search.barnesandnoble.com/booksearch/first book.pdf" is the URL. Parameters would be, for example, "http://search.barnesandnoble.com/booksearch/first book.pdf?parameter1=this&param2=that". The parameters are what you would use URLEncoder for.
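
    To make the distinction concrete, here is a minimal sketch of using URLEncoder only on parameter names and values. The class name QueryStringSketch, the helper toQueryString, and the sample values are illustrative assumptions; the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class QueryStringSketch {
    // Hypothetical helper: encodes each name and value, then joins the pairs with '&'.
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            String name = URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8);
            String value = URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8);
            joiner.add(name + "=" + value);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("parameter1", "this & that");
        params.put("param2", "a+b");
        System.out.println(toQueryString(params));
        // prints parameter1=this+%26+that&param2=a%2Bb
    }
}
```

    The "&" and "=" that separate the pairs are added after encoding, so only the data inside each name and value is escaped.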

    The following two examples highlight the difference between the two.

    The following produces the wrong parameters, according to the HTTP standard. Note that the ampersand (&) and plus (+) in the query are left unencoded.

    uri = new URI("http", null, "www.google.com", 80, 
    "/help/me/book name+me/", "MY CRZY QUERY! +&+ :)", null);
    
    // URI: http://www.google.com:80/help/me/book%20name+me/?MY%20CRZY%20QUERY!%20+&+%20:)
    

    The following will produce the correct parameters, with the query properly encoded. Note the spaces, ampersands, and plus marks.

    uri = new URI("http", null, "www.google.com", 80, "/help/me/book name+me/", URLEncoder.encode("MY CRZY QUERY! +&+ :)", "UTF-8"), null);
    
    // URI: http://www.google.com:80/help/me/book%20name+me/?MY+CRZY+QUERY%2521+%252B%2526%252B+%253A%2529
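
    Note that the "%25" sequences in the output above arise because the multi-argument URI constructors always quote "%", so the escapes produced by URLEncoder are encoded a second time. If that is not wanted, one option is to let URI encode only the path and append the already-encoded query as a raw string. This is a sketch under assumptions: the parameter name "q" and the class name UriWithQuerySketch are hypothetical, and the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class UriWithQuerySketch {
    // Hypothetical helper: URI encodes the path, URLEncoder encodes the parameter value.
    static URI build(String host, int port, String path, String queryValue) {
        try {
            // No query is passed to the constructor, so nothing gets double-encoded.
            URI base = new URI("http", null, host, port, path, null, null);
            String query = "q=" + URLEncoder.encode(queryValue, StandardCharsets.UTF_8);
            // URI.create parses the string as-is and keeps existing %XX escapes.
            return URI.create(base.toASCIIString() + "?" + query);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build a URI for path: " + path, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("www.google.com", 80, "/help/me/book name+me/", "MY CRZY QUERY! +&+ :)"));
        // prints http://www.google.com:80/help/me/book%20name+me/?q=MY+CRZY+QUERY%21+%2B%26%2B+%3A%29
    }
}
```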
    
  • 2020-11-22 02:10

    You could try UriUtils from org.springframework.web.util:

    UriUtils.encodeUri(input, "UTF-8")
    
  • 2020-11-22 02:12

    Yes, URL encoding will encode that string so that it can be passed properly in a URL to its final destination. For example, you cannot use http://stackoverflow.com?url=http://yyy.com as it stands; URL-encoding the parameter value fixes that.

    So I have two options for you:

    1. Do you have access to the path separately from the domain? If so, you may be able to simply URL-encode the path. If not, option 2 may be for you.

    2. Get commons-httpclient-3.1. This has a class URIUtil:

      System.out.println(URIUtil.encodePath("http://example.com/x y", "ISO-8859-1"));

    This will output exactly what you are looking for, as it will only encode the path part of the URI.

    FYI, you'll need commons-codec and commons-logging for this method to work at runtime.
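
    If pulling in commons-httpclient is not an option, a similar effect can be sketched with the JDK alone: java.net.URL is lenient about spaces when parsing, and the multi-argument java.net.URI constructor then quotes illegal characters in each component. The class name UrlPathEncodingSketch and the helper encode are illustrative, and the sketch assumes the input is not already percent-encoded (a "%" would be encoded again).

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

final class UrlPathEncodingSketch {
    // Hypothetical helper: splits a raw URL into components and lets URI quote each one.
    static String encode(String rawUrl) {
        try {
            // new URL(String) is deprecated since Java 20 but still available.
            URL url = new URL(rawUrl);
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode URL: " + rawUrl, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("http://example.com/x y"));
        // prints http://example.com/x%20y
    }
}
```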

  • 2020-11-22 02:12

    There is still a problem if you have an encoded "/" (%2F) in your URL.

    RFC 3986 - Section 2.2 says: "If data for a URI component would conflict with a reserved character's purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed."
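
    As an illustration of how a "%2F" ends up in a URL, here is a small sketch that encodes a single path segment containing a literal slash. It uses a common workaround rather than a dedicated API: URLEncoder targets form data and writes a space as "+", so the "+" is rewritten to "%20" afterwards. The class name PathSegmentSketch, the helper encodeSegment, and the sample segment are assumptions; the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PathSegmentSketch {
    // Hypothetical helper: percent-encodes one path segment, including any '/' inside it.
    static String encodeSegment(String segment) {
        // A literal '+' in the input becomes %2B, so every '+' left afterwards stands for a space.
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println("/albums/" + encodeSegment("AC/DC live"));
        // prints /albums/AC%2FDC%20live
    }
}
```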

    But there is an issue with Tomcat:

    http://tomcat.apache.org/security-6.html - Fixed in Apache Tomcat 6.0.10

    important: Directory traversal CVE-2007-0450

    Tomcat permits '\', '%2F' and '%5C' [...] .

    The following Java system properties have been added to Tomcat to provide additional control of the handling of path delimiters in URLs (both options default to false):

    • org.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH: true|false
    • org.apache.catalina.connector.CoyoteAdapter.ALLOW_BACKSLASH: true|false

    Due to the impossibility to guarantee that all URLs are handled by Tomcat as they are in proxy servers, Tomcat should always be secured as if no proxy restricting context access was used.

    Affects: 6.0.0-6.0.9

    So if you have a URL containing %2F, Tomcat returns: "400 Invalid URI: noSlash"

    You can switch off the bugfix in the Tomcat startup script:

    set JAVA_OPTS=%JAVA_OPTS% %LOGGING_CONFIG%   -Dorg.apache.tomcat.util.buf.UDecoder.ALLOW_ENCODED_SLASH=true 
    