HTTP URL Address Encoding in Java

醉酒成梦 2020-11-22 01:35

My Java standalone application gets a URL (which points to a file) from the user and I need to hit it and download it. The problem I am facing is that I am not able to encod

26 Answers
  • 2020-11-22 02:18

    I develop a library that serves this purpose: galimatias. It parses URLs the same way web browsers do. That is, if a URL works in a browser, galimatias will parse it correctly.

    In this case:

    // Parse
    io.mola.galimatias.URL.parse(
        "http://search.barnesandnoble.com/booksearch/first book.pdf"
    ).toString()
    

    This gives you: http://search.barnesandnoble.com/booksearch/first%20book.pdf. Of course this is the simplest case, but the library also handles many inputs that java.net.URI cannot.

    You can check it out at: https://github.com/smola/galimatias

  • 2020-11-22 02:23

    Here is one suggestion aimed at Android users. It avoids pulling in any external libraries. Also, the search-and-replace solutions suggested in some of the other answers are perilous and should be avoided.

    Give this a try:

    String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
    URL url = new URL(urlStr);
    URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
    url = uri.toURL();
    

    You can see that in this particular URL, I need to have those spaces encoded so that I can use it for a request.

    This takes advantage of a couple of features of the java.net classes, which are also available on Android. First, the URL class breaks a URL into its proper components, so there is no need for any string search-and-replace work. Second, the URI class properly escapes each component when you construct a URI from components rather than from a single string.

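    The beauty of this approach is that you can take any valid URL string and make it usable without any special knowledge of its contents. One caveat: the multi-argument URI constructors always quote the percent character, so a URL that is already percent-encoded will be encoded a second time.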
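
    The steps above can be wrapped in a small helper. This is a minimal sketch rather than the answer author's code: the class name UrlComponentEncoder and the method encode are hypothetical, and it assumes the input is not already percent-encoded.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper; the class and method names are chosen for illustration only.
class UrlComponentEncoder {

    // Splits the raw string with java.net.URL, then lets the multi-argument
    // java.net.URI constructor quote illegal characters in each component.
    @SuppressWarnings("deprecation") // new URL(String) is deprecated since Java 20
    static String encode(String raw) {
        try {
            URL url = new URL(raw);
            URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                    url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode: " + raw, e);
        }
    }

    public static void main(String[] args) {
        // prints http://abc.dev.domain.com/0007AC/ads/800x480%2015sec%20h.264.mp4
        System.out.println(encode("http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4"));
        // Assumption violated on purpose: the existing '%' is quoted again.
        // prints http://example.com/a%2520b
        System.out.println(encode("http://example.com/a%20b"));
    }
}
```

    The checked exceptions are wrapped in IllegalArgumentException only to keep the call site short; whether that is appropriate depends on how the application reports bad user input.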

  • 2020-11-22 02:24

    If anybody doesn't want to add a dependency to their project, these functions may be helpful.

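    We pass the 'path' part of our URL into these functions. Note that '/' is escaped too, so a path with several segments has to be encoded one segment at a time. You probably don't want to pass the full URL in as a parameter (query strings need different escapes, etc).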

    /**
     * Percent-encodes a string so it's suitable for use in a URL Path (not a query string / form encode, which uses + for spaces, etc)
     */
    public static String percentEncode(String encodeMe) {
        if (encodeMe == null) {
            return "";
        }
        // Encode '%' first so the escapes added below are not themselves re-encoded
        String encoded = encodeMe.replace("%", "%25");
        encoded = encoded.replace(" ", "%20");
        encoded = encoded.replace("!", "%21");
        encoded = encoded.replace("#", "%23");
        encoded = encoded.replace("$", "%24");
        encoded = encoded.replace("&", "%26");
        encoded = encoded.replace("'", "%27");
        encoded = encoded.replace("(", "%28");
        encoded = encoded.replace(")", "%29");
        encoded = encoded.replace("*", "%2A");
        encoded = encoded.replace("+", "%2B");
        encoded = encoded.replace(",", "%2C");
        encoded = encoded.replace("/", "%2F");
        encoded = encoded.replace(":", "%3A");
        encoded = encoded.replace(";", "%3B");
        encoded = encoded.replace("=", "%3D");
        encoded = encoded.replace("?", "%3F");
        encoded = encoded.replace("@", "%40");
        encoded = encoded.replace("[", "%5B");
        encoded = encoded.replace("]", "%5D");
        return encoded;
    }
    
    /**
     * Percent-decodes a string, such as one used in a URL path (not a query string / form encoding, which uses + for spaces, etc.)
     */
    public static String percentDecode(String decodeMe) {
        if (decodeMe == null) {
            return "";
        }
        // "%25" is decoded last (at the end of this method) so a literal '%' is never decoded twice
        String decoded = decodeMe.replace("%21", "!");
        decoded = decoded.replace("%20", " ");
        decoded = decoded.replace("%23", "#");
        decoded = decoded.replace("%24", "$");
        decoded = decoded.replace("%26", "&");
        decoded = decoded.replace("%27", "'");
        decoded = decoded.replace("%28", "(");
        decoded = decoded.replace("%29", ")");
        decoded = decoded.replace("%2A", "*");
        decoded = decoded.replace("%2B", "+");
        decoded = decoded.replace("%2C", ",");
        decoded = decoded.replace("%2F", "/");
        decoded = decoded.replace("%3A", ":");
        decoded = decoded.replace("%3B", ";");
        decoded = decoded.replace("%3D", "=");
        decoded = decoded.replace("%3F", "?");
        decoded = decoded.replace("%40", "@");
        decoded = decoded.replace("%5B", "[");
        decoded = decoded.replace("%5D", "]");
        decoded = decoded.replace("%25", "%");
        return decoded;
    }
    

    And tests:

    // Uses JUnit's @Test annotation and assertEquals
    @Test
    public void testPercentEncode_Decode() {
        assertEquals("", percentDecode(percentEncode(null)));
        assertEquals("", percentDecode(percentEncode("")));
    
        assertEquals("!", percentDecode(percentEncode("!")));
        assertEquals("#", percentDecode(percentEncode("#")));
        assertEquals("$", percentDecode(percentEncode("$")));
        assertEquals("@", percentDecode(percentEncode("@")));
        assertEquals("&", percentDecode(percentEncode("&")));
        assertEquals("'", percentDecode(percentEncode("'")));
        assertEquals("(", percentDecode(percentEncode("(")));
        assertEquals(")", percentDecode(percentEncode(")")));
        assertEquals("*", percentDecode(percentEncode("*")));
        assertEquals("+", percentDecode(percentEncode("+")));
        assertEquals(",", percentDecode(percentEncode(",")));
        assertEquals("/", percentDecode(percentEncode("/")));
        assertEquals(":", percentDecode(percentEncode(":")));
        assertEquals(";", percentDecode(percentEncode(";")));
    
        assertEquals("=", percentDecode(percentEncode("=")));
        assertEquals("?", percentDecode(percentEncode("?")));
        assertEquals("%", percentDecode(percentEncode("%")));
        assertEquals("[", percentDecode(percentEncode("[")));
        assertEquals("]", percentDecode(percentEncode("]")));
        assertEquals(" ", percentDecode(percentEncode(" ")));
    
        // Get a little complex
        assertEquals("[]]", percentDecode(percentEncode("[]]")));
        assertEquals("a=d%*", percentDecode(percentEncode("a=d%*")));
        assertEquals(")  (", percentDecode(percentEncode(")  (")));
        assertEquals("%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25",
                        percentEncode("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %"));
        assertEquals("! * ' ( % ) ; : @ & = + $ , / ? # [ ] %", percentDecode(
                        "%21%20%2A%20%27%20%28%20%25%20%29%20%3B%20%3A%20%40%20%26%20%3D%20%2B%20%24%20%2C%20%2F%20%3F%20%23%20%5B%20%5D%20%25"));
    
        assertEquals("%23456", percentDecode(percentEncode("%23456")));
    
    }
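
    The order of the replacements matters: '%' is encoded first and "%25" is decoded last. The sketch below illustrates why, using a deliberately tiny two-entry table. The class DecodeOrderDemo and its methods are hypothetical and only stand in for the full functions above.

```java
// Hypothetical demo; only "%21" and "%25" are handled to keep the point visible.
class DecodeOrderDemo {

    // Same order as percentDecode above: the literal "%25" is decoded last.
    static String decodePercentLast(String s) {
        return s.replace("%21", "!").replace("%25", "%");
    }

    // Wrong order: decoding "%25" first produces a fresh "%21",
    // which the second replace then decodes again.
    static String decodePercentFirst(String s) {
        return s.replace("%25", "%").replace("%21", "!");
    }

    public static void main(String[] args) {
        String encoded = "%2521"; // percent-encoded form of the literal text "%21"
        System.out.println(decodePercentLast(encoded));  // prints %21
        System.out.println(decodePercentFirst(encoded)); // prints !
    }
}
```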
    
  • 2020-11-22 02:24

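    You can also use Guava's URL escapers: UrlEscapers.urlFragmentEscaper().escape(relativePath). Unlike urlPathSegmentEscaper(), the fragment escaper leaves '/' unescaped, so it can be applied to a relative path that contains several segments.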

  • 2020-11-22 02:28

    Nitpicking: a string containing a whitespace character by definition is not a URI. So what you're looking for is code that implements the URI escaping defined in Section 2.1 of RFC 3986.
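
    For illustration, here is a minimal sketch of that escaping for a single path segment. It is deliberately conservative: it percent-encodes every UTF-8 byte except the unreserved characters listed in Section 2.3 of RFC 3986, which is stricter than the RFC requires for paths. The class name Rfc3986Encoder and the method encodeSegment are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; encodes one path segment, not a whole URL.
class Rfc3986Encoder {

    // Unreserved characters per RFC 3986, Section 2.3.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String encodeSegment(String segment) {
        StringBuilder out = new StringBuilder();
        // Work on UTF-8 bytes so non-ASCII characters become multi-byte escapes.
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && UNRESERVED.indexOf((char) v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeSegment("first book.pdf"));  // prints first%20book.pdf
        System.out.println(encodeSegment("S\u00e3o Paulo"));  // prints S%C3%A3o%20Paulo
        System.out.println(encodeSegment("a/b"));             // prints a%2Fb
    }
}
```

    Because '/' is escaped as well, a full path would need to be split on '/' and encoded segment by segment.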

  • 2020-11-22 02:28

    In addition to Carlos Heuberger's reply: if a port other than the default (80) is needed, the seven-parameter constructor should be used:

    URI uri = new URI(
            "http",
            null, // this is for userInfo
            "www.google.com",
            8080, // port number as int
            "/ig/api",
            "weather=São Paulo",
            null);
    String request = uri.toASCIIString();
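
    To see what this constructor produces, the following sketch builds the same URI with and without an explicit port. The class UriPortDemo and the method build are hypothetical names; the 'ã' is written as a Unicode escape only to keep the source file ASCII.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo of the seven-parameter java.net.URI constructor.
class UriPortDemo {

    // A port of -1 means "undefined", so no port appears in the result.
    static URI build(int port) {
        try {
            return new URI("http", null, "www.google.com", port,
                    "/ig/api", "weather=S\u00e3o Paulo", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://www.google.com:8080/ig/api?weather=S%C3%A3o%20Paulo
        System.out.println(build(8080).toASCIIString());
        // prints http://www.google.com/ig/api?weather=S%C3%A3o%20Paulo
        System.out.println(build(-1).toASCIIString());
        // toString() quotes the space but keeps the non-ASCII letter as is
        System.out.println(build(8080).toString());
    }
}
```

    The constructor quotes the space, while toASCIIString() additionally encodes the non-ASCII character as UTF-8 escapes, which is why the answer calls it rather than toString().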
    