URI - getHost returns null. Why?

春和景丽 2021-01-04 03:36

Why is the 1st one returning null, while the 2nd one is returning mail.yahoo.com?

Isn't this weird? If not, what's the logic behind this?

5 answers
  • 2021-01-04 04:07

    As @hsz mentioned in the comments, this is a known bug.

    But let's debug and look inside the source of the URI class. The problem is in the method:

    private int parseHostname(int start, int n)

    Parsing the first URI fails at the line:

    if ((p < n) && !at(p, n, ':')) fail("Illegal character in hostname", p);

    This is because the _ character is not provided for in the scan block, which accepts only letters, digits and the - character (L_ALPHANUM, H_ALPHANUM, L_DASH and H_DASH).

    And yes, this is still not fixed in Java 7.
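The behaviour described above can be reproduced with a minimal sketch like the one below. The class and method names (UriHostDemo, hostOf, authorityOf, hasServerAuthority) are made up for illustration; only the java.net.URI calls are real API. It assumes the host in question is broken_arrow.huntingtonhelps.com, as quoted elsewhere in this thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class UriHostDemo {
    /** getHost() is null when the authority is not a valid server-based authority. */
    static String hostOf(String s) {
        return URI.create(s).getHost();
    }

    /** getAuthority() still returns the raw authority text, underscore included. */
    static String authorityOf(String s) {
        return URI.create(s).getAuthority();
    }

    /** parseServerAuthority() forces strict host parsing and reports the failure. */
    static boolean hasServerAuthority(String s) {
        try {
            URI.create(s).parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            // e.getReason() describes the problem, e.g. an illegal character in the hostname
            return false;
        }
    }

    public static void main(String[] args) {
        String bad = "http://broken_arrow.huntingtonhelps.com";
        String good = "http://mail.yahoo.com";
        System.out.println(hostOf(bad));             // null
        System.out.println(authorityOf(bad));        // broken_arrow.huntingtonhelps.com
        System.out.println(hasServerAuthority(bad)); // false
        System.out.println(hostOf(good));            // mail.yahoo.com
    }
}
```

Note that the constructor does not throw for the underscore host: the URI is accepted with a registry-based authority, which is why getHost() quietly returns null instead of failing.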

  • 2021-01-04 04:31

    I don't think it's a bug in Java. I think Java is parsing hostnames correctly according to the spec. There are good explanations of the spec here: http://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_host_names and here: http://www.netregister.biz/faqit.htm#1

    Specifically, hostnames MUST NOT contain underscores.
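To make the rule concrete, here is a rough sketch of a letter-digit-hyphen check on each dot-separated label. It is only an approximation written for this thread, not the exact grammar that java.net.URI implements; the class name HostnameCheck, the method isLdhHostname and the length limits (63 per label, 253 overall) are assumptions of the sketch.

```java
import java.util.regex.Pattern;

final class HostnameCheck {
    // One label: starts and ends with a letter or digit, hyphens allowed in between,
    // at most 63 characters in total.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    /** Returns true if every label of the host uses only letters, digits and hyphens. */
    static boolean isLdhHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // limit -1 keeps empty labels, so "a..b" and a trailing dot are rejected
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLdhHostname("mail.yahoo.com"));                   // true
        System.out.println(isLdhHostname("broken_arrow.huntingtonhelps.com")); // false
    }
}
```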

  • 2021-01-04 04:32

    It's because of the underscore in the first URI. Remove the underscore and it works.

    For example:

    public static void main(String[] args) throws Exception {
        // Host name with the underscore removed: getHost() now returns it
        java.net.URI uri = new java.net.URI("http://brokenarrow.huntingtonhelps.com");
        String host = uri.getHost();
        System.out.println("Host = [" + host + "].");

        uri = new java.net.URI("http://mail.yahoo.com");
        host = uri.getHost();
        System.out.println("Host = [" + host + "].");
    }

  • 2021-01-04 04:32

    As mentioned, it is a known JVM bug. However, if you want to make an HTTP request to such a host, you can still try a workaround. The main idea is to build the request from the IP address rather than the 'wrong' hostname. In that case you also need to add a "Host" header to the request with the correct (original) hostname.

    1: Cut the hostname out of the URL (this is a rough example; you can do it in a smarter way):

    int n = url.indexOf("://");
    n = (n >= 0) ? n + 3 : 0;               // start of the authority
    int k = url.indexOf('/', n);            // start of the path, or -1
    int end = (k > -1) ? k : url.length();  // end of the authority
    int m = url.indexOf(':', n);            // port separator, or -1
    if (m == -1 || m > end) { m = end; }    // ignore a ':' that belongs to the path
    String hostHeader = url.substring(n, end);  // host[:port], used for the "Host" header
    String hostname = url.substring(n, m);      // host only, used for the DNS lookup
    

    2: Get hostname's IP:

    String IP = InetAddress.getByName(hostname).getHostAddress();
    

    3: Construct a new URL based on the IP:

    String newURL = url.substring(0, n) + IP + url.substring(m);
    

    4: Now use an HTTP library to prepare a request for the new URL (pseudocode):

    HttpRequest req = ApacheHTTP.get(newURL);
    

    5: Now add the "Host" header with the correct (original) hostname:

    req.addHeader("Host", hostHeader);
    

    6: Now you can do the request (pseudocode):

    String resp = req.getResponse().asString();
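Steps 1 and 3 above can also be sketched without manual index arithmetic, by letting java.net.URI split the string and reading the authority instead of the host. This is only an illustration: the class HostRewrite and its methods are hypothetical names, the IP address is passed in rather than resolved, user info is dropped, and IPv6 literals are not handled.

```java
import java.net.URI;

final class HostRewrite {
    /** Authority without user info, e.g. "broken_arrow.example.com:8080" (value for the Host header). */
    static String hostHeader(URI uri) {
        String authority = uri.getRawAuthority();
        if (authority == null) {
            throw new IllegalArgumentException("URI has no authority: " + uri);
        }
        int at = authority.lastIndexOf('@');
        return at >= 0 ? authority.substring(at + 1) : authority;
    }

    /** Host name without the port (value to resolve via DNS). */
    static String hostName(URI uri) {
        String hostPort = hostHeader(uri);
        int colon = hostPort.lastIndexOf(':');
        return colon >= 0 ? hostPort.substring(0, colon) : hostPort;
    }

    /** Rebuilds the URL with the host replaced by the given IP, keeping port, path, query and fragment. */
    static String withHost(URI uri, String ip) {
        String hostPort = hostHeader(uri);
        int colon = hostPort.lastIndexOf(':');
        String port = colon >= 0 ? hostPort.substring(colon) : "";
        StringBuilder sb = new StringBuilder();
        sb.append(uri.getScheme()).append("://").append(ip).append(port);
        if (uri.getRawPath() != null) sb.append(uri.getRawPath());
        if (uri.getRawQuery() != null) sb.append('?').append(uri.getRawQuery());
        if (uri.getRawFragment() != null) sb.append('#').append(uri.getRawFragment());
        return sb.toString();
    }
}
```

For example, with URI.create("http://broken_arrow.huntingtonhelps.com:8080/a/b?x=1") and an IP of 192.0.2.10 (a documentation-only address), withHost gives http://192.0.2.10:8080/a/b?x=1 and hostHeader gives broken_arrow.huntingtonhelps.com:8080.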
    
  • 2021-01-04 04:33

    Consider using new java.net.URL("http://broken_arrow.huntingtonhelps.com").getHost() instead. It has an alternative parsing implementation. If you already have a URI instance myUri, call myUri.toURL().getHost().

    I ran into this URI issue on OpenJDK 1.8, and it worked fine with URL.
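A small sketch of this suggestion follows. The wrapper class UrlHostDemo and the method hostViaUrl are hypothetical names; the URL(String) constructor is the one used in the answer above, and since it is deprecated on recent JDKs (Java 20 and later) the sketch suppresses that warning.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UrlHostDemo {
    /** Parses the string with java.net.URL, which does not reject underscores in the host. */
    @SuppressWarnings("deprecation") // URL(String) is deprecated on recent JDKs
    static String hostViaUrl(String s) {
        try {
            return new URL(s).getHost();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + s, e);
        }
    }

    public static void main(String[] args) {
        // prints broken_arrow.huntingtonhelps.com
        System.out.println(hostViaUrl("http://broken_arrow.huntingtonhelps.com"));
    }
}
```

Keep in mind that URL only accepts schemes it has a protocol handler for (http, https, file and so on), so this is not a drop-in replacement for URI with arbitrary schemes.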
