Inet6Address valid for invalid IPv6 Address

巧了我就是萌 submitted on 2020-01-14 03:16:16

Question


I'm using java.net.Inet6Address to validate if an input string is a valid IPv6 address or not.

Here is my code snippet:

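import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

public static boolean isValidIPv6ddress(String address) {
    // Reject null or empty input before handing it to InetAddress.
    if (address == null || address.isEmpty()) {
        return false;
    }
    try {
        // For a literal IP address, getByName only checks the address format.
        InetAddress res = InetAddress.getByName(address);
        return res instanceof Inet6Address;
    } catch (final UnknownHostException ex) {
        return false;
    }
}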

Unfortunately the above method returns true even for the following inputs which are invalid:

System.out.println(isValidIPv6ddress("2A00:17C8:50C:0000:0000:0000:0000:00001"));
System.out.println(isValidIPv6ddress("2A00:17C8:50C:0000:0000:00000000000000:0000:00001"));
System.out.println(isValidIPv6ddress("2A00:17C8:50C:00001235:0000:00000000000000:0000:00001"));

Does the API ignore leading zeroes? Or is there a bug in the API?


Answer 1:


Based on the current RFCs on valid text representations of IPv6 addresses, you have encountered a bug, or at least a poor interpretation of the IPv6 address text representation. The current RFC for the IPv6 address architecture is RFC 4291, IP Version 6 Addressing Architecture. Its Section 2.2, Text Representation of Addresses, says (note the limit of four hexadecimal digits):

2.2. Text Representation of Addresses

There are three conventional forms for representing IPv6 addresses as text strings:

  1. The preferred form is x:x:x:x:x:x:x:x, where the 'x's are one to four hexadecimal digits of the eight 16-bit pieces of the address. Examples:

     ABCD:EF01:2345:6789:ABCD:EF01:2345:6789
    
     2001:DB8:0:0:8:800:200C:417A
    

    Note that it is not necessary to write the leading zeros in an individual field, but there must be at least one numeral in every field (except for the case described in 2.).

  2. Due to some methods of allocating certain styles of IPv6 addresses, it will be common for addresses to contain long strings of zero bits. In order to make writing addresses containing zero bits easier, a special syntax is available to compress the zeros. The use of "::" indicates one or more groups of 16 bits of zeros. The "::" can only appear once in an address. The "::" can also be used to compress leading or trailing zeros in an address.

    For example, the following addresses

     2001:DB8:0:0:8:800:200C:417A   a unicast address
     FF01:0:0:0:0:0:0:101           a multicast address
     0:0:0:0:0:0:0:1                the loopback address
     0:0:0:0:0:0:0:0                the unspecified address
    

    may be represented as

     2001:DB8::8:800:200C:417A      a unicast address
     FF01::101                      a multicast address
     ::1                            the loopback address
     ::                             the unspecified address
    
  3. An alternative form that is sometimes more convenient when dealing with a mixed environment of IPv4 and IPv6 nodes is x:x:x:x:x:x:d.d.d.d, where the 'x's are the hexadecimal values of the six high-order 16-bit pieces of the address, and the 'd's are the decimal values of the four low-order 8-bit pieces of the address (standard IPv4 representation). Examples:

     0:0:0:0:0:0:13.1.68.3
     0:0:0:0:0:FFFF:129.144.52.38
    

    or in compressed form:

     ::13.1.68.3
     ::FFFF:129.144.52.38
    

Since RFC 4291 demands that there be no more than four hexadecimal characters per 16-bit field, it would be incorrect to consider any IPv6 address text representation with more than four hexadecimal characters in a 16-bit field to be valid.
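One way to enforce the four-digit limit is to check each colon-separated field yourself before delegating to InetAddress. The following is a minimal sketch, not a complete RFC 4291 parser: the class name StrictIPv6Check and the HEX_FIELD pattern are hypothetical, a trailing dotted IPv4 part is left for the JDK to judge, and scoped addresses with a zone ID (for example a "%eth0" suffix) are rejected by this pre-check.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Pattern;

final class StrictIPv6Check {
    // One RFC 4291 field: one to four hexadecimal digits.
    private static final Pattern HEX_FIELD = Pattern.compile("[0-9A-Fa-f]{1,4}");

    static boolean isValidIPv6Address(String address) {
        // An IPv6 literal always contains at least one colon.
        if (address == null || address.isEmpty() || address.indexOf(':') < 0) {
            return false;
        }
        // Keep trailing empty strings so "::" and "1::" split predictably.
        String[] pieces = address.split(":", -1);
        for (int i = 0; i < pieces.length; i++) {
            String piece = pieces[i];
            if (piece.isEmpty()) {
                continue; // produced by "::"; placement is checked by the JDK
            }
            boolean last = (i == pieces.length - 1);
            if (last && piece.indexOf('.') >= 0) {
                continue; // trailing d.d.d.d form; left to the JDK
            }
            if (!HEX_FIELD.matcher(piece).matches()) {
                return false; // more than four digits, or a non-hex character
            }
        }
        try {
            // Every field has passed the length check; let the JDK do the rest.
            return InetAddress.getByName(address) instanceof Inet6Address;
        } catch (UnknownHostException ex) {
            return false;
        }
    }
}
```

With this pre-check, the three inputs from the question are rejected because each contains a field longer than four digits, while the well-formed examples quoted from the RFC are still accepted.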


RFC 4291 was updated by RFC 5952, A Recommendation for IPv6 Address Text Representation, whose Section 4.1, Handling Leading Zeros in a 16-Bit Field, further restricts the proper representation to one with no leading zeros:

4.1. Handling Leading Zeros in a 16-Bit Field

Leading zeros MUST be suppressed. For example, 2001:0db8::0001 is not acceptable and must be represented as 2001:db8::1. A single 16-bit 0000 field MUST be represented as 0.

RFC 5952 also requires the compressed format: a run of more than one consecutive all-zero 16-bit field must be compressed to "::":

4.2. "::" Usage

4.2.1. Shorten as Much as Possible

The use of the symbol "::" MUST be used to its maximum capability. For example, 2001:db8:0:0:0:0:2:1 must be shortened to 2001:db8::2:1. Likewise, 2001:db8::0:1 is not acceptable, because the symbol "::" could have been used to produce a shorter representation 2001:db8::1.

4.2.2. Handling One 16-Bit 0 Field

The symbol "::" MUST NOT be used to shorten just one 16-bit 0 field. For example, the representation 2001:db8:0:1:1:1:1:1 is correct, but 2001:db8::1:1:1:1:1 is not correct.

4.2.3. Choice in Placement of "::"

When there is an alternative choice in the placement of a "::", the longest run of consecutive 16-bit 0 fields MUST be shortened (i.e., the sequence with three consecutive zero fields is shortened in 2001:0:0:1:0:0:0:1). When the length of the consecutive 16-bit 0 fields are equal (i.e., 2001:db8:0:0:1:0:0:1), the first sequence of zero bits MUST be shortened. For example, 2001:db8::1:0:0:1 is correct representation.

In short, RFC 5952 still requires you to accept any valid RFC 4291 format, but you should generate only the RFC 5952 text representation:

  4. A Recommendation for IPv6 Text Representation

    A recommendation for a canonical text representation format of IPv6 addresses is presented in this section. The recommendation in this document is one that complies fully with [RFC4291], is implemented by various operating systems, and is human friendly. The recommendation in this section SHOULD be followed by systems when generating an address to be represented as text, but all implementations MUST accept and be able to handle any legitimate [RFC4291] format. It is advised that humans also follow these recommendations when spelling an address.
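On the output side, the rules quoted from Sections 4.1 and 4.2 can be sketched roughly as follows. This is an illustration under stated assumptions rather than a reference implementation: the class Rfc5952Formatter and its methods are hypothetical names, hexadecimal digits are written in lowercase, and the mixed IPv4 notation is not handled.

```java
final class Rfc5952Formatter {
    /** Formats 16 raw address bytes, such as those returned by InetAddress.getAddress(). */
    static String format(byte[] addr) {
        if (addr.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes");
        }
        int[] fields = new int[8];
        for (int i = 0; i < 8; i++) {
            fields[i] = ((addr[2 * i] & 0xFF) << 8) | (addr[2 * i + 1] & 0xFF);
        }
        return formatFields(fields);
    }

    /** Formats eight 16-bit field values. */
    static String formatFields(int[] fields) {
        if (fields.length != 8) {
            throw new IllegalArgumentException("expected 8 fields");
        }
        // 4.2.3: pick the longest run of zero fields; on a tie, keep the first.
        int bestStart = -1;
        int bestLen = 0;
        for (int i = 0; i < 8; ) {
            if (fields[i] != 0) {
                i++;
                continue;
            }
            int j = i;
            while (j < 8 && fields[j] == 0) {
                j++;
            }
            if (j - i > bestLen) {
                bestStart = i;
                bestLen = j - i;
            }
            i = j;
        }
        // 4.2.2: a single zero field is never shortened to "::".
        if (bestLen < 2) {
            bestStart = -1;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; ) {
            if (i == bestStart) {
                sb.append("::"); // 4.2.1: the whole run collapses at once
                i += bestLen;
                continue;
            }
            if (sb.length() > 0 && sb.charAt(sb.length() - 1) != ':') {
                sb.append(':');
            }
            sb.append(Integer.toHexString(fields[i])); // 4.1: no leading zeros
            i++;
        }
        return sb.toString();
    }
}
```

Parsing stays lenient toward any legitimate RFC 4291 input, and only the generated text is normalized, which matches the split between "MUST accept" and "SHOULD be followed ... when generating" in the quoted section.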




Answer 2:


Looking at the source code, it walks the string character by character: for each hexadecimal digit it shifts the accumulated value left 4 bits and merges in the digit's bits, and it continues until it either hits a : or the value goes over 0xFFFF (http://www.docjar.com/html/api/sun/net/util/IPAddressUtil.java.html lines 170 through 180). Since the extra characters are all leading zeroes, the value never exceeds 0xFFFF, so it never hits that limit.

I also read RFC 2373, which says leading zeroes are optional but doesn't explicitly put a limit on how many are permissible. I'd say it's a bug, but (IMHO) only for someone being pedantic.
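The behaviour described above can be reproduced with a small stand-alone loop. This is a simplified reconstruction for illustration only, not the JDK's actual source; the class and method names are hypothetical, and it handles a single colon-delimited field rather than a whole address.

```java
final class LenientFieldParser {
    /**
     * Parses one colon-delimited field by shifting and accumulating digits.
     * The only size check is on the accumulated value, not on the number of
     * digits. Returns -1 when the field is rejected.
     */
    static int parseFieldLeniently(String field) {
        if (field.isEmpty()) {
            return -1;
        }
        int val = 0;
        for (char ch : field.toCharArray()) {
            int digit = Character.digit(ch, 16);
            if (digit < 0) {
                return -1; // not a hexadecimal digit
            }
            val = (val << 4) | digit; // shift left four bits, merge in the digit
            if (val > 0xFFFF) {
                return -1; // value no longer fits in 16 bits
            }
        }
        return val;
    }
}
```

Because leading zeroes contribute nothing to the accumulated value, a field such as 00001 or 00001235 stays at or below 0xFFFF and is accepted, whereas 12345 is rejected once the fifth digit pushes the value past the limit.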



Source: https://stackoverflow.com/questions/50846451/inet6address-valid-for-invalid-ipv6-address
