Regular expression to match DNS hostname or IP Address?

长发绾君心 2020-11-21 07:25

Does anyone have a regular expression handy that will match any legal DNS hostname or IP address?

It's easy to write one that works 95% of the time, but I'm hoping

21 Answers
  • 2020-11-21 07:49

    A loose check for host names such as mywebsite.co.in, thangaraj.name, 18thangaraj.in, and thangaraj106.in:

    [a-z\d].*?\.\w{2,4}$
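
    To see how loose this kind of check is, here is a minimal Python sketch. It assumes the intended pattern is one leading letter or digit, any text, then a single escaped dot and a final label of 2 to 4 word characters; that reading, and the names LOOSE_HOST and looks_like_host, are assumptions for illustration only.

```python
import re

# Assumed reading of the pattern above: a leading lowercase letter or digit,
# anything in between, then a dot and a final label of 2-4 word characters.
LOOSE_HOST = re.compile(r"[a-z\d].*?\.\w{2,4}$")

def looks_like_host(name):
    """Loose plausibility check only; this is not RFC validation."""
    return LOOSE_HOST.match(name) is not None

samples = ["mywebsite.co.in", "thangaraj.name", "18thangaraj.in",
           "thangaraj106.in", "example.museum", "not a host!.com"]
for name in samples:
    print(name, looks_like_host(name))
```

    The four sample names pass, but so does "not a host!.com", while "example.museum" is rejected because its final label is longer than four characters.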
    
  • 2020-11-21 07:50

    It's worth noting that there are libraries for most languages that do this for you, often built into the standard library. And those libraries are likely to get updated a lot more often than code that you copied off a Stack Overflow answer four years ago and forgot about. And of course they'll also generally parse the address into some usable form, rather than just giving you a match with a bunch of groups.

    For example, detecting and parsing IPv4 in (POSIX) C:

    #include <arpa/inet.h>
    #include <stdio.h>
    
    int main(int argc, char *argv[]) {
      for (int i=1; i!=argc; ++i) {
        struct in_addr addr = {0};
        printf("%s: ", argv[i]);
        if (inet_pton(AF_INET, argv[i], &addr) != 1)
          printf("invalid\n");
        else
          printf("%u\n", addr.s_addr);
      }
      return 0;
    }
    

    Obviously, such functions won't work if you're trying to, e.g., find all valid addresses in a chat message—but even there, it may be easier to use a simple but overzealous regex to find potential matches, and then use the library to parse them.

    For example, in Python:

    >>> import ipaddress
    >>> import re
    >>> msg = "My address is 192.168.0.42; 192.168.0.420 is not an address"
    >>> for maybeip in re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', msg):
    ...     try:
    ...         print(ipaddress.ip_address(maybeip))
    ...     except ValueError:
    ...         pass
    
  • 2020-11-21 07:51

    You can use the following regular expressions separately or by combining them in a joint OR expression.

    ValidIpAddressRegex = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";
    
    ValidHostnameRegex = "^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$";
    

    ValidIpAddressRegex matches valid IPv4 addresses and ValidHostnameRegex matches valid host names. Depending on the language you use, \ may have to be escaped as \\.
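
    As a rough illustration of the joint OR expression, the sketch below ports both patterns to Python. The names IPV4, HOSTNAME, HOST_OR_IP, and classify are hypothetical and chosen only for this example.

```python
import re

# Patterns copied from the answer above, without the ^...$ anchors.
IPV4 = (r"(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}"
        r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])")
HOSTNAME = (r"(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
            r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])")

# Joint OR expression: accept either form with a single match.
HOST_OR_IP = re.compile("^(?:" + IPV4 + "|" + HOSTNAME + ")$")

def classify(value):
    """Return "ipv4", "hostname", or None (hypothetical helper)."""
    if re.fullmatch(IPV4, value):
        return "ipv4"
    if re.fullmatch(HOSTNAME, value):
        return "hostname"
    return None

for value in ["192.168.0.1", "example.com", "256.1.1.1", "-bad.example"]:
    print(value, bool(HOST_OR_IP.match(value)), classify(value))
```

    Note that "256.1.1.1" fails the IPv4 pattern but still matches the hostname pattern, because all-digit labels are allowed there. If that distinction matters, test the two patterns separately rather than through the joint expression.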


    ValidHostnameRegex is valid as per RFC 1123. Originally, RFC 952 specified that hostname segments could not start with a digit.

    http://en.wikipedia.org/wiki/Hostname

    The original specification of hostnames in RFC 952 mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits.

    Valid952HostnameRegex = "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$";
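
    The practical difference between the two hostname patterns shows up on labels that begin with a digit. A small Python sketch, with RFC1123, RFC952, and accepted_by as illustrative names that are not part of the original answer:

```python
import re

# The two hostname patterns from the answer above, as Python raw strings.
RFC1123 = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$")
RFC952 = re.compile(
    r"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$")

def accepted_by(host):
    """Return (matches RFC 1123 pattern, matches RFC 952 pattern)."""
    return bool(RFC1123.match(host)), bool(RFC952.match(host))

for host in ["example.com", "3com.com", "host-.example"]:
    print(host, accepted_by(host))
```

    Here "3com.com" is accepted only by the RFC 1123 pattern, and "host-.example" is rejected by both because a label ends with a hyphen.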
    
  • 2020-11-21 07:57

    import re

    # One DNS label: 1-63 letters, digits, or hyphens, with no hyphen at either end.
    allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)

    def isValidHostname(hostname):
        if len(hostname) > 255:
            return False
        if hostname[-1:] == ".":
            hostname = hostname[:-1]   # strip exactly one dot from the right,
                                       #  if present
        # Every dot-separated label must match the pattern.
        return all(allowed.match(x) for x in hostname.split("."))
    
  • 2020-11-21 07:57

    I found this works pretty well for IPv4 addresses. It validates each octet like the top answer, and it also makes sure the address is isolated: no other text, digits, or dots may come directly before or after it.

    (?<!\S)(?:(?:\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\b|\.\b){7}(?!\S)
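
    A quick way to check the isolation behaviour is to run the pattern over free text. This Python sketch assumes the separator dot is escaped (\.), and the names ISOLATED_IPV4 and find_isolated_ips are made up for the example.

```python
import re

# Seven alternating items (4 octets + 3 dots), bounded by whitespace or the
# start/end of the string. The separator dot is escaped here (an assumption).
ISOLATED_IPV4 = re.compile(
    r"(?<!\S)(?:(?:\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\b|\.\b){7}(?!\S)"
)

def find_isolated_ips(text):
    """Return every whitespace-delimited IPv4 address found in text."""
    return ISOLATED_IPV4.findall(text)

text = "ping 10.0.0.1 then 10.0.0.256, x192.168.1.1 and 255.255.255.0"
print(find_isolated_ips(text))
```

    Only 10.0.0.1 and 255.255.255.0 are returned: 10.0.0.256 has an out-of-range octet, and x192.168.1.1 is glued to other text.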

  • 2020-11-21 07:57

    Apple's Network framework has failable initializers for struct IPv4Address and struct IPv6Address, which handle the IP address portion very easily. Doing this for IPv6 with a regex is tough because of all the shortening rules.

    Unfortunately, I don't have an elegant answer for hostnames.

    Note that the Network framework is recent, so it may force you to compile for recent OS versions.

    import Network
    let tests = ["192.168.4.4","fkjhwojfw","192.168.4.4.4","2620:3","2620::33"]
    
    for test in tests {
        if let _ = IPv4Address(test) {
            debugPrint("\(test) is valid ipv4 address")
        } else if let _ = IPv6Address(test) {
            debugPrint("\(test) is valid ipv6 address")
        } else {
            debugPrint("\(test) is not a valid IP address")
        }
    }
    
    output:
    "192.168.4.4 is valid ipv4 address"
    "fkjhwojfw is not a valid IP address"
    "192.168.4.4.4 is not a valid IP address"
    "2620:3 is not a valid IP address"
    "2620::33 is valid ipv6 address"
    
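
    For comparison outside Swift, the same parse-instead-of-regex approach can be sketched in Python with the standard ipaddress module, which also understands the IPv6 shortening rules. The helper name classify_ip is hypothetical.

```python
import ipaddress

def classify_ip(text):
    """Return "ipv4", "ipv6", or None if text is not a valid IP address."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return None
    return "ipv" + str(addr.version)

tests = ["192.168.4.4", "fkjhwojfw", "192.168.4.4.4", "2620:3", "2620::33"]
for test in tests:
    print(test, classify_ip(test))

# The parsed object can also normalise a fully written-out IPv6 address.
full = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")
print(full.compressed)
```

    The classifications agree with the Swift output above, and the last line prints the shortened form 2001:db8::1.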