How do you extract IP addresses from files using a regex in a Linux shell?

被撕碎了的回忆 2020-11-28 02:43

How can I extract part of a text with a regexp in a Linux shell? Let's say I have a file where every line contains an IP address, but at a different position. What is the simplest way to e

19 answers
  • 2020-11-28 03:14

    I wrote a blog article on this topic: How to Extract IPv4 and IPv6 IP Addresses from Plain Text Using Regex.

    The article is a detailed guide to the most common IP address patterns that need to be extracted and isolated from plain text with regular expressions.
    The guide is based on the source code of CodVerter's IP Extractor tool for detecting and extracting IP addresses.

    If you wish to validate and capture an IPv4 address, this pattern can do the job:

    \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.]){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
    

    or, to validate and capture an IPv4 address with a prefix ("slash notation"):

    \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.]){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/[0-9]{1,2}\b
    

    or, to capture a subnet mask or wildcard mask:

    (255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)
    

    or, to filter out subnet mask addresses, use a regex negative lookahead:

    \b((?!(255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)[.](255|254|252|248|240|224|192|128|0)))(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)[.]){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
    

    For IPv6 validation, see the article linked at the top of this answer.
    Here is an example of capturing all the common patterns (taken from CodVerter's IP Extractor help sample):

    If you wish, you can test the IPv4 regex here.
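
To try these patterns locally, here is a minimal shell sketch. The helper names are made up for illustration, and it assumes GNU grep (`-o` and `\b` are extensions to POSIX grep). Two things differ from the patterns as written above: `grep -E` has no `(?:...)` groups, so plain groups are used, and it has no lookahead, so the subnet-mask filter is done as a second pass instead. The `/prefix` part sits outside the last-octet group so that it applies whichever alternative matches, and it is limited to 0..32.

```shell
# Sketch only: helper names are hypothetical. Assumes GNU grep.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
mask_octet='(255|254|252|248|240|224|192|128|0)'

# Print every address that carries a /0../32 prefix.
extract_cidrs() {
  grep -oE "\\b(${octet}\\.){3}${octet}/(3[0-2]|[12]?[0-9])\\b"
}

# Print every address, then drop those made up purely of mask octets.
# Two-pass stand-in for the negative lookahead, which needs a PCRE engine.
extract_non_masks() {
  grep -oE "\\b(${octet}\\.){3}${octet}\\b" |
    grep -vxE "(${mask_octet}\\.){3}${mask_octet}"
}

printf '%s\n' 'route 10.0.0.0/8 via 192.168.1.1' | extract_cidrs
printf '%s\n' 'addr 192.168.1.10 mask 255.255.255.0 gw 10.0.0.1' | extract_non_masks
```

The first call prints only the address that has a prefix; the second prints the two host addresses and skips the mask. Like the lookahead version, the filter also drops real addresses that happen to consist only of mask octets, such as 192.0.0.0.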

  • 2020-11-28 03:14

    You could use awk as well. Something like:

    awk '{for (i = 1; i <= NF; i++) if ($i ~ /regexp/) print $i}' file
    

    It may need some cleaning up; this is just a quick and dirty response to show the basic idea with awk.
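
For a concrete version of the idea, here is a small sketch with an actual pattern filled in. The function name and sample lines are invented for illustration. The pattern is deliberately loose (it would accept 999.1.1.1) and avoids `{n,m}` intervals, which some older awk builds do not support.

```shell
# Hypothetical helper: print each whitespace-separated field that looks like
# a dotted quad. Loose on purpose: it does not range-check the octets.
print_ip_fields() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
        print $i
  }'
}

printf '%s\n' 'Jan 1 web1 10.1.2.3 connected' \
              'no address here' \
              'from 192.168.0.7 to 8.8.8.8' | print_ip_fields
```

Because the pattern is anchored with `^` and `$`, a field only prints when the whole field is an address, so something like `host=10.1.2.3` would be skipped unless you change the field separator.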

  • 2020-11-28 03:14

    If you are not given a specific file and need to extract IP addresses, you have to search recursively. The grep command searches text or files for a given pattern and prints the matches.

    grep -roE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'
    

    -r Search recursively: the current directory and every level of subdirectory beneath it.

    -o Print only the matching part of each line.

    -E Use extended regular expressions.

    Without the second grep after the pipe, each IP address would be printed together with the path of the file it was found in.
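
As a possible variation, grep's `-h` option suppresses the file name prefix, so a single grep is enough. It also avoids picking up IP-like strings that occur in file paths. The sketch below uses a made-up helper name and a throwaway directory created only for the demo.

```shell
# Hypothetical helper: list the distinct IPv4-looking strings under a directory.
# -h suppresses the "path:" prefix, so no second grep is needed.
list_ips_under() {
  grep -rhoE '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$1" | sort -u
}

# Demo on a temporary directory tree.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'client 10.0.0.5 ok\n' > "$demo_dir/a.log"
printf 'peer 172.16.3.9 and 10.0.0.5\n' > "$demo_dir/sub/b.log"
list_ips_under "$demo_dir"
rm -rf "$demo_dir"
```

The `sort -u` is there because the order in which grep walks the tree is not guaranteed, and the same address may appear in several files.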

  • 2020-11-28 03:16

    You can use sed. But if you know perl, that might be easier, and more useful to know in the long run:

    perl -ne '/(\d+\.\d+\.\d+\.\d+)/ && print "$1\n"' < file
    
  • 2020-11-28 03:16
    cat ip_address.txt | grep '^[0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[,].*$\|^.*[,][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[,].*$\|^.*[,][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}$'
    

    Let's assume the file is comma-delimited and the IP address can appear at the beginning, at the end, or somewhere in the middle of a line.

    The first regexp matches an IP address at the beginning of the line. The second regexp, after the "or" (\|), matches an IP address in the middle. Because the address must directly follow a comma, its first number has to be exactly 1 to 3 digits, so bogus IPs like 12345.12.34.1 are excluded.

    The third regexp matches an IP address at the end of the line.
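
The grep above prints the whole matching line. If only the address itself is wanted, one possible sketch (not the method above, and with an invented helper name) is to put every comma-separated field on its own line and keep the lines that are entirely a dotted quad. It assumes fields never contain quoted commas.

```shell
# Hypothetical helper: split comma-separated fields onto separate lines, then
# keep only the lines that are entirely a dotted quad (-x = whole-line match).
ips_from_csv() {
  tr ',' '\n' | grep -xE '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

printf '%s\n' '10.1.1.1,alice,admin' \
              'bob,192.168.5.20,user' \
              'carol,guest,172.16.0.9' \
              'dave,12345.12.34.1,user' | ips_from_csv
```

The first three lines each yield their address regardless of position, and the 12345.12.34.1 field is rejected because `-x` requires the whole field to match.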

  • 2020-11-28 03:19

    Most of the examples here will match on 999.999.999.999 which is not technically a valid IP address.

    The following will match on only valid IP addresses (including network and broadcast addresses).

    grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' file.txt
    

    Omit the -o if you want to see the entire line that matched.
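
One caveat worth checking on your own data: without word boundaries, `-o` can report a valid-looking fragment of an invalid address, for example 56.1.1.1 out of 256.1.1.1. A sketch of the same pattern with `\b` on both ends follows. The helper name is made up, and `\b` is a GNU grep extension rather than POSIX.

```shell
# Hypothetical helper: range-checked octets plus word boundaries on both ends,
# so digits adjacent to the match cannot be silently cut off.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
extract_valid_ips() {
  grep -oE "\\b(${octet}\\.){3}${octet}\\b"
}

printf '%s\n' 'ok 192.168.1.1 bad 999.999.999.999' \
              'edge 256.1.1.1 top 255.255.255.255' | extract_valid_ips
```

On this input only 192.168.1.1 and 255.255.255.255 are printed.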
