Regex grep external IP brings back internal IP as well - why?

霸气de小男生 submitted on 2020-01-24 17:45:40

Question


I have this grep operation here that gives you the external IP from the output of ifconfig:

ipa=$(ifconfig | grep -Po "inet addr:\K[^\s]+" | grep -v "^127")

I want to use only one grep, so I tried the following, which was partly successful:

ipa=$(ifconfig | grep -Po "inet addr:\K[0-9]{1,3}?\.[0-9]{1,3}?\.[0-9]{1,3}?\.[0-9]{1,3}?")

It is only partly successful because, for some reason, it also returns a space plus the internal IP:

MY_IP_ADDRESS 127.0.0.1

Why is this happening? I mean, why are the space + loopback added as well, and what can be done to prevent that while still using a single grep, if at all? The loopback isn't even part of the relevant line of the ifconfig output.


Answer 1:


Given that you are already using grep -P, you can simply add a negative assertion:

ipa=$(ifconfig | grep -Po 'inet addr:\K(?!127\.)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

Your original question's regex (since edited) would also accept zero numbers between the dots; I fixed that as well and simplified the result for hopefully slightly improved legibility.

The \K is a Perl innovation which says "if you match through to here, forget the text which got to this point" which means the match on inet addr: will not be included in the "matched text" printed by grep -o.

The expression (?!127\.) is a negative lookahead assertion. In brief, it says "if this regex would match now, this is not a match". In other words, the regex engine takes a brief pause, takes a note of where it is in the text, and "peeks ahead" and attempts to match 127\.. If that succeeds, it gives up on attempting to match at this point, and proceeds to attempt to match the entire expression at a later point in the string (so if it were to find a second occurrence of inet addr: later on in the same line, you could still get a match from there).
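To see \K and the lookahead at work without depending on a live ifconfig, here is a small sketch that runs the patterns over canned text. The sample lines and addresses are made up for illustration, and it assumes GNU grep built with -P support.

```shell
# Sample text in the old net-tools "inet addr:" format (made-up addresses).
sample='          inet addr:203.0.113.7  Bcast:203.0.113.255  Mask:255.255.255.0
          inet addr:127.0.0.1  Mask:255.0.0.0'

# Without \K, -o prints the whole match, including the "inet addr:" prefix.
with_prefix=$(printf '%s\n' "$sample" | grep -Po 'inet addr:\d{1,3}(?:\.\d{1,3}){3}')

# With \K, the prefix is dropped from the reported match.
without_prefix=$(printf '%s\n' "$sample" | grep -Po 'inet addr:\K\d{1,3}(?:\.\d{1,3}){3}')

# Adding (?!127\.) rejects the loopback at the point where the address starts.
no_loopback=$(printf '%s\n' "$sample" | grep -Po 'inet addr:\K(?!127\.)\d{1,3}(?:\.\d{1,3}){3}')

printf '%s\n' "$with_prefix" "$without_prefix" "$no_loopback"
```

Only the last command filters anything out; the first two differ just in how much of each match is printed.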

Finally, I switched the quoting to single quotes. It doesn't really matter a lot here, but I recommend single quotes around all regular expressions unless you specifically require the shell to perform variable replacements in the regex or something like that.

As for explaining what you saw, there is no space in the output really. The grep outputs two lines because it finds two matches (which of course we now prevent with the negative lookahead; but if you have multiple interfaces configured, you could still get more than one result). If you are seeing a space, that's because you didn't use double quotes when echoing, as in echo "$ipa".
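The quoting effect can be reproduced in isolation. This sketch fakes the two-line grep result with printf (the addresses are placeholders) and assumes the default IFS:

```shell
# Two matches arrive as two lines (made-up addresses stand in for grep output).
ipa=$(printf '%s\n' '203.0.113.7' '127.0.0.1')

# Unquoted: the shell splits on the newline and echo joins the words with a space.
unquoted=$(echo $ipa)

# Quoted: the newline between the two matches is preserved.
quoted=$(echo "$ipa")

echo "unquoted: $unquoted"
echo "quoted:"
echo "$quoted"
```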

As noted in comments, if you get bash: !127: event not found, you need to run set +H or put the commands in a script; or use single quotes, as I recommend above. Unless you are addicted to the legacy Csh-style history management features in Bash (and seriously, who is, these days?), I recommend you make this change permanent by putting the command set +H in your .bash_profile or similar.

Optional: Refactor the Regex

You could refactor your regex to make it more compact but perhaps slightly less legible:

ipa=$(ifconfig | grep -Po 'inet addr:\K(?!127\.)\d{1,3}(?:\.\d{1,3}){3}')

An even shorter way would be this:

ipa=$(ifconfig | grep -Po 'inet addr:\K(?!127\.)[.\d]+')

Note the same \K and (?!127\.) patterns, but also the new [.\d]+, which replaces the \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} pattern. This is slightly less precise, but probably good enough for this scenario. If your input comes from ifconfig and you have already seen the inet addr: signpost, matching as many digits and dots as possible should always get you the IP address you are looking for.

Depending on what you need this for, you could still add more things to block in the lookahead. To prevent it from also matching internal networks, something like

(?!127\.|10\.|172\.(?:1[6-9]|2[0-9]|3[01])\.|192\.168\.)

would prevent extraction of addresses in all IANA-reserved private network blocks, including loopback.
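As a rough check of such a lookahead, the sketch below runs it over a handful of made-up sample lines. The variable name public_only is only illustrative, and the pattern used here has a \. after the 172 alternative so that an address such as 172.160.5.4, which lies outside the private 172.16.0.0/12 block, is not excluded by accident. It assumes GNU grep with -P.

```shell
# Lookahead covering loopback and the private ranges 10/8, 172.16/12, 192.168/16.
# The \. after the 172 group keeps addresses such as 172.160.x.x.
public_only='inet addr:\K(?!127\.|10\.|172\.(?:1[6-9]|2[0-9]|3[01])\.|192\.168\.)[.\d]+'

# Made-up sample lines in the old net-tools format.
sample='inet addr:127.0.0.1  Mask:255.0.0.0
inet addr:10.1.2.3  Bcast:10.255.255.255  Mask:255.0.0.0
inet addr:172.16.5.4  Bcast:172.16.255.255  Mask:255.255.0.0
inet addr:172.160.5.4  Bcast:172.160.255.255  Mask:255.255.0.0
inet addr:192.168.1.20  Bcast:192.168.1.255  Mask:255.255.255.0
inet addr:203.0.113.7  Bcast:203.0.113.255  Mask:255.255.255.0'

# Only addresses outside the excluded blocks should remain.
public=$(printf '%s\n' "$sample" | grep -Po "$public_only")
printf '%s\n' "$public"
```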




Answer 2:


There are a few ways to achieve this, using either ifconfig, ip, dig, or my personal favorite myip. Furthermore, there are even more ways to optimize your regex, many of which you have probably already seen in the comments of your previous question.

But, to answer you literally, without rewriting your command or imposing personal preference, you can achieve the desired result of excluding the loopback address by simply specifying the interface you do want to obtain as the first argument to ifconfig. By default (i.e. no args), ifconfig displays the status of all currently active interfaces.

Something like this should suffice:

# Replace "eth0" with the name of the interface that holds the address you want
# "..." stands for your `grep` pipe
ifconfig "eth0" ...

man ifconfig

If no arguments are given, ifconfig displays the status of the currently active interfaces.
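Put together, the per-interface approach might look like the sketch below. The helper name extract_inet_addr and the interface name eth0 are assumptions for illustration, and canned text stands in for real ifconfig output so the example runs anywhere GNU grep with -P is available.

```shell
# Hypothetical helper: prints every "inet addr:" value found on stdin.
extract_inet_addr() {
    grep -Po 'inet addr:\K\S+'
}

# Intended use on a system with the old net-tools output format
# ("eth0" is an assumed interface name):
#   ipa=$(ifconfig eth0 | extract_inet_addr)

# Stand-in for the output of "ifconfig eth0" (made-up address).
eth0_output='eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:203.0.113.7  Bcast:203.0.113.255  Mask:255.255.255.0'

ipa=$(printf '%s\n' "$eth0_output" | extract_inet_addr)
echo "$ipa"
```

Because only one interface is queried, the loopback never appears in the input and no lookahead is needed.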




Answer 3:


As noted by Triplee in the comments (Triplee's answer should be upvoted):

  1. The regex matched the loopback as well, because it is also an IP preceded by inet addr:.
  2. The space was added because I ran echo $ipa instead of echo "$ipa".

I got the loopback because the regex matched it too (I didn't notice at first, since it is near the end of the ifconfig output). My fix was to use grep's -m1 option, which makes grep stop after the first matching line (and the external IP does appear earlier, so it is found first). The resulting command is:

ipa=$(ifconfig | grep -Po -m1 "inet addr:\K[0-9]{1,3}?\.[0-9]{1,3}?\.[0-9]{1,3}?\.[0-9]{1,3}?")

And yet, as follows from Triplee's comment, it is problematic in principle to assume that the first match will be the external IP and not the loopback: ifconfig might change tomorrow and list the loopback first. So one should use either this single-grep solution by Triplee, which uses a negative lookahead assertion:

ipa=$(ifconfig | grep -Po 'inet addr:\K(?!127\.)\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

Or a shorter alternative by Triplee, which may additionally require disabling history expansion in an interactive shell by running set +H:

ipa=$(ifconfig | grep -Po 'inet addr:\K(?!127\.)[.\d]+')

Note: Running set +H is needed only if you get the event not found error. There is no harm in leaving history expansion off, though it can be re-enabled with set -H.

Either way, another minimal approach is the original two-grep approach I published in the question:

ipa=$(ifconfig | grep -Po "inet addr:\K[^\s]+" | grep -v "^127")



Answer 4:


Excluding addresses that start with 127.:

ifconfig | grep -Po '\binet addr:\K(?!127\.)\S+'

Excluding lo adapter:

ifconfig | perl -nle'BEGIN { $/="" } next if /^lo\b/; print for /\binet addr:(\S+)/g'

Just a specific adapter:

ifconfig eth1 | grep -Po '\binet addr:\K\S+'

Just the first address of the first Ethernet adapter that has one:

ifconfig | perl -nle'BEGIN { $/="" } if (/^eth.*?\binet addr:(\S+)/s) { print $1; exit; }'


Source: https://stackoverflow.com/questions/47784982/regex-grep-external-ip-brings-back-internal-ip-as-well-why
