Question
I'm trying but failing to write a regex to grep for lines that do not begin with "//" (i.e. C++-style comments). I'm aware of the "grep -v" option, but I am trying to learn how to pull this off with regex alone. I've searched and found various answers on grepping for lines that don't begin with a character, and even one on how to grep for lines that don't begin with a string, but I'm unable to adapt those answers to my case, and I don't understand what my error is.
> cat bar.txt
hello
//world
> cat bar.txt | grep "(?!\/\/)"
-bash: !\/\/: event not found
I'm not sure what this "event not found" is about. One of the answers I found used paren-question mark-exclamation-string-paren, which I've done here, and which still fails.
> cat bar.txt | grep "^[^\/\/].+"
(no output)
Another answer I found used a caret within square brackets and explained that this syntax meant "search for the absence of what's in the square brackets (other than the caret)". I think the ".+" means "one or more of anything", but I'm not sure if that's correct and, if it is correct, what distinguishes it from ".*".
In a nutshell: how can I construct a regex to pass to grep to search for lines that do not begin with "//" ?
To be even more specific, I'm trying to search for lines that have "#include" that are not preceded by "//".
Thank you.
Answer 1:
The first line tells you that the problem comes from bash (your shell). Bash finds the ! and attempts to substitute into your command the last command you entered that begins with \/\/. To avoid this you need to escape the ! or use single quotes. For an example of !, try !cat: it will execute the last command beginning with cat that you entered.
You don't need to escape /; it has no special meaning in regular expressions. You also don't need to write a complicated regular expression to invert a match. Rather, just supply the -v argument to grep. Most of the time simple is better. You also don't need to cat the file to grep; just give grep the file name, e.g.
grep -v "^//" bar.txt | grep "#include"
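If you're really hung up on using regular expressions, then a simple one would look like the following (match start of line ^, any amount of white space [[:space:]]*, exactly two forward slashes /{2}, any number of any characters .*, followed by #include). Note that, as written, this pattern selects the commented-out #include lines, that is, the ones you want to exclude: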
grep -E "^[[:space:]]*/{2}.*#include" bar.txt
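To see what each of the two commands above selects, here is a minimal sketch. The sample file contents, the temporary file and the variable names are assumptions made for illustration; only the first two lines come from the question's bar.txt.

```shell
# Hypothetical sample input: the question's two lines plus two #include lines.
sample=$(mktemp)
printf '%s\n' 'hello' '//world' '#include <stdio.h>' '//#include <old.h>' > "$sample"

# Drop lines that start with //, then keep the ones that mention #include.
active=$(grep -v '^//' "$sample" | grep '#include')

# The -E pattern selects the commented-out #include lines instead.
commented=$(grep -E '^[[:space:]]*/{2}.*#include' "$sample")

printf 'active:\n%s\n' "$active"
printf 'commented:\n%s\n' "$commented"
rm -f "$sample"
```

Single quotes are used throughout so that an interactive bash never tries history expansion on the patterns.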
Answer 2:
- You're using negative lookahead, which is a PCRE feature and requires the -P option.
- Your negative lookahead won't work without a start anchor.
- This will of course require gnu-grep.
- You must use single quotes to use ! in your regex, otherwise history expansion is attempted with the text after ! in your regex, which is the reason for the !\/\/: event not found error.
So you can use:
grep -P '^(?!\h*//)' file
hello
\h matches a single horizontal whitespace character, so \h* matches 0 or more of them.
Without -P, or with a non-GNU grep, you can use grep -v:
grep -v '^[[:blank:]]*//' file
hello
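The two forms above can be compared side by side. This is only a sketch: the sample lines and variable names are made up for illustration, and the PCRE branch is skipped when the installed grep does not accept -P.

```shell
# Hypothetical sample: a plain line, a comment, and an indented comment.
input=$(printf '%s\n' 'hello' '//world' '    // indented comment')

# Portable form: invert a match on optional blanks followed by //.
posix_out=$(printf '%s\n' "$input" | grep -v '^[[:blank:]]*//')
printf 'grep -v: %s\n' "$posix_out"

# PCRE form, only attempted when this grep supports -P.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    pcre_out=$(printf '%s\n' "$input" | grep -P '^(?!\h*//)')
    printf 'grep -P: %s\n' "$pcre_out"
fi
```

Both commands should leave the same lines, since the lookahead is anchored at the start of the line and allows leading blanks.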
Answer 3:
To find #include lines that are not preceded by // (or /* …), you can use:
grep '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]'
The regex looks for start of line, optional spaces, #, optional spaces, include, optional spaces and either " or <. It will find all #include lines except lines such as #include MACRO_NAME, which are legitimate but rare, and screwball cases such as:
#/*comment*/include/*comment*/<stdio.h>
#\
include\
<stdio.h>
If you have to deal with software containing such notations, (a) you have my sympathy and (b) fix the code to a more orthodox style before hunting the #include lines. The regex will also pick up false positives such as:
/* Do not include this:
#include <does-not-exist.h>
*/
You could omit the final [[:space:]]*["<] with minimal chance of confusion, which will then pick up the macro name variant.
To find lines that do not start with a double slash, use -v (to invert the match) and '^//' to look for slashes at the start of a line:
grep -v '^//'
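As a rough check of the #include pattern, the sketch below runs it over a small made-up source file. The file contents and variable names are assumptions for illustration only.

```shell
# Hypothetical C source used only to exercise the pattern.
src=$(mktemp)
cat > "$src" <<'EOF'
#include <stdio.h>
  #  include "local.h"
// #include <old.h>
#include MACRO_NAME
/* Do not include this:
#include <does-not-exist.h>
*/
EOF

# Start of line, optional spaces, #, optional spaces, include, then " or <.
found=$(grep '^[[:space:]]*#[[:space:]]*include[[:space:]]*["<]' "$src")
printf '%s\n' "$found"
rm -f "$src"
```

The // line and the MACRO_NAME line are skipped, while the #include inside the block comment still shows up, which is the false positive described above.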
Answer 4:
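You have to use the -P (perl) option, and anchor the lookahead at the start of the line (without the ^, the lookahead can succeed at some other position and every line matches):
cat bar.txt | grep -P '^(?!//)'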
Answer 5:
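For the lines not beginning with "//", you could use (^[^/]{2}.*$) with grep -E. Note that this requires the first two characters to both be non-slashes, so it also skips lines such as /usr or a/b.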
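For comparison, here is a sketch of a lookahead-free pattern built from alternation, next to the bracket-expression pattern from this answer. The alternation regex and the sample lines are my own illustration, not something from the original answers.

```shell
# Hypothetical sample lines; only 'hello' and '//world' come from the question.
lines=$(printf '%s\n' 'hello' '//world' '/usr/include' 'a/b' '/')

# Two leading non-slash characters: also rejects '/usr/include', 'a/b' and '/'.
two_class=$(printf '%s\n' "$lines" | grep -E '^[^/]{2}.*$')

# Alternation: the first character is not '/', or a single '/' is followed
# by a non-slash or the end of the line, or the line is empty.
alternation=$(printf '%s\n' "$lines" | grep -E '^([^/]|/([^/]|$)|$)')

printf 'two_class:\n%s\n' "$two_class"
printf 'alternation:\n%s\n' "$alternation"
```

A bracket expression such as [^/] always stands for exactly one character, which is why "not the string //" has to be spelled out case by case when lookahead is not available.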
Answer 6:
If you don't like grep -v for this then you could just use awk:
awk '!/^\/\//' file
Since awk supports compound conditions instead of just regexps, it's often easier to specify what you want to match with awk than grep, e.g. to search for a and b in any order with grep:
grep -E 'a.*b|b.*a'
while with awk:
awk '/a/ && /b/'
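Applied to the original question, the same compound-condition idea might look like the sketch below. The sample input and variable names are assumptions for illustration.

```shell
# Hypothetical input: one active include, one commented-out include, one other line.
input=$(printf '%s\n' '#include <stdio.h>' '// #include <old.h>' 'int main(void) { return 0; }')

# Keep lines that do not start with // and that do contain #include.
awk_out=$(printf '%s\n' "$input" | awk '!/^\/\// && /#include/')
printf '%s\n' "$awk_out"
```

Inside an awk /.../ regex the slashes do need escaping, because / is the regex delimiter there.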
Source: https://stackoverflow.com/questions/36110886/grep-for-lines-not-beginning-with