Well, I have a file test.txt
#test.txt
odsdsdoddf112 test1_for_grep
dad23392eeedJ test2 for grep
Hello World test
garbage
I want to extract strings which h
If we want to extract all the meaningful input before the garbage and stop at the first match, the -B NUM, --before-context=NUM option may be useful: it will "print NUM lines of leading context before matching lines".
Example:
grep --before-context=999999 "Hello World test" test.txt
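A minimal sketch of that idea, under stated assumptions: the sample file below is hypothetical (its line breaks are a guess at what test.txt looks like), and -m 1 is an addition not present in the command above. It makes grep stop reading after the first matching line, so only the lines up to and including that match are printed.

```shell
# Hypothetical sample data; the line layout is an assumption about test.txt.
sample=$(mktemp)
printf '%s\n' \
  'odsdsdoddf112 test1_for_grep' \
  'dad23392eeedJ test2 for grep' \
  'Hello World test' \
  'garbage' > "$sample"

# -B prints leading context lines before each match;
# -m 1 stops reading the file after the first matching line.
before=$(grep -m 1 -B 999999 'Hello World test' "$sample")
printf '%s\n' "$before"

rm -f "$sample"
```

Without -m 1, a later line that also matched would be printed as well, together with everything before it.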
grep -oe "^[^ ]* " test.txt
If you're sure you have no leading whitespace, add a ^ to match only at the start of a line, and change the * to a + to match only when you have one or more alphanumeric characters. (That means adding -E to use extended regular expressions.)
grep -Eo "^[[:alnum:]]+[[:blank:]]" test.txt
(I also removed the . from the middle; I'm not sure what that was doing there?)
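To see what the anchored pattern does, here is a small sketch run against hypothetical sample data (the line layout is an assumption about test.txt). It assumes a grep in which ^ matches only at the start of a line; other greps may print more matches per line.

```shell
# Hypothetical sample data; the line layout is an assumption about test.txt.
sample=$(mktemp)
printf '%s\n' \
  'odsdsdoddf112 test1_for_grep' \
  'dad23392eeedJ test2 for grep' \
  'Hello World test' \
  'garbage' > "$sample"

# -E enables extended regular expressions (needed for +);
# -o prints only the matched part of each line.
first_words=$(grep -Eo '^[[:alnum:]]+[[:blank:]]' "$sample")
printf '%s\n' "$first_words"

rm -f "$sample"
```

Each match includes the trailing blank, and a line with no blank after its first word (such as the last sample line) produces no output at all.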
As the questioner discovered, this is a bug in versions of GNU grep prior to 2.5.3. The bug allows a caret to match after the end of a previous match, not just at the beginning of a line.
This bug is still present in other versions of grep, for instance in Mac OS X 10.9.4.
There isn't a universal workaround, but in some cases, such as non-spaces followed by a space, you can often get the desired behavior by leaving off the delimiter. That is, search for '[^ ]*' rather than '[^ ]* '.