I am looking for a sed command that transforms this input stream:
dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
In Perl, you can use Marpa, a general BNF parser; the parser code is in this gist.
A BNF parser is arguably more maintainable than a regex. Parentheses around grammar symbols hide their values from the parse tree, which simplifies the post-processing.
Hope this helps.
Perlishly I'd do:
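use strict;
use warnings;

my @all_keys;
while ( <DATA> ) {
    # In list context, m//g returns every captured group found on the line.
    push ( @all_keys, m/\((.+?)\)/g );
}
print join ( "\n", @all_keys ), "\n";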
__DATA__
dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
dummy(key5)dummy))))dummy
dummy(key6)dummy))(key7)dummy))))
This captures whatever sits between "(" and the next ")". If your keys only ever match \w in perlre (alphanumeric plus "_"), you can tighten .+? to \w+.
(If you're not familiar with Perl, you can pretty much just swap that <DATA> for <STDIN> and pipe the data straight to your script, or do more interesting things with @all_keys.)
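As a rough sketch of the "pipe the data straight in" idea, the same regex can be run as a one-liner from the shell. The wrapper name extract_keys_perl is made up for illustration, and this assumes perl is on your PATH:

```shell
# Hypothetical wrapper; the regex is the one used in the script above.
extract_keys_perl() {
  # -n loops over input lines, -l strips and re-adds newlines,
  # -e supplies the code; m//g in list context yields every capture.
  perl -nle 'print for /\((.+?)\)/g'
}

# Feed a few of the sample lines through the filter.
printf '%s\n' 'dummy' '(key1)' '(key2)dummy(key3)' | extract_keys_perl
```

Each captured key is printed on its own line, so the output can be piped on to other tools.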
You can use this lookbehind-based regex with grep -oP (this needs a grep with PCRE support, such as GNU grep):
grep -oP '(?<=\()[^)]+' file
key1
key2
key3
key4
key5
key6
key7
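To make the pieces of that command easier to follow, here is a small self-contained sketch that runs it on the sample lines from stdin. The function name extract_keys_grep is only illustrative, and it assumes a grep built with -P support:

```shell
# Hypothetical wrapper around the grep command shown above.
extract_keys_grep() {
  # -o        print only the matched text, one match per line
  # -P        use Perl-compatible regular expressions
  # (?<=\()   lookbehind: the match must be preceded by "(" (not included)
  # [^)]+     one or more characters up to, but excluding, the next ")"
  grep -oP '(?<=\()[^)]+'
}

extract_keys_grep <<'EOF'
dummy
(key1)
(key2)dummy(key3)
dummy(key4)dummy
EOF
```

Note that grep exits with a non-zero status when nothing matches, which matters if you call it from a script running under set -e.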
Or using awk:
awk -F '[()]' 'NF>1{for(i=2; i<=NF; i+=2) if ($i) print $i}' file
key1
key2
key3
key4
key5
key6
key7
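Since the question asked for sed specifically, here is one possible sketch rather than a definitive answer. It assumes GNU sed (it relies on \n in the replacement text) and that every "(" on a line has a closing ")". The function name extract_keys_sed is made up for illustration:

```shell
# Hypothetical wrapper; assumes GNU sed and balanced "(...)" pairs.
extract_keys_sed() {
  # Only touch lines that contain at least one "(...)" pair.
  sed -n '/([^)]*)/{
    # Replace each "junk(key)junk" chunk with "key" plus a newline.
    s/[^(]*(\([^)]*\))[^(]*/\1\n/g
    # Drop the newline left after the last key, then print.
    s/\n$//
    p
  }'
}

extract_keys_sed <<'EOF'
dummy
(key1)
(key2)dummy(key3)
dummy(key6)dummy))(key7)dummy))))
EOF
```

In a basic regular expression a bare "(" is a literal parenthesis and \( ... \) is the capture group, which is why the pattern looks inverted compared with the Perl and grep versions.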