I have a log file with a lot of lines in this format:
10.87.113.12 - - [2019-12-09T11:41:07.197Z] "DELETE /page/sub1.php?id=alice HTTP/1.1" 401 275 "-" "ali
Could you please try the following? This should be an easy task for awk, in case you are OK with awk.
awk '
/alice/ && match($0,/jw_token=[^ ]* HTTP\/1\.1\" 200/){
val=substr($0,RSTART+9,RLENGTH-9)
split(val,array," ")
print array[1]
delete array
}' Input_file
You can achieve this with just one grep and one sed:
grep -E 'id=alice&jw_token=.* HTTP\/1.1" 200' main.log|sed -E 's/.*id=alice&jw_token=([a-zA-Z0-9]+).*/\1/'|uniq
Here the first part, grep -E 'id=alice&jw_token=.* HTTP\/1.1" 200' main.log,
keeps only the lines that contain id=alice and have status 200. The next part, sed -E 's/.*id=alice&jw_token=([a-zA-Z0-9]+).*/\1/',
captures the token in group 1 and replaces the whole line with just the token.
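To see the two stages at work, here is a minimal sketch run against a few made-up log lines. The addresses, tokens, and the temporary file are invented for illustration, and the line layout is only assumed from the patterns in the command above. The sketch ends with sort -u rather than uniq, because uniq only collapses duplicates that sit on adjacent lines.

```shell
# Hypothetical sample data; addresses and tokens are invented.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
10.0.0.1 - - [2019-12-09T11:41:07.197Z] "GET /page/sub1.php?id=alice&jw_token=aaa111 HTTP/1.1" 200 275 "-" "agent"
10.0.0.1 - - [2019-12-09T11:41:08.197Z] "GET /page/sub1.php?id=alice&jw_token=ddd444 HTTP/1.1" 200 275 "-" "agent"
10.0.0.2 - - [2019-12-09T11:41:09.197Z] "GET /page/sub1.php?id=bob&jw_token=bbb222 HTTP/1.1" 200 275 "-" "agent"
10.0.0.1 - - [2019-12-09T11:41:10.197Z] "GET /page/sub1.php?id=alice&jw_token=ccc333 HTTP/1.1" 401 275 "-" "agent"
10.0.0.1 - - [2019-12-09T11:41:11.197Z] "GET /page/sub1.php?id=alice&jw_token=aaa111 HTTP/1.1" 200 275 "-" "agent"
EOF

extract_tokens() {
  # Stage 1: keep alice's requests that returned 200.
  # Stage 2: reduce each surviving line to the token.
  # Stage 3: de-duplicate, including non-adjacent repeats.
  grep -E 'id=alice&jw_token=.* HTTP/1\.1" 200' "$1" |
    sed -E 's/.*id=alice&jw_token=([a-zA-Z0-9]+).*/\1/' |
    sort -u
}

tokens=$(extract_tokens "$sample_log")
printf '%s\n' "$tokens"
rm -f "$sample_log"
```

With this sample, bob's line and alice's 401 line are dropped by grep, and the repeated aaa111 token is printed once.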
If you're open to a perl oneliner:
perl -ane '/id=alice&jw_token=([a-f0-9]+).+\b200\b/ && $h{$1}++;END{print"$_\n" for sort(keys %h)}' file.txt
Output:
07e876afdc2245b53214fff0d4763730
Explanation:
/ # regex delimiter
id=alice&jw_token= # literally
([a-f0-9]+) # group 1, 1 or more hex digits
.+ # 1 or more of any character
\b200\b # 200 surrounded with word boundaries
/ # regex delimiter, you may use /i for case insensitive
Would you try the following:
grep "id=alice.* 200 " main.log | sed 's/.*jw_token=\([^ ]\{1,\}\).*/\1/' | uniq