grep -Ff producing invalid output

Submitted by 五迷三道 on 2019-12-13 11:26:35

Question


I'm running -

 grep -Ff list.txt C:/data/*.txt > found.txt

but the output is wrong: it includes lines that don't contain any of the emails I supplied.

list.txt contains -

email@email.com
customer@email.com
imadmin@gmail.com
newcustomer@email.com
helloworld@yes.com

and so on, one email to match per line.

The files being searched contain lines like -

user1:phonenumber1:email@email.com:last-active:recent
user2:phonennumber2:customer@email.com:last-active:inactive
user3:phonenumber3:blablarandom@bla.com:last-active:never

then another may contain -

blublublu         email@email.com         phonenumber         subscribed
nanananana        customer@email.com      phonenumber         unsubscribed
useruser          noemailinput@noemail.com       phonenumber      pending

What I'm trying to do is give grep a list of emails (fixed strings) in list.txt, have it search every file in the given directory for each string, and output the entire line that contains each match.

The expected output in this case would be -

user1:phonenumber1:email@email.com:last-active:recent
user2:phonennumber2:customer@email.com:last-active:inactive
blublublu         email@email.com         phonenumber         subscribed
nanananana        customer@email.com      phonenumber         unsubscribed

It should not output the other two lines -

 user3:phonenumber3:blablarandom@bla.com:last-active:never
 useruser          noemailinput@noemail.com       phonenumber      pending

because neither line contains any string from the list.


Answer 1:


The file list.txt probably contains empty lines, or lines consisting only of one of the separator characters. When I added a line containing just : to list.txt, every line from the first sample started to match. Similarly, adding a line containing a single space made every line from the second sample match. A line containing just @ causes the same symptoms.
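To make the effect concrete, here is a minimal sketch that reproduces it in a scratch directory. The file names clean.txt, blank.txt and data.txt are made up for the illustration, and the sample lines are taken from the question; the only difference between the two pattern lists is one empty line.

```shell
# Scratch directory so nothing real is touched (all file names are hypothetical).
tmp=$(mktemp -d)

# Two sample lines from the question: only the first contains a listed email.
printf '%s\n' 'user1:phonenumber1:email@email.com:last-active:recent' \
              'user3:phonenumber3:blablarandom@bla.com:last-active:never' > "$tmp/data.txt"

# The same pattern list, once clean and once with a stray empty line.
printf '%s\n' 'email@email.com' > "$tmp/clean.txt"
printf '%s\n' 'email@email.com' '' > "$tmp/blank.txt"

# -c counts matching lines instead of printing them.
clean_count=$(grep -c -Ff "$tmp/clean.txt" "$tmp/data.txt")
blank_count=$(grep -c -Ff "$tmp/blank.txt" "$tmp/data.txt")
echo "clean list: $clean_count matching line(s)"
echo "list with an empty line: $blank_count matching line(s)"

rm -rf "$tmp"
```

An empty pattern matches every line, so the second count covers the whole data file even though only one line contains a listed email.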

Try running grep -oFf ... (if your grep supports -o) to see exactly which parts matched. If there are empty lines in list.txt, -o will report fewer matches than a run without it, because empty matches are not printed. Look through the -o output for extremely short strings, which are suspicious. You can also list the lines of list.txt sorted by length, shortest first:

while IFS= read -r line ; do printf '%s %s\n' "${#line}" "$line" ; done < list.txt | sort -n -k1,1
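As a rough sketch of the -o check, the following counts how often each matched string appears, so a stray separator stands out. The data line comes from the question; the pattern list with a lone : is an assumed example of a polluted list.txt, and -o is available in GNU and BSD grep but is not required by POSIX.

```shell
# Scratch directory (file names are hypothetical).
tmp=$(mktemp -d)

# A line that contains none of the wanted emails.
printf '%s\n' 'user3:phonenumber3:blablarandom@bla.com:last-active:never' > "$tmp/data.txt"

# Assumed faulty list: a lone ":" slipped in as its own pattern.
printf '%s\n' 'email@email.com' ':' > "$tmp/list.txt"

# -o prints only the matched text, one match per output line;
# sort | uniq -c then tallies how often each matched string occurred.
culprits=$(grep -oFf "$tmp/list.txt" "$tmp/data.txt" | sort | uniq -c)
printf '%s\n' "$culprits"

rm -rf "$tmp"
```

Any very short string with a high count in that tally is a good candidate for the line to remove from list.txt.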



Answer 2:


I think list.txt may have blank lines in it, which makes grep match every line in the files specified with C:/data/*.txt. To fix this, either delete the empty lines manually or run sed -i '/^$/d' list.txt, where the -i flag edits the file in place.

The issue may also be related to DOS carriage returns. Try running cat -v list.txt and check whether the lines are followed by ^M:

email@email.com^M
customer@email.com^M

If this is the case, you will need to fix the file, either with dos2unix or with tr -d '\r' < list.txt > output.txt.
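Both fixes can be combined without editing the original list. The sketch below is one possible way to do it, assuming a list with DOS line endings and a blank line; list.clean.txt and data.txt are hypothetical names, and the sample data comes from the question.

```shell
# Scratch directory (file names are hypothetical).
tmp=$(mktemp -d)

# Assumed faulty list: CRLF line endings plus one blank line.
printf 'email@email.com\r\n\r\ncustomer@email.com\r\n' > "$tmp/list.txt"

printf '%s\n' 'user1:phonenumber1:email@email.com:last-active:recent' \
              'user2:phonennumber2:customer@email.com:last-active:inactive' \
              'user3:phonenumber3:blablarandom@bla.com:last-active:never' > "$tmp/data.txt"

# Strip carriage returns first, then drop the lines that are now empty.
tr -d '\r' < "$tmp/list.txt" | sed '/^$/d' > "$tmp/list.clean.txt"

# Search with the cleaned list; only lines containing a listed email remain.
matches=$(grep -Ff "$tmp/list.clean.txt" "$tmp/data.txt")
printf '%s\n' "$matches"

rm -rf "$tmp"
```

The order matters: a line holding only a carriage return is not empty until tr has removed the \r, so running sed first would leave it in place.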



Source: https://stackoverflow.com/questions/48132274/grep-ff-producing-invalid-output
