Using grep in Python

Submitted by ╄→гoц情女王★ on 2020-05-13 04:35:44

Question


There is a file (query.txt) containing keywords and phrases that should be matched against other files using grep. The last three lines of the following code work perfectly, but when the same command is used inside the while loop, it seems to go into an infinite loop (i.e., it stops responding).

import os

f=open('query.txt','r')
b=f.readline()
while b:
    cmd='grep %s my2.txt'%b    #my2 is the file in which we are looking for b
    os.system(cmd)
    b=f.readline()
f.close()

a='He is'
cmd='grep %s my2.txt'%a
os.system(cmd)

Answer 1:


First of all, you are not iterating over the file properly. You can simply use for b in f: and drop the .readline() calls.

Then your code will blow up in your face as soon as a query contains any characters that have a special meaning in the shell. Use subprocess.call instead of os.system() and pass an argument list.

Here's a fixed version:

import subprocess
with open('query.txt', 'r') as f:
    for line in f:
        line = line.rstrip() # remove trailing whitespace such as '\n'
        subprocess.call(['/bin/grep', line, 'my2.txt'])
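
A likely reason the loop in the question hangs (a guess from the code shown, not a confirmed diagnosis): readline() keeps the trailing newline, so the shell receives two commands, and a grep that has a pattern but no file name waits on standard input. The sketch below only builds the strings so the difference is visible. build_shell_command and build_argv are hypothetical helper names, and the "-e" option is an addition that the answer above does not use.

```python
def build_shell_command(raw_line, target="my2.txt"):
    # Same formatting as in the question: the newline kept by readline()
    # ends up in the middle of the command string handed to the shell.
    return "grep %s %s" % (raw_line, target)


def build_argv(raw_line, target="my2.txt"):
    # Argument list for subprocess.call: no shell is involved, so spaces and
    # quotes in the query reach grep as a single argument. "-e" (an assumption,
    # not in the original answer) keeps a query that starts with "-" from
    # being read as an option.
    return ["grep", "-e", raw_line.rstrip("\n"), target]


print(repr(build_shell_command("He is\n")))  # 'grep He is\n my2.txt'
print(build_argv("He is\n"))                 # ['grep', '-e', 'He is', 'my2.txt']
```

The list returned by build_argv can be passed directly to subprocess.call.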

However, you can improve your code even more by not calling grep at all. Read my2.txt into a string instead and then use the re module to perform the search. If you do not need a regex at all, you can simply use if line in my2_content.
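
A minimal sketch of that idea, assuming the whole file fits comfortably in memory. find_matches is a hypothetical helper name, and the sample strings stand in for the contents of my2.txt and query.txt. Note that Python's re syntax is not identical to grep's default regular expressions, so a pattern may behave differently here.

```python
import re


def find_matches(queries, text, use_regex=False):
    """Return the queries that occur somewhere in text."""
    found = []
    for query in queries:
        if not query:
            continue  # an empty query would match everything
        if use_regex:
            hit = re.search(query, text) is not None
        else:
            hit = query in text  # plain substring test, no regex involved
        if hit:
            found.append(query)
    return found


# Stand-ins for open('my2.txt').read() and the stripped lines of query.txt.
my2_content = "One two\ntwo two\ntwo three\n"
queries = ["two two", "three", "four"]

print(find_matches(queries, my2_content))  # ['two two', 'three']
```

Unlike grep, this reports which queries matched rather than which lines matched; looping over the lines of my2.txt instead gives line-level output.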




Answer 2:


Your code scans the whole my2.txt file for each query in query.txt.

You want to:

  1. read all queries into a list
  2. iterate once over all lines of the text file and check each line against all queries.

Try this code:

with open('query.txt','r') as f:
    queries = [l.strip() for l in f]

with open('my2.txt','r') as f:
    for line in f:
        for query in queries:
            if query in line:
                print(query, line)



Answer 3:


This isn't actually a good way to use Python, but if you have to do something like that, then do it correctly:

from __future__ import with_statement
import subprocess

def grep_lines(filename, query_filename):
    with open(query_filename, "rb") as myfile:
        for line in myfile:
            subprocess.call(["/bin/grep", line.strip(), filename])

grep_lines("my2.txt", "query.txt")

And hope that your file doesn't contain any characters which have special meanings in regular expressions =)

Also, you might be able to do this with grep alone:

grep -f query.txt my2.txt

It works like this:

~ $ cat my2.txt 
One two
two two
two three
~ $ cat query.txt 
two two
three
~ $ python bar.py 
two two
two three



Answer 4:


$ grep -wFf query.txt my2.txt > out.txt

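This matches every keyword in query.txt against my2.txt and saves the output in out.txt. Here -f reads the patterns from a file, -F treats them as fixed strings rather than regular expressions, and -w matches whole words only.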

Read man grep for a description of all the possible arguments.
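
For comparison, here is a rough pure-Python approximation of what -wFf does, in case calling grep is not an option. It is a sketch, not a faithful reimplementation: grep_wF is a hypothetical name, empty patterns are skipped (an assumption that differs from grep), and Python's notion of a word character may not match grep's in every locale.

```python
import re


def grep_wF(patterns, lines):
    """Yield the lines that contain any pattern as a whole word."""
    patterns = [p for p in patterns if p]  # skip empty patterns (assumption)
    if not patterns:
        return
    # re.escape makes each pattern literal (like -F); the lookarounds require
    # that the match is not glued to another word character (like -w).
    alternation = "|".join(re.escape(p) for p in patterns)
    matcher = re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternation)
    for line in lines:
        if matcher.search(line):
            yield line


sample_lines = ["One two", "two two", "two three", "threefold"]
for hit in grep_wF(["two two", "three"], sample_lines):
    print(hit)
# prints:
# two two
# two three
```

"threefold" is not printed because "three" is followed by another word character there, which is the effect of -w.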



Source: https://stackoverflow.com/questions/9018109/using-grep-in-python
