Question
Let's say I have an input text file of the following format:
Section1 Heading Number of lines: n1
Line 1
Line 2
...
Line n1
Maybe some irrelevant lines
Section2 Heading Number of lines: n2
Line 1
Line 2
...
Line n2
where certain sections of the file start with a header line that specifies how many lines are in that section. Each section heading has a different name.
I have written a regular expression that matches the header line for whichever section name the user searches for, parses it, and returns the number (n1, n2, etc.) of lines in that section. I have been trying to use a for-in loop to read each line until a counter reaches n1, but it hasn't worked so far.
Here's my question: how do I return just a certain number of lines following a matched line when that number is given in the match and different for each section? I'm new to programming, and I appreciate any help.
EDIT: Okay, here's the relevant code that I have so far:
import re
print
fname = raw_input("Enter filename: ")
toolname = raw_input("Enter toolname: ")

def findcounter(fname, toolname):
    logfile = open(fname, "r")
    pat = 'SUCCESS Number of lines :'
    #headers all have that format
    for line in logfile:
        if toolname in line:
            if pat in line:
                s=line
                pattern = re.compile(r"""(?P<name>.*?) #starting name
                    \s*SUCCESS #whitespace and success
                    \s*Number\s*of\s*lines #whitespace and strings
                    \s*\:\s*(?P<n1>.*)""",re.VERBOSE)
                match = pattern.match(s)
                name = match.group("name")
                n1 = int(match.group("n1"))
                #after matching line, I attempt to loop through the next n1 lines
                lcount = 0
                for line in logfile:
                    if line == match:
                        while lcount <= n1:
                            match.append(line)
                            lcount += 1
                return result
The file itself is pretty long, and there are lots of irrelevant lines interspersed between the sections I'm interested in. What I'm not too sure about is how to specify printing the lines directly after a matched line.
Answer 1:
# f is a file object
# n1 is how many lines to read
lines = [f.readline() for i in range(n1)]
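One way this could be combined with the header match from the question is sketched below. It is written for Python 3 and assumes the header layout `<name> SUCCESS Number of lines : <n>` taken from the question's code; the function name `lines_after_header` and the sample data are made up for illustration. It uses `itertools.islice` on the same iterator instead of `readline()`, because Python 2's documentation warns that mixing `for line in f` with `readline()` on the same file object does not work reliably.

```python
import io
import re
from itertools import islice

# Header layout assumed from the question: "<name> SUCCESS Number of lines : <n>"
HEADER = re.compile(
    r"(?P<name>.*?)\s*SUCCESS\s*Number\s*of\s*lines\s*:\s*(?P<n>\d+)"
)

def lines_after_header(lines, toolname):
    """Return the n lines following the first header whose name contains toolname."""
    it = iter(lines)
    for line in it:
        match = HEADER.match(line)
        if match and toolname in match.group("name"):
            n = int(match.group("n"))
            # islice consumes the same iterator, so it resumes right after the header
            return [l.rstrip("\n") for l in islice(it, n)]
    return []  # no matching header found

# Hypothetical sample standing in for the log file
sample = io.StringIO(
    "noise\n"
    "toolA SUCCESS Number of lines : 2\n"
    "a1\n"
    "a2\n"
    "irrelevant\n"
    "toolB SUCCESS Number of lines : 1\n"
    "b1\n"
)
print(lines_after_header(sample, "toolB"))  # ['b1']
```

With a real file, the same function should work on the object returned by `open(fname)`, since a file object is its own line iterator.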
Answer 2:
You can put logic like this in a generator:
def take(seq, n):
    """ gets n items from a sequence """
    return [next(seq) for i in range(n)]

def getblocks(lines):
    # `it` is an iterator and knows where we are in the list of lines.
    it = iter(lines)
    for line in it:
        try:
            # try to find the header:
            sec, heading, num = line.split()
            num = int(num)
        except ValueError:
            # didn't work, try the next line
            continue
        # we got a header, so take the next lines
        yield take(it, num)

#test
data = """
Section1 Heading 3
Line 1
Line 2
Line 3
Maybe some irrelevant lines
Section2 Heading 2
Line 1
Line 2
""".splitlines()
print(list(getblocks(data)))
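One caveat with `take`: if a header promises more lines than the input still contains, `next(seq)` raises `StopIteration` inside the generator. In Python 2 that silently ends the generator and drops the partial block; in Python 3.7 and later it surfaces as a `RuntimeError` (PEP 479). A possible variant, sketched here for Python 3, swaps in `itertools.islice` so that a truncated section comes back as a shorter list. The sample data is invented to show the truncated case.

```python
from itertools import islice

def take(it, n):
    """Return up to n items from an iterator; fewer if it runs out early."""
    return list(islice(it, n))

def getblocks(lines):
    # A single shared iterator keeps track of the current position.
    it = iter(lines)
    for line in it:
        try:
            # A header is assumed to be exactly three tokens ending in an integer.
            sec, heading, num = line.split()
            num = int(num)
        except ValueError:
            continue  # not a header, move on
        yield take(it, num)

# The last section announces 5 lines, but only 1 follows.
data = """
Section1 Heading 2
Line 1
Line 2
junk line here now
Section2 Heading 5
Line 1
""".splitlines()
print(list(getblocks(data)))  # [['Line 1', 'Line 2'], ['Line 1']]
```

Whether a short block should be returned or treated as an error depends on the log format; the version above simply returns what is there.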
Source: https://stackoverflow.com/questions/6443392/python-how-to-grab-certain-number-of-lines-after-match