Question
I am new to Python, and although I am sure this is a trivial question, I have spent my day trying to solve it in different ways. I have a file containing data that looks like this:
<string>
<integer>
<N1>
<N2>
data
data
...
<string>
<integer>
<N3>
<N4>
data
data
...
And that extends a number of times... I need to read the "data", which for the first set (between the first and second <string>) contains N1 X points, N2 Y points, and N1*N2 Z points. If I had only one set of data, I already know how to read all of it, read the values N1 and N2, slice the data into X, Y and Z, reshape it, and use it. But if my file contains more than one set of data, how do I read only from one string until the next one, then repeat the same operation for the next set, and so on until I reach the end of the file? I tried defining a function like:
def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line):
                break
        for line in ifile:
            yield line
but it is not working; I get arrays with no data in them. Any comments will be appreciated. Thanks!
Answer 1:
All lines are instances of str (and so is the literal 'line' that the code actually tests), so you break out on the first line. Remove that test, and test for an empty line by stripping away whitespace first:
def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if not line.strip():
                break
            yield line
I don't think you need to break at an empty line, really; the for loop ends on its own at the end of the file.
If your lines hold other kinds of data, such as numbers, you need to convert them from strings yourself.
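As a minimal sketch of that conversion, assuming every data line holds a single number (the question does not show the actual values), each line can be stripped and passed to float. The helper name to_floats and the sample lines are made up for illustration.

```python
def to_floats(lines):
    """Convert an iterable of text lines to floats, skipping blank lines."""
    values = []
    for line in lines:
        text = line.strip()
        if text:  # ignore blank lines rather than stopping at them
            values.append(float(text))
    return values


# Hypothetical sample lines, shaped like what iterating over a file yields.
sample = ["1.5\n", "2\n", "\n", "3.25\n"]
print(to_floats(sample))  # prints [1.5, 2.0, 3.25]
```

The same function accepts the generator from dat_fun() directly, since it only needs something it can loop over.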
Answer 2:
With structured data like this, I'd suggest just reading what you need. For example:
with open("inpfile.txt", "r") as ifile:
    first_string = ifile.readline().strip()  # Is this the name of the data set?
    first_integer = int(ifile.readline())  # You haven't told us what this is, either
    n_one = int(ifile.readline())
    n_two = int(ifile.readline())
    x_vals = []
    y_vals = []
    z_vals = []
    for index in range(n_one):
        x_vals.append(ifile.readline().strip())
    for index in range(n_two):
        y_vals.append(ifile.readline().strip())
    for index in range(n_one*n_two):
        z_vals.append(ifile.readline().strip())
You can turn this into a dataset generating function by adding a loop and yielding the values:
def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        while True:
            first_string = ifile.readline().strip()  # Is this the name of the data set?
            if first_string == '':
                break  # readline() returns '' at the end of the file
            first_integer = int(ifile.readline())  # You haven't told us what this is, either
            n_one = int(ifile.readline())
            n_two = int(ifile.readline())
            x_vals = []
            y_vals = []
            z_vals = []
            for index in range(n_one):
                x_vals.append(ifile.readline().strip())
            for index in range(n_two):
                y_vals.append(ifile.readline().strip())
            for index in range(n_one*n_two):
                z_vals.append(ifile.readline().strip())
            yield (x_vals, y_vals, z_vals)  # and the first string and integer if you need those
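For illustration, here is a sketch of the same idea that takes any iterable of lines instead of a fixed file name, so it can be tried on an in-memory sample. It assumes the layout described in the question with one number per data line; the name read_datasets, the use of itertools.islice, the conversion to float, and the sample contents are all assumptions, not part of the original answer. Z is left as a flat list because the question does not say how it should be reshaped.

```python
import io
from itertools import islice


def read_datasets(lines):
    """Yield (name, number, x_vals, y_vals, z_vals) for each data set.

    Assumed layout per set: a string, an integer, N1, N2, then N1 X values,
    N2 Y values and N1*N2 Z values, one number per line.
    """
    lines = iter(lines)
    for first in lines:
        name = first.strip()
        if not name:  # tolerate blank lines between or after data sets
            continue
        number = int(next(lines))
        n_one = int(next(lines))
        n_two = int(next(lines))
        # islice consumes exactly the requested number of lines from the iterator.
        x_vals = [float(v) for v in islice(lines, n_one)]
        y_vals = [float(v) for v in islice(lines, n_two)]
        z_vals = [float(v) for v in islice(lines, n_one * n_two)]
        yield name, number, x_vals, y_vals, z_vals


# Hypothetical file contents with two data sets.
sample = io.StringIO(
    "set_a\n7\n2\n3\n"
    "0.0\n1.0\n"          # 2 X values
    "10.0\n20.0\n30.0\n"  # 3 Y values
    "1\n2\n3\n4\n5\n6\n"  # 2*3 Z values
    "set_b\n8\n1\n1\n"
    "5.0\n"
    "6.0\n"
    "7.0\n"
)
for name, number, x_vals, y_vals, z_vals in read_datasets(sample):
    print(name, number, len(x_vals), len(y_vals), len(z_vals))
```

With a real file, the call would be read_datasets(ifile) inside a with open(...) block, since an open text file is itself an iterable of lines.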
Answer 3:
def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if isinstance('line', str) or (not line):  # 'line' is always a str, and so is the line itself
                break
        for line in ifile:
            yield line
Change this to:
def dat_fun():
    with open("inpfile.txt", "r") as ifile:
        for line in ifile:
            if not line:  # never true here: every line read this way is non-empty (a blank line is "\n")
                break
            yield line
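A small check, using a made-up in-memory sample in place of inpfile.txt, shows why the original test always fires and how blank lines actually look when a file is iterated:

```python
import io

# The string literal 'line' is a str no matter what the variable line holds.
print(isinstance('line', str))  # prints True

# Lines produced by iterating over a file keep their trailing newline,
# so even a blank line is the non-empty string "\n".
sample = io.StringIO("abc\n\nxyz\n")
lines = list(sample)
print(lines)                                   # prints ['abc\n', '\n', 'xyz\n']
print([bool(line) for line in lines])          # prints [True, True, True]
print([bool(line.strip()) for line in lines])  # prints [True, False, True]
```

So a plain truthiness test on the line does not detect blank lines; stripping whitespace first does.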
Source: https://stackoverflow.com/questions/17436709/python-loop-through-a-text-file-reading-data