A Python program I'm writing needs to read a set number of lines from the top of a file and preserve this header for future use. Currently, I'm doing s
My best answer is as follows:
file test.dat:
This is line 1
This is line 2
This is line 3
This is line 4
This is line 5
This is line 6
This is line 7
This is line 8
This is line 9
Python script:
f = open('test.dat')
nlines = 4
header = "".join(f.readline() for _ in range(nlines))
Output:
>>> header
'This is line 1\nThis is line 2\nThis is line 3\nThis is line 4\n'
Notice that you don't need to import any modules. You could use any dummy variable in place of _ (it works with i, or j, or ni, or whatever), but I recommend you don't, to avoid confusion. You could strip the newline characters (though I don't recommend it: keeping them lets you distinguish among lines) or do anything else that you can do with strings in Python.
Notice that I did not provide a mode for opening the file, so it defaults to read-only text mode ('r'). This is not Pythonic; in Python, "explicit is better than implicit". Finally, nice people close their files. In this case it happens automatically because the script ends, but it is best practice to close them with f.close().
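As a minimal sketch of the two points above (an explicit mode and a guaranteed close), the same one-liner can be wrapped in a with block. The helper name read_header and its parameters are made up for illustration; they are not part of the original script.

```python
def read_header(path, nlines):
    """Return the first nlines lines of the file at path as one string.

    read_header is a hypothetical helper name used only for this sketch.
    """
    # Explicit read mode; the with block closes the file even if an error occurs.
    with open(path, "r") as f:
        # readline() returns '' at end of file, so short files do not raise.
        return "".join(f.readline() for _ in range(nlines))
```

Calling read_header('test.dat', 4) on the sample file should give the same string as the header shown in the output above.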
Happy Pythoning.
Edit: As pointed out by Roger Pate, the square brackets of the list comprehension are unnecessary (join accepts a generator expression directly), which shortens the line by two characters. The original script has been edited to reflect this.
I'm not sure what the Pylint rules are, but you could use the '_' throwaway variable name.
header = ''
header_len = 4
# range(header_len) runs header_len times; range(1, header_len) would read one line too few
for _ in range(header_len):
    header += file_handle.readline()
I do not see anything wrong with your solution; maybe just replace i with _. I also do not like invoking itertools everywhere a simpler solution will work; it is like people using jQuery for trivial JavaScript tasks. Anyway, just to have revenge on itertools, here is my solution.
Since you want to read the whole file line by line anyway, why not read the header first and then do whatever you want with the rest?
header = ''
header_len = 4
for i, line in enumerate(file_handle):
    if i < header_len:
        # still inside the header: accumulate the line
        header += line
    else:
        # output chunks to separate files
        pass
print(header)
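A variation on the same idea, sketched under the assumption that the header and the body are handled by separate pieces of code: consume the header with readline() first, then keep iterating over the same handle for the rest. The name take_header is hypothetical, and io.StringIO stands in for a real file here so the snippet runs on its own.

```python
import io

def take_header(file_handle, header_len):
    """Consume and return the first header_len lines as one string.

    take_header is a hypothetical helper name used only for this sketch.
    """
    return "".join(file_handle.readline() for _ in range(header_len))

# In-memory stand-in for test.dat (an assumption made for the example).
sample = io.StringIO("".join("This is line %d\n" % i for i in range(1, 10)))

header = take_header(sample, 4)

# The handle is now positioned just after the header,
# so plain iteration yields only the remaining lines.
body_lines = [line.rstrip("\n") for line in sample]
```

After this runs, header holds lines 1 to 4 and body_lines starts at 'This is line 5'.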
One problem with using _ as a dummy variable is that it only solves the problem on one level. Consider something like the following:
def f(n, m):
    """A function to run g() n times and run h() m times per g."""
    for _ in range(n):
        g()
        for _ in range(m):
            h()
    return 0
This function works fine, but the inner loop over m rebinds the same name _ that the outer loop over n uses, which is problematic as it may conflict with the outer _. In any case, PyCharm complains about this kind of code.
So I would argue that _ is not as "throwaway" as was suggested before.
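For what it's worth, here is a small runnable check of the run-time behavior, with g and h replaced by made-up counting stubs. As long as neither loop body reads _, the rebinding does not change how many times each loop runs, because a for loop pulls its values from the range iterator and not from the variable. The linter complaint is a separate matter.

```python
# Hypothetical stand-ins for g() and h() that only count their calls.
calls = {"g": 0, "h": 0}

def g():
    calls["g"] += 1

def h():
    calls["h"] += 1

def f(n, m):
    """Run g() n times and run h() m times per g."""
    for _ in range(n):
        g()
        # This rebinds _, but the outer loop still advances its own iterator.
        for _ in range(m):
            h()
    return 0

f(3, 2)
print(calls)  # {'g': 3, 'h': 6}
```

If the outer value were actually needed after the inner loop, the two loops would need distinct names.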
Perhaps you might like to just create a function to do it!
def run(f, n, *args):
    """Runs f with the arguments from the args tuple n times."""
    for _ in range(n):
        f(*args)
e.g. you could use it like this:
>>> def ft(x, L):
...     L.append(x)
...
>>> a = 7
>>> nums = [4, 1]
>>> run(ft, 10, a, nums)
>>> nums
[4, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7]
What about:
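header = []
for i, line in enumerate(file_handle):
    if i <= 3:
        # append keeps each line whole; += on a list would add it character by character
        header.append(line)
        continue
    # process the rest of the file here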
Maybe this:
header_len = 4
header = open("file.txt").readlines()[:header_len]
But it will be troublesome for long files, because readlines() reads the entire file into memory before the slice is taken.
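If the memory cost matters, one possible alternative is itertools.islice, which stops after the requested number of lines instead of loading the whole file. This is only a sketch; read_header_lines is a made-up name, and like the readlines() version it returns a list of lines.

```python
from itertools import islice

def read_header_lines(path, header_len):
    """Return the first header_len lines of the file at path as a list.

    read_header_lines is a hypothetical helper name used only for this sketch.
    """
    with open(path, "r") as f:
        # islice pulls at most header_len lines from the file iterator,
        # so the rest of the file is never read into memory.
        return list(islice(f, header_len))
```

For a file shorter than header_len, the list simply contains fewer lines.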