I work with import systems based on delimited text files. The files can sometimes be almost 2 GB in size, and I have to check specific lines from such a file. So I want to know how ca
You can do this with many Unix tools, for instance with awk:
# print first 5 lines with awk
awk 'NR>=1&&NR<=5{print}NR>=6{exit}' file
# print selection of lines
awk 'NR==994123||NR==1002451||NR==1010123{print}NR>1010123{exit}' file
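In Python:
# WANTEDLINE is a placeholder for the zero-based index of the line you want
# (enumerate starts counting at 0, so line 1234 of the file is index 1233)
with open('YOURFILE') as readThisFile, open('OUTPUT', 'w') as outputFile:
    for actualline, linetext in enumerate(readThisFile):
        if actualline == WANTEDLINE:
            outputFile.write(linetext)
            break  # no need to read the rest of the file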
If you want, you can modify that script to take arguments (for example, getline.py 1234).
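A minimal sketch of what such a getline.py could look like. The function names `get_line` and `main`, and the choice to pass the file name as a second argument (so the call becomes `getline.py 1234 YOURFILE`), are assumptions made for illustration. Line numbers are treated as 1-based here, to match awk and sed.

```python
import sys


def get_line(path, line_number):
    """Return line `line_number` (1-based) of `path`, or None if the file is shorter."""
    with open(path) as handle:
        # Iterating over the file reads it lazily, so a 2 GB file is never
        # loaded into memory; we return as soon as the line is found.
        for current, text in enumerate(handle, start=1):
            if current == line_number:
                return text
    return None


def main(args):
    """Hypothetical command line: getline.py LINE_NUMBER FILE"""
    if len(args) != 2:
        print("usage: getline.py LINE_NUMBER FILE", file=sys.stderr)
        return
    wanted = int(args[0])
    line = get_line(args[1], wanted)
    if line is None:
        print(f"{args[1]} has fewer than {wanted} lines", file=sys.stderr)
    else:
        sys.stdout.write(line)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Returning from `get_line` as soon as the line is found means the rest of the file is never read, which matters for lines near the start of a large file.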
To print line N, use:
sed 'Nq;d' file
To print multiple lines (assuming they are in ascending order) e.g. 994123, 1002451, 1010123:
sed '994123p;1002451p;1010123q;d' file
The q after the last line number tells sed to quit when it reaches line 1010123, instead of wasting time looping over the remaining lines that we are not interested in. That is why it is efficient on large files.
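The same idea (emit the wanted lines, then stop at the last one) can be sketched in Python. The function name `select_lines` is hypothetical and this is only an illustration of the early-exit logic, not a replacement for sed. Line numbers are 1-based.

```python
def select_lines(path, wanted):
    """Yield the requested 1-based lines of `path` in file order, stopping early."""
    remaining = set(wanted)
    if not remaining:
        return
    last = max(remaining)
    with open(path) as handle:
        for current, text in enumerate(handle, start=1):
            if current in remaining:
                yield text  # same role as sed's p
            if current >= last:
                break  # same role as sed's q: skip the rest of the file
```

For example, `for line in select_lines('file', [994123, 1002451, 1010123]): print(line, end='')` would print the three lines and stop reading after line 1010123. Unlike the sed command, this sketch does not require the numbers to be given in ascending order, because it looks them up in a set.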