I have to fetch one specific line out of a big file (1,500,000 lines), multiple times in a loop over multiple files. I was asking myself: what would be the best option?
I have done extensive testing, and found that, if you want every line of a file:
while IFS=$'\n' read -r LINE; do
    echo "$LINE"
done < your_input.txt
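Since the question asks for one specific line rather than all of them, here is a minimal sketch of the same single-pass idea with a line counter (the function name `get_line` and its variable names are my own, not from the original answer):

```shell
# Sketch: print line $1 of file $2 in a single pass, counting
# lines as we read instead of re-scanning the file per lookup.
get_line() {
    local target=$1 file=$2 n=0 line
    while IFS=$'\n' read -r line; do
        n=$((n + 1))
        if [ "$n" -eq "$target" ]; then
            printf '%s\n' "$line"
            return 0
        fi
    done < "$file"
    return 1   # file has fewer than $target lines
}
```

Usage would be e.g. `get_line 42 your_input.txt`; in a loop over multiple files you call it once per file.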
is much, much faster than any other (Bash-based) method out there. All other methods (like `sed`) read the file each time, at least up to the matching line. If the file is 4 lines long, extracting each line in turn costs 1 + 2 + 3 + 4 = 10 line reads, whereas the `while` loop keeps the file open and maintains a position cursor in the file descriptor, so it only does 4 reads in total.
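For reference, the `sed`-based pattern the comparison refers to looks roughly like this (`sed_line` is my own name for the sketch; the `q` command at least makes `sed` quit at the matching line, but each invocation still re-reads the file from line 1):

```shell
# Sketch: fetch one line with sed. Printing line $1 of file $2 and
# quitting immediately still costs a fresh scan from the top of the
# file on every call, hence the 1+2+...+N total when used in a loop.
sed_line() {
    sed -n "$1{p;q;}" "$2"
}
```

Calling it once for every line of an N-line file performs 1 + 2 + ... + N = N(N+1)/2 line reads in total, which is where the quadratic slowdown comes from.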
On a file with ~15,000 lines, the difference is phenomenal: ~25-28 seconds for the `sed`-based version (extracting a specific line from the file each time) versus under a second for the `while ... read`-based version (reading through the file once).
The example above also shows a better way to set `IFS`, to a newline (with thanks to Peter from the comments below), and this will hopefully fix some of the other issues sometimes seen when using `while ... read ...` in Bash.
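As a concrete illustration of why the `IFS` setting matters (my own example, not from the original answer): with the default `IFS`, `read` strips leading and trailing whitespace from each line, while an empty `IFS` (equivalent to `IFS=$'\n'` for single-line reads, since a line never contains a newline) preserves it. The `-r` flag additionally stops `read` from treating backslashes as escape characters.

```shell
# Default IFS (space, tab, newline): leading whitespace is stripped.
printf '  indented\n' | { read -r line; printf '[%s]\n' "$line"; }
# prints [indented]

# Empty IFS: the leading whitespace survives.
printf '  indented\n' | { IFS= read -r line; printf '[%s]\n' "$line"; }
# prints [  indented]
```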