Fastest way to print a single line in a file

礼貌的吻别 2021-02-02 12:44

I have to fetch one specific line out of a big file (1,500,000 lines), multiple times in a loop over multiple files. I was asking myself what the best option would be.

5 Answers

别跟我提以往 (original poster)

2021-02-02 13:34
If you are really just getting the very first line and reading hundreds of files, then consider shell builtins instead of external commands: use read, which is a builtin in bash and ksh. This eliminates the overhead of process creation with awk, sed, head, etc.
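A minimal sketch of that builtin approach, assuming bash. The helper name `first_line` and the sample files are made up for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first line of each file given as an argument.
# The function body uses only shell builtins (read, printf, [), so no child
# process is created per file.
first_line() {
    local file line
    for file in "$@"; do
        line=''   # reset so a failed read cannot reuse the previous file's line
        # IFS= and -r keep leading whitespace and backslashes intact.
        # The [ -n "$line" ] check covers a first line with no trailing newline.
        if IFS= read -r line < "$file" || [ -n "$line" ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Illustration with throwaway files (names are made up for this example).
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/a.txt"
printf 'gamma\ndelta\n' > "$tmpdir/b.txt"
first_line "$tmpdir/a.txt" "$tmpdir/b.txt"
rm -rf "$tmpdir"
```

This only helps for the first line (or the first few). Reading far into a large file with a `read` loop is likely to be slower than an external tool, so it is worth timing both for your case.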

The other issue is doing timed performance analysis on I/O. The first time you open and read a file, its data is probably not cached in memory. If you then run a second command on the same file, the data and the inode are already cached, so the timed results may be faster, pretty much regardless of the command you use. Inodes can also stay cached practically forever. They do on Solaris, for example, or at any rate for several days.

For example, Linux caches everything and the kitchen sink, which is a good performance attribute. But it makes benchmarking problematic if you are not aware of the issue.

All of this caching effect "interference" is both OS and hardware dependent.

So: pick one file and read it with a command. Now it is cached. Run the same test command several dozen times; this samples the cost of the command and of child process creation, not your I/O hardware.

This is sed vs read for 10 iterations of getting the first line of the same file, after reading the file once:

sed: sed '1{p;q}' uopgenl20121216.lis
```
real    0m0.917s
user    0m0.258s
sys     0m0.492s
```
read: read foo < uopgenl20121216.lis ; export foo; echo "$foo"
```
real    0m0.017s
user    0m0.000s
sys     0m0.015s
```
This is clearly contrived, but it does show the performance difference between using a builtin and using an external command.
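A comparison of this kind could be reproduced with something like the sketch below, assuming bash. The sample file is generated on the fly and stands in for the real data, and the loop structure is my assumption about how such a test might be set up, not the exact script behind the numbers above. Note that `-n` is added to the sed call so the line is printed only once, and the trailing `;` before `}` keeps the expression portable across sed implementations.

```shell
#!/usr/bin/env bash
# Throwaway sample file standing in for the real data (hypothetical content).
sample=$(mktemp)
seq 1 100000 > "$sample"

# Read the file once so it is cached; the timings below then reflect
# command and process-creation overhead rather than disk I/O.
cat "$sample" > /dev/null

iterations=10

echo "sed:"
time for ((i = 0; i < iterations; i++)); do
    # External command: one child process per iteration.
    sed -n '1{p;q;}' "$sample" > /dev/null
done

echo "read:"
time for ((i = 0; i < iterations; i++)); do
    # Shell builtin: no child process is created.
    IFS= read -r foo < "$sample"
    echo "$foo" > /dev/null
done

rm -f "$sample"
```

The absolute numbers will differ by machine and OS; what matters is the relative gap between the two loops.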