How to close file in awk while generating a list of?

笑着哭i submitted on 2021-02-08 10:18:37

Question


I'm trying to find a way to avoid the awk error "too many open files". Here's my situation:

INPUT: an ASCII file with many lines that follow this scheme:

NODE_212_lenght.._1
NODE_212_lenght.._2
NODE_213_lenght.._1
NODE_213_lenght.._2

To split this file so that all records with the same NODE number end up in the same output file, I used this awk one-liner:

awk -F "_" '{print >("orfs_for_node_" $2 "")}' <file

When the file has many lines, this command keeps failing with "too many open files". I also tried splitting the input into 2k-line chunks, with the same result. I can't really go below 2k lines, because the input is a huge file.

I know awk can close a file after writing to it, but I don't know how to do that. I tried adding a close() call:

awk -F "_" '{print >("orfs_for_node_" $2 ""); close(orfs_for_node_*)}' <file 

but this produces no output.


Answer 1:


If you switch to GNU awk, it will handle this for you. Otherwise, this is the right syntax if your input file has all the lines for each $2 value grouped together:

awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print > out; prev=out}' file

If the lines are not grouped, you need to use >> instead of >:

awk -F '_' '{out="orfs_for_node_"$2} out!=prev{close(prev)} {print >> out; prev=out}' file

Note that in the second case you need to empty any pre-existing output files (e.g. from a previous run) before running it, since it always appends to the output files.
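For input that is not grouped, one possible way to avoid clearing old output files by hand is to truncate each file the first time it shows up in the current run and append from then on. This is only a minimal sketch, not part of the answer above: the file name sample_nodes.txt and the seen array are hypothetical names chosen for illustration, and the sample data is made up to match the scheme in the question.

```shell
# Build a small, deliberately ungrouped sample (hypothetical file name).
printf '%s\n' \
    'NODE_212_lenght.._1' \
    'NODE_213_lenght.._1' \
    'NODE_212_lenght.._2' \
    'NODE_213_lenght.._2' > sample_nodes.txt

# Simulate a leftover file from an earlier run; it should be overwritten.
echo 'stale line' > orfs_for_node_212

awk -F '_' '{
    out = "orfs_for_node_" $2
    if (!(out in seen)) {   # first time this file name appears in this run
        printf "" > out     # create the file or truncate an old copy
        close(out)
        seen[out] = 1
    }
    print >> out            # append the current record
    close(out)              # so at most one output file is open at a time
}' sample_nodes.txt
```

Closing after every record keeps the number of open files at one, at the cost of reopening a file for each line, so on a huge input it will likely be slower than the grouped version. The seen array grows with the number of distinct node numbers, but it uses memory rather than file descriptors.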




Answer 2:


As I understand it, you are looking for the right moment to close the file. For your example input, you can do:

awk -F "_" 'BEGIN{prefix="orfs_for_node_"} 
NR>1&&$2!=last{close(prefix""last)}{last=$2;print >(prefix$2)}' inputFile

It checks whether $2 has changed and, if so, closes the file that belongs to the previous $2 value. This assumes that the lines in your file are sorted by $2.

If the file is not sorted by $2, use >> instead of >.
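Another option for unsorted input, sketched here under the assumption that reordering the lines is acceptable, is to sort on the node number first so that the close-on-change approach with > applies. The file name unsorted_nodes.txt is hypothetical and the sample data is invented to match the scheme in the question.

```shell
# Build a small unsorted sample (hypothetical file name).
printf '%s\n' \
    'NODE_213_lenght.._1' \
    'NODE_212_lenght.._1' \
    'NODE_213_lenght.._2' \
    'NODE_212_lenght.._2' > unsorted_nodes.txt

# -t _ sets the field separator for sort; -k 2,2n sorts numerically
# on the second field only (the node number).
sort -t _ -k 2,2n unsorted_nodes.txt |
awk -F '_' -v prefix="orfs_for_node_" '
    # When the node number changes, close the file of the previous node.
    NR > 1 && $2 != last { close(prefix last) }
    # Each file is opened once per run, so > truncates old content.
    { last = $2; print > (prefix $2) }
'
```

Because each output file is opened exactly once after sorting, > is enough and leftovers from a previous run are overwritten. Lines within one node may come out in a different order than in the input, depending on how sort breaks ties.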



Source: https://stackoverflow.com/questions/51209508/how-to-close-file-in-awk-while-generating-a-list-of
