Quickly replace first line of large file

孤独总比滥情好 2021-02-19 18:02

I have many large CSV files (1-10 GB each) which I'm importing into databases. For each file, I need to replace the first line so that the headers match the column names.

2 answers

有刺的猬 (original poster)

2021-02-19 18:57

The only thing that can significantly speed this up is truly replacing the first line in place. If the new first line is no longer than the old one, carefully overwrite the old line in place, padding with spaces if needed.
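
A minimal Python sketch of the in-place overwrite, since the thread does not name a language. The function name `replace_header_in_place` is made up for illustration. It assumes an encoding where a space is a single byte (ASCII or UTF-8, not UTF-16), and that the importer trims the trailing spaces that end up on the last column name.

```python
def replace_header_in_place(path: str, new_header: str, encoding: str = "utf-8") -> bool:
    """Overwrite the first line of `path` without touching the rest of the file.

    Returns False and leaves the file unchanged if the new header is longer
    than the old one. Assumes a single-byte space character in `encoding`.
    """
    new_bytes = new_header.encode(encoding)
    with open(path, "r+b") as f:
        old_line = f.readline()  # includes the line terminator, if present

        # Keep whatever line terminator the file already uses.
        if old_line.endswith(b"\r\n"):
            eol = b"\r\n"
        elif old_line.endswith(b"\n"):
            eol = b"\n"
        else:
            eol = b""

        room = len(old_line) - len(eol)  # bytes available for header text
        if len(new_bytes) > room:
            return False

        # Pad with spaces so the line keeps exactly its original byte length.
        f.seek(0)
        f.write(new_bytes + b" " * (room - len(new_bytes)) + eol)
    return True
```

Only the first line's bytes are written, so the cost does not depend on the file size. When the function returns False, one of the copying approaches is needed instead.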

Otherwise, you have to create a new file and copy everything after the first line into it. You may be able to optimize the copy a bit by adjusting buffer sizes, explicitly copying as binary, or pre-allocating the output size, but that does not change the fact that you need to copy the whole file.
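
For the copy route, something like the following Python sketch might do. `rewrite_with_new_header` and the 16 MiB default buffer are illustrative choices, not measured optima. It copies in binary mode with a large buffer and does not attempt pre-allocation.

```python
import shutil


def rewrite_with_new_header(src: str, dst: str, new_header: str,
                            encoding: str = "utf-8",
                            buffer_size: int = 16 * 1024 * 1024) -> None:
    """Write `new_header` to `dst`, then copy everything after the first
    line of `src` as raw bytes. `src` is left unmodified."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        old_line = fin.readline()  # consume the old header

        # Reuse the source file's line terminator (default to "\n").
        eol = b"\r\n" if old_line.endswith(b"\r\n") else b"\n"
        fout.write(new_header.encode(encoding) + eol)

        # Binary block copy of the remainder; no decoding or line splitting.
        shutil.copyfileobj(fin, fout, buffer_size)
```

Because the body is copied as bytes, the data rows are never decoded, which avoids encoding surprises in the bulk of the file. Only the new header has to be encoded to match.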

One more cheat, if you are planning to load the CSV data into a DB anyway: if row order does not matter, you can read some lines from the beginning, replace them with the new header, and append the removed lines to the end of the file.
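
A rough Python sketch of that cheat, with a hypothetical function name. It assumes row order is irrelevant, no quoted field contains a line break, a space is a single byte in the encoding, and the importer trims trailing spaces from the header. The displaced rows are appended before the header is overwritten, so an interruption in between leaves duplicate rows rather than lost ones.

```python
import os


def replace_header_by_moving_rows(path: str, new_header: str,
                                  encoding: str = "utf-8") -> int:
    """Make room for a longer header by moving the first few data rows to
    the end of the file. Returns the number of rows moved."""
    new_bytes = new_header.encode(encoding)
    with open(path, "r+b") as f:
        old_header = f.readline()
        eol = b"\r\n" if old_header.endswith(b"\r\n") else b"\n"
        needed = len(new_bytes) + len(eol)

        # Read data rows until the consumed region can hold the new header.
        consumed = len(old_header)
        displaced = []
        while consumed < needed:
            row = f.readline()
            if not row:
                raise ValueError("file too small to make room for the header")
            displaced.append(row)
            consumed += len(row)

        # Build the tail, making sure every moved row ends with a terminator.
        tail = b"".join(r if r.endswith(b"\n") else r + eol for r in displaced)
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if consumed < size:
            # The untouched remainder may lack a final line terminator.
            f.seek(-1, os.SEEK_END)
            if f.read(1) != b"\n":
                tail = eol + tail

        # Step 1: append the displaced rows at the end of the file.
        f.seek(0, os.SEEK_END)
        f.write(tail)

        # Step 2: overwrite the consumed region with the space-padded header.
        f.seek(0)
        f.write(new_bytes + b" " * (consumed - needed) + eol)
    return len(displaced)
```

Only the header region and the few moved rows are written, so the bulk of the file is never copied.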

Side note: if this is a one-time operation, I'd simply copy the files and be done with it. Debugging code that inserts data into the middle of a text file with a potentially different encoding may not be worth the effort.
