Quickly replace first line of large file

孤独总比滥情好 2021-02-19 18:02

I have many large CSV files (1-10 GB each) which I'm importing into databases. For each file, I need to replace the first line so that the headers match the column names.

2 answers
  •  有刺的猬
    2021-02-19 18:57

    The only thing that can significantly speed this up is overwriting the first line in place. If the new first line is no longer (in bytes) than the old one, carefully overwrite the old line with it, padding with spaces if needed so that the rest of the file does not move.
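
    A minimal Python sketch of the in-place overwrite described above. The function name `replace_header_in_place` is hypothetical, and the sketch assumes an ASCII-compatible encoding such as UTF-8, no byte-order mark at the start of the file, and an importer that tolerates trailing spaces after the last column name:

```python
def replace_header_in_place(path: str, new_header: str, encoding: str = "utf-8") -> bool:
    """Overwrite the first line of `path` without rewriting the rest of the file.

    Returns True on success, or False (leaving the file untouched) when the
    new header needs more bytes than the old one occupies.
    """
    new_bytes = new_header.encode(encoding)
    with open(path, "r+b") as f:
        old_line = f.readline()  # includes the line terminator, if present

        # Keep whichever line terminator the file already uses.
        if old_line.endswith(b"\r\n"):
            eol = b"\r\n"
        elif old_line.endswith(b"\n"):
            eol = b"\n"
        else:
            eol = b""

        room = len(old_line) - len(eol)
        if len(new_bytes) > room:
            return False  # does not fit; the caller has to copy the file instead

        # Pad with spaces so the header occupies exactly the old line's bytes.
        f.seek(0)
        f.write(new_bytes.ljust(room, b" ") + eol)
    return True
```

    Only the first few bytes are written, so the cost does not depend on the file size. The comparison is done on encoded bytes rather than characters, because a header with non-ASCII characters can be shorter in characters but longer in bytes.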

    Otherwise, you have to create a new file, write the new header, and copy everything after the old first line. You may be able to optimize the copy a bit by adjusting buffer sizes, copying explicitly as binary, or pre-allocating the output size, but that does not change the fact that you need to copy the whole file.
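
    A sketch of the copy approach in Python, treating the data as raw bytes and using a large buffer. The function name `replace_header_by_copy` and the 16 MiB default buffer are illustrative assumptions, not measured optima:

```python
import shutil


def replace_header_by_copy(src: str, dst: str, new_header: str,
                           encoding: str = "utf-8",
                           buffer_size: int = 16 * 1024 * 1024) -> None:
    """Write `new_header` to `dst`, then copy everything after the first line of `src`."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        old_line = fin.readline()  # consume and discard the old header

        # Reuse the source file's line terminator for the new header.
        eol = b"\r\n" if old_line.endswith(b"\r\n") else b"\n"
        fout.write(new_header.encode(encoding) + eol)

        # Binary copy of the remainder in large chunks; no decoding, no line splitting.
        shutil.copyfileobj(fin, fout, buffer_size)
```

    Copying as bytes avoids decoding and re-encoding gigabytes of text, which is usually the expensive part of a naive line-by-line rewrite. Once the copy has finished, the new file can take the place of the original, for example with `os.replace`.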

    One more cheat, if you are planning to load the CSV data into a database anyway: if row order does not matter, you can read a few lines from the beginning of the file, overwrite them with the new header, and append the removed lines to the end of the file.
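
    A sketch of that trick in Python. The function name `replace_header_moving_rows` is hypothetical, and the sketch assumes an ASCII-compatible encoding, no quoted fields containing line breaks, and an importer that tolerates trailing spaces after the last column name:

```python
import os


def replace_header_moving_rows(path: str, new_header: str, encoding: str = "utf-8") -> None:
    """Make room for a longer header by moving the first data rows to the end of the file."""
    new_bytes = new_header.encode(encoding)
    with open(path, "r+b") as f:
        header = f.readline()
        eol = b"\r\n" if header.endswith(b"\r\n") else b"\n"
        needed = len(new_bytes) + len(eol)

        # Take whole rows from the start until the freed region can hold the new header.
        consumed = len(header)
        moved = []
        while consumed < needed:
            line = f.readline()
            if not line:
                raise ValueError("file is too small to hold the new header")
            moved.append(line)
            consumed += len(line)

        # Make sure any remaining data ends with a line terminator before appending.
        f.seek(0, os.SEEK_END)
        size = f.tell()
        if size > consumed:
            f.seek(size - 1)
            if f.read(1) != b"\n":
                f.seek(0, os.SEEK_END)
                f.write(eol)

        # Append the displaced rows first: an interruption at this point leaves
        # duplicated rows rather than lost ones.
        f.seek(0, os.SEEK_END)
        for line in moved:
            f.write(line if line.endswith(b"\n") else line + eol)

        # Overwrite the freed region with the new header, padded to its exact size.
        f.seek(0)
        f.write(new_bytes.ljust(consumed - len(eol), b" ") + eol)
```

    Only the header region and a handful of rows are rewritten, so this stays fast regardless of file size, at the cost of changing the row order.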

    Side note: if this is a one-time operation, I'd simply copy the files and be done with it. Debugging code that inserts data into the middle of a text file with a potentially different encoding may not be worth the effort.
