I have a large file containing nearly 250 million characters. I want to split it into parts of 30 million characters each (so the first 8 parts will contain 30 million characters and the last part the remainder), with an overlap of 1000 characters between consecutive parts.
One way is to use standard Unix commands to split the file and then prepend the last 1000 bytes of the previous part to each subsequent part.
First split the file:
split -b 30000000 inputfile part.
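The behavior of split can be checked on a small sample first (scaled down here to 1000-byte chunks; the suffixes aa, ab, ac are what split produces by default):

```shell
# Scaled-down demo: a 2500-byte file split into 1000-byte chunks
# (the real command uses -b 30000000 on the 250-million-character input).
head -c 2500 /dev/zero | tr '\0' 'x' > sample
split -b 1000 sample samplepart.
ls samplepart.*      # samplepart.aa, samplepart.ab, samplepart.ac
wc -c samplepart.*   # the first two are 1000 bytes each, the last 500
```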
Then, for each part (ignoring the first), make a new file starting with the last 1000 bytes of the previous part:
unset prev
for i in part.*
do
    if [ -n "${prev}" ]
    then
        tail -c 1000 "${prev}" > part.temp
        cat "${i}" >> part.temp
        mv part.temp "${i}"
    fi
    prev=${i}
done
Before reassembling, we iterate over the files again, ignoring the first, and throw away the first 1000 bytes of each remaining part:
unset prev
for i in part.*
do
    if [ -n "${prev}" ]
    then
        tail -c +1001 "${i}" > part.temp
        mv part.temp "${i}"
    fi
    prev=${i}
done
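The tail -c +1001 form is worth noting: with a leading +, tail starts output at the given 1-based byte offset rather than taking bytes from the end, so +1001 drops exactly the first 1000 bytes. A quick demonstration:

```shell
# tail -c +N outputs from byte N (1-based) to the end of the input,
# so +3 skips the first 2 bytes.
printf 'abcdef' | tail -c +3    # prints "cdef"
```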
The last step is to reassemble the files (using > rather than >> so an existing newfile is not appended to):
cat part.* > newfile
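The whole pipeline can be verified end to end on a small file; this sketch scales the sizes down (1000-byte parts, 100-byte overlap) but follows the same steps, then compares the reassembled file against the original:

```shell
set -e
seq 1 500 > inputfile            # small stand-in for the real 250 MB file
split -b 1000 inputfile part.

# Prepend the last 100 bytes of each previous part (the overlap step).
unset prev
for i in part.*
do
    if [ -n "${prev}" ]
    then
        tail -c 100 "${prev}" > part.temp
        cat "${i}" >> part.temp
        mv part.temp "${i}"
    fi
    prev=${i}
done

# Strip the prepended 100 bytes again before reassembly.
unset prev
for i in part.*
do
    if [ -n "${prev}" ]
    then
        tail -c +101 "${i}" > part.temp
        mv part.temp "${i}"
    fi
    prev=${i}
done

cat part.* > newfile
cmp inputfile newfile && echo "round trip OK"
```

Note that the glob part.* is expanded once when each loop starts, before part.temp exists, so the temporary file never ends up in the list being processed.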
Since there was no explanation of why the overlap was needed, I just created it and then threw it away.