How to concatenate files with the same prefix (and many prefixes)?


I have many files that have the same prefix, only the bit after the underscore is different. And I have many prefixes as well! Underscore does not appear anywhere else in th


4 Answers

长情又很酷

2021-01-15 23:49
You can do something like:
```
cat /path/prefix* >> new_file
```
This will cat (that is, concatenate and print to standard output) all files whose names start with /path/prefix; the * matches whatever follows the prefix. The >> appends that output to new_file.

Before running it, it is a good idea to run ls /path/prefix* to make sure the pattern picks up all the files you want, and only those.

Example
```
$ ls
aa_bb  prefix_23  prefix_235  prefix_nnn
$ ls prefix_*
prefix_23  prefix_235  prefix_nnn
```
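The check-then-concatenate routine above can be wrapped in a small function. This is only a sketch: `concat_prefix` is a made-up name, and it assumes the output file's name does not itself start with the prefix (otherwise it would be matched by the glob).

```bash
# concat_prefix PREFIX OUT   (hypothetical helper, not a standard command)
# Concatenates every file whose path starts with PREFIX into OUT.
concat_prefix() {
    local prefix=$1 out=$2
    # Expand the glob into the positional parameters.
    set -- "$prefix"*
    # With no match the pattern stays literal, so the first "file" does not exist.
    if [ ! -e "$1" ]; then
        echo "no files match ${prefix}*" >&2
        return 1
    fi
    # Preview the matched files on stderr, like running ls first.
    printf '%s\n' "$@" >&2
    # Overwrite OUT rather than append, so reruns do not duplicate content.
    cat "$@" > "$out"
}
```

For example, `concat_prefix /path/prefix_ new_file` would list the matching files and then write them to new_file.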
渐次进展

2021-01-15 23:53
I had a similar problem: I had many files and wanted to group and cat them by prefix. I used this little script:
```
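# Print each distinct prefix (the text before the first underscore) once,
# then concatenate every file named <prefix>_* into all_<prefix>.txt.
ls | awk -F '_' '!x[$1]++{print $1}' | while read -r line
do
    cat "${line}_"* > "all_${line}.txt"
done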
```
ls lists all the files in the current directory.

In awk, the -F '_' option sets the underscore as the field delimiter, and the expression !x[$1]++ acts like uniq: each prefix is printed only once.

Then we loop over the prefixes and cat all the files that share each prefix.
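The same grouping can also be done with a glob loop instead of parsing ls output, which copes with spaces in file names. A minimal sketch, assuming every input file has an underscore in its name; `group_by_prefix` and the separate output directory are illustrative choices, not part of the script above.

```bash
# group_by_prefix SRC_DIR OUT_DIR   (hypothetical helper)
# Appends each SRC_DIR/<prefix>_* file to OUT_DIR/all_<prefix>.txt.
group_by_prefix() {
    local src=$1 out=$2 f prefix
    mkdir -p "$out"
    # Remove results of an earlier run so that appending does not duplicate data.
    rm -f "$out"/all_*.txt
    for f in "$src"/*_*; do
        [ -f "$f" ] || continue      # skip directories and an unmatched pattern
        prefix=${f##*/}              # strip the directory part
        prefix=${prefix%%_*}         # keep the text before the first underscore
        cat "$f" >> "$out/all_${prefix}.txt"
    done
}
```

Writing the results to a different directory keeps the all_*.txt files from being picked up as input on the next run.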
遥遥无期

2021-01-16 00:01
If the number of files is very large, plain shell globbing (prefix_* and the like) may not be suitable, because the expanded command line can exceed the system limit and fail with an "Argument list too long" error.

In that case you can let find append them one by one:
```
# Pass the file name as an argument ($1) rather than embedding {} in the script text.
find dir -type f -name 'prefix_*' -exec bash -c 'cat "$1" >> result' _ {} \;
```
This appends every file matching prefix_* to the file result, one at a time (result should not exist beforehand; if in doubt, run rm result first).

If you have lots of different prefixes, you can of course append one group after the other without removing the result file in between.
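Appending one group after the other might look like the sketch below. It swaps the per-file bash -c call for find's `-exec ... {} +` form, which hands the files to cat in batches. `concat_with_find` is a hypothetical name, and the sketch assumes the result file lives outside the searched directory. Note that find does not return files in sorted order.

```bash
# concat_with_find DIR OUT PREFIX...   (hypothetical helper)
# Appends all files under DIR named <PREFIX>_* to OUT, one prefix group at a time.
concat_with_find() {
    local dir=$1 out=$2 p
    shift 2
    : > "$out"                       # start from an empty result file
    for p in "$@"; do
        # "{} +" passes many file names to each cat invocation.
        find "$dir" -type f -name "${p}_*" -exec cat {} + >> "$out"
    done
}
```

A call such as `concat_with_find dir result prefixA prefixB` (prefix names made up) would write all prefixA_ files first, then all prefixB_ files.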

All the other options the Unix tool find offers can be used as well of course. But if you need help with that, feel free to ask again.
予麋鹿

2021-01-16 00:09
I had to do something very similar, and I don't feel the previous answers solve your problem: they require a huge amount of manual input when there are many different prefixes, rather than just a few prefixes with lots of files each. If I knew the pattern of your prefixes I could give more specific advice, but for now I'll assume the prefix is a number with leading zeros (as it is with my files). I'll assume the following layout, although the approach does not depend on it:
```
~/test01/001-test.txt
~/test01/002-test.txt
~/test01/003-test.txt

~/test02/001-test.txt
~/test02/002-test.txt
~/test02/003-test.txt
```
Once this is set up, I change into a merge directory, where I want all the merged files to be written, and run the cat command in a for loop.
```
cd ~/merge

for i in {001..003}; do cat ../test*/"$i"*.txt > "$i"-merge.txt ; done
```
This uses 001, 002, and 003 as prefixes, looks in all of the test directories for files that match each prefix, and merges them in the order the glob expands them (sorted by path). The results end up in:
```
~/merge/001-merge.txt
~/merge/002-merge.txt
~/merge/003-merge.txt
```
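With thousands of prefixes, the list can also be derived from the file names instead of being written out as a brace range. A rough sketch, assuming the test*/NNN-name.txt layout shown above; `merge_across_dirs` is a made-up helper name.

```bash
# merge_across_dirs ROOT OUT_DIR   (hypothetical helper)
# For every prefix found in ROOT/test*/<prefix>-*.txt, writes OUT_DIR/<prefix>-merge.txt.
merge_across_dirs() {
    local root=$1 out=$2 f prefix
    mkdir -p "$out"
    # Collect the distinct prefixes (the text before the first "-").
    for f in "$root"/test*/*-*.txt; do
        if [ -f "$f" ]; then basename "$f"; fi
    done | cut -d- -f1 | sort -u |
    while IFS= read -r prefix; do
        # Same cat as in the loop above, run once per discovered prefix.
        cat "$root"/test*/"$prefix"-*.txt > "$out/$prefix-merge.txt"
    done
}
```

With the layout above, `merge_across_dirs ~ ~/merge` should produce the same three merged files without listing the prefixes by hand.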
I know this is quite late, but hopefully it helps someone else. I have to do this with 5000 prefixes, so I completely understand.