I have many files that have the same prefix, only the bit after the underscore is different. And I have many prefixes as well! Underscore does not appear anywhere else in th
You can do something like:
cat /path/prefix* >> new_file
It will cat (that is, concatenate files and print on the standard output) all files whose names start with /path/prefix. The rest of the name is the part that can differ. Because of >>, the output is appended to new_file.
Before executing that, it is good to run ls /path/prefix* to make sure it picks up all (and only) the files you want to take into consideration.
$ ls
aa_bb prefix_23 prefix_235 prefix_nnn
$ ls prefix_*
prefix_23 prefix_235 prefix_nnn
I had a similar problem: I had many files and wanted to group and cat them by prefix. I used this little script:
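# Print each distinct prefix once (the text before the first underscore),
# then concatenate that prefix's files into all_<prefix>.txt.
ls | awk -F '_' 'NF > 1 && !x[$1]++ {print $1}' | while read -r line
do
    cat "$line"_* > "all_$line.txt"
done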
ls will show all the files in the directory.
In awk, the -F '_' option sets the underscore as the field delimiter, and the code itself acts like uniq, meaning it will print each prefix only once.
Then we run a loop over all prefixes and cat all the files with the same prefix.
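If you would rather not parse the output of ls, the same grouping can be sketched with shell globbing and parameter expansion alone. This is only a sketch: merge_by_prefix is a name I made up, and I am assuming the merged files go into a separate output directory so they are not picked up as input on a later run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge files named <prefix>_<rest> found in directory $1
# into $2/all_<prefix>.txt. The function name and the output naming are assumptions.
merge_by_prefix() {
    local src=$1 out=$2 f name prefix
    mkdir -p "$out"
    for f in "$src"/*_*; do
        [ -f "$f" ] || continue      # skip directories and an unmatched glob
        name=${f##*/}                # drop the directory part
        prefix=${name%%_*}           # keep the text before the first underscore
        cat "$f" >> "$out/all_$prefix.txt"
    done
}
```

Call it as merge_by_prefix . ../merged, for example. Since it appends, the output files should not exist before the first run. The glob expands in sorted order, so the files of each prefix are concatenated in that order.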
If the number of files is very large, plain shell globbing (prefix_* and the like) is sometimes not suitable, because the expanded command line can exceed the system's argument-length limit.
In that case you can let find append the files one by one:
find dir -type f -name 'prefix_*' -exec cat {} \; >> result
This will append all files matching prefix_* one by one to the file result (which shouldn't exist in the beginning; if in doubt, use rm result).
If you have lots of different prefixes, you can of course append one group after the other without removing the result file in between.
All the other options the Unix tool find offers can be used as well, of course. But if you need help with that, feel free to ask again.
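One thing to keep in mind is that find does not promise any particular order. If the order of concatenation matters, something along these lines might do. It is a sketch only: append_sorted is a hypothetical helper name, and it assumes your find, sort and xargs support -print0, -z and -0 (the GNU and BSD versions do).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of find itself.
# $1 = directory to search, $2 = prefix, $3 = result file (appended to)
append_sorted() {
    find "$1" -type f -name "${2}_*" -print0 |
        LC_ALL=C sort -z |           # byte-wise sort of the NUL-separated paths
        xargs -0 cat >> "$3"         # xargs splits into several cat calls if needed
}
```

For several prefixes you could then loop, e.g. for p in prefixA prefixB; do append_sorted dir "$p" "all_$p.txt"; done, where prefixA and prefixB stand in for your real prefixes.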
I had to do something very similar, and I don't feel like the previous answers here solve your problem, as they require a huge amount of manual input when there are many different prefixes rather than just a few prefixes with lots of files each. If I knew the pattern of your prefix I could give you more specific advice, but for now I'm just going to assume that your prefix is a number with leading zeros (as it is with my files). I am going to assume the following layout, though it need not match exactly for this to work:
~/test01/001-test.txt
~/test01/002-test.txt
~/test01/003-test.txt
~/test02/001-test.txt
~/test02/002-test.txt
~/test02/003-test.txt
Once this is set up, I'm going to change into a merge directory, where I want all my merged files to be written, and then run the cat command in a for loop.
cd ~/merge
for i in {001..003}; do cat ../test*/"$i"*.txt > "$i"-merge.txt ; done
This will use 001, 002, and 003 as prefixes and look in all of the test directories for files that match these prefixes and merge them together in the order they're found. The end result will appear in:
~/merge/001-merge.txt
~/merge/002-merge.txt
~/merge/003-merge.txt
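If the prefixes are not a neat numeric range, they can also be discovered from the file names instead of being listed in the brace expansion. Here is a rough sketch of that idea, assuming the same layout as above. merge_numbered is a hypothetical helper name, and the "-" delimiter and the <prefix>-merge.txt naming are simply taken from my example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 = output directory, remaining arguments = input files
# named <prefix>-<rest>. Each file is appended to <output dir>/<prefix>-merge.txt.
merge_numbered() {
    local out=$1 f name prefix
    shift
    mkdir -p "$out"
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip an unmatched glob or non-files
        name=${f##*/}                # drop the directory part
        prefix=${name%%-*}           # keep the text before the first hyphen
        cat "$f" >> "$out/$prefix-merge.txt"
    done
}
```

It would be called as merge_numbered ~/merge ~/test*/*-test.txt. The glob expands in sorted order, so for each prefix the test01 file comes before the test02 file. Because the function appends, the merge directory should start out empty.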
I know this is very late, but hopefully it helps someone else. I have to do this with 5000 prefixes, so I completely understand.