Find files and tar them (with spaces)

Backend · unresolved · 10 answers · 1700 views
北海茫月 · 2020-11-29 15:35

Alright, so simple problem here. I'm working on some simple backup code. It works fine except when the files have spaces in their names. This is how I'm finding the files and adding them to the tar.

10 Answers
  • 2020-11-29 16:11

    Why not:

    tar czvf backup.tar.gz *
    

    Sure it's clever to use find and then xargs, but you're doing it the hard way.

    Update: Porges has commented with a find option that I think is a better answer than mine, or the other one: find -print0 ... | xargs -0 ....
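
    A minimal sketch of that pairing (hypothetical file and archive names; assumes GNU find/xargs, and that all names fit into a single xargs invocation — see the warning in a later answer about multiple invocations):

```shell
# NUL-delimited names survive spaces (and even newlines) intact
mkdir -p demo
touch "demo/file with spaces.txt" demo/plain.txt
find demo -type f -print0 | xargs -0 tar czvf backup.tar.gz
tar tzf backup.tar.gz   # both names listed, spaces preserved
```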

  • 2020-11-29 16:13

    Why not give something like this a try: tar cvf scala.tar `find src -name '*.scala'` (quote the pattern, or the shell will expand it before find sees it)

  • 2020-11-29 16:14

    If you have multiple files or directories and you want to zip them into independent *.gz files, you can do this. The -type f and -mtime options are optional.

    find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;
    

    This will compress

    httpd-log01.txt
    httpd-log02.txt
    

    to

    httpd-log01.txt.gz
    httpd-log02.txt.gz
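
    Because find passes {} as a single argument to tar, this -exec form also copes with spaces in names. A quick sketch (hypothetical file names; note that substituting {} inside a larger word, as in {}.gz, is a GNU find extension):

```shell
mkdir -p logs
touch "logs/httpd-log 01.txt" "logs/httpd-log 02.txt"
# one tar invocation per file; {} expands to the full path, spaces and all
find logs -name "httpd-log*.txt" -type f -exec tar -vzcf {}.gz {} \;
ls logs   # each .txt file now has a matching .txt.gz next to it
```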
    
  • 2020-11-29 16:17

    Big warning on several of the solutions (and your own test):

    When you do: anything | xargs something

    xargs will try to fit "as many arguments as possible" after "something", but then you may end up with multiple invocations of "something".

    So your attempt: find ... | xargs tar czvf file.tgz may end up overwriting "file.tgz" on each invocation of "tar" by xargs, and you end up with only the last invocation! (The chosen solution uses GNU tar's special -T parameter to avoid the problem, but not everyone has GNU tar available.)

    You could do instead:

    find . -type f -print0 | xargs -0 tar -rvf backup.tar
    gzip backup.tar
    

    Proof of the problem on Cygwin:

    $ mkdir test
    $ cd test
    $ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs touch
        # create the files
    $ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar czvf archive.tgz
        # will invoke tar several times, as it can't fit 10000 long filenames into one invocation
    $ tar tzvf archive.tgz | wc -l
    60
        # on my machine, I end up with only the last 60 filenames,
        # as the last invocation of tar by xargs overwrote the previous one(s)
    
    # proper way to invoke tar: with -r (which appends to an existing tar file, whereas c overwrites it)
    # caveat: you can't have it compressed (you can't add to a compressed archive)
    $ seq 1 10000 | sed -e "s/^/long_filename_/" | xargs tar rvf archive.tar # -r, and without z
    $ gzip archive.tar
    $ tar tzvf archive.tar.gz | wc -l
    10000
        # we have all our files, despite xargs making several invocations of the tar command
    

    Note: that behavior of xargs is a well-known difficulty, and it is also why, when someone wants to do:

    find .... | xargs grep "regex"
    

    they instead have to write it as:

    find ..... | xargs grep "regex" /dev/null
    

    That way, even if the last invocation of grep by xargs is passed only one filename, grep still sees at least two filenames (each invocation gets /dev/null, where it will find nothing, plus the filename(s) xargs appends after it), and thus it always displays the file name when something matches "regex". Otherwise the last batch of results could show matches without a filename in front.
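
    A small sketch of the trick (hypothetical files; the -H flag shown as an alternative is GNU grep, not POSIX):

```shell
mkdir -p src
printf 'needle\n' > src/a.txt
printf 'needle\n' > src/b.txt
# the /dev/null padding guarantees grep sees >= 2 files, so names are printed:
find src -type f | xargs grep "needle" /dev/null
# with GNU grep, -H forces the filename prefix without the trick:
find src -type f | xargs grep -H "needle"
```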

  • 2020-11-29 16:20

    There could be another way to achieve what you want. Basically:

    1. Use the find command to output the paths of whatever files you're looking for, redirecting stdout to a filename of your choosing.
    2. Then use tar with the -T option, which allows it to take a list of file locations (the one you just created with find!):

      find . -name "*.whatever" > yourListOfFiles
      tar -cvf yourfile.tar -T yourListOfFiles
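
    If the found paths may themselves contain spaces or newlines, GNU find and GNU tar can pass the list NUL-delimited instead (-print0 on the find side, --null on the tar side). A sketch with hypothetical names:

```shell
mkdir -p project
touch "project/report final.whatever"
# NUL-terminated list, read back with tar's --null option
find project -name "*.whatever" -print0 > yourListOfFiles
tar --null -cvf yourfile.tar -T yourListOfFiles
tar tf yourfile.tar   # prints project/report final.whatever
```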
      
  • 2020-11-29 16:21

    The best solution seems to be to create a file list and then archive the files, because you can use other sources and do something else with the list.

    For example, this allows using the list to calculate the size of the files being archived:

    #!/bin/sh
    
    backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
    backupRoot="/var/www"
    backupOutPath=""
    
    archivePath="$backupOutPath$backupFileName.tar.gz"
    listOfFilesPath="$backupOutPath$backupFileName.filelist"
    
    #
    # Make a list of files/directories to archive
    # (start with an empty file, not an empty first line, which tar -T would choke on)
    #
    : > "$listOfFilesPath"
    echo "${backupRoot}/uploads" >> "$listOfFilesPath"
    echo "${backupRoot}/extra/user/data" >> "$listOfFilesPath"
    find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> "$listOfFilesPath"
    
    #
    # Size calculation (paths are quoted so entries with spaces still work)
    #
    sizeForProgress=$(
    while read -r nextFile; do
        if [ -n "$nextFile" ]; then
            du -sb "$nextFile"
        fi
    done < "$listOfFilesPath" | awk '{size+=$1} END {print size}'
    )
    
    #
    # Archive with progress
    #
    ## simple with dump of all files currently archived
    #tar -czvf "$archivePath" -T "$listOfFilesPath"
    ## progress bar
    sizeForShow=$((sizeForProgress/1024/1024))
    echo -e "\nRunning backup [source files are $sizeForShow MiB]\n"
    tar -cPp -T "$listOfFilesPath" | pv -s "$sizeForProgress" | gzip > "$archivePath"
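
    After a run like the one above, gzip -t can verify the compressed stream end-to-end. A self-contained sketch with hypothetical names (the tar | gzip shape matches the script's last line):

```shell
# build a tiny archive the same way (tar piped into gzip), then verify it
mkdir -p /tmp/bk-demo
touch /tmp/bk-demo/site.dat
tar -cP /tmp/bk-demo | gzip > backup-test.tar.gz
gzip -t backup-test.tar.gz && echo "archive OK"
tar tzf backup-test.tar.gz   # lists the archived paths
```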
    