Is there a simple shell command/script that supports excluding certain files/folders from being archived?
I have a directory that needs to be archived with a sub dire
If you are trying to exclude Version Control System (VCS) files, tar already supports two useful options for that! :)
--exclude-vcs
This option excludes files and directories used by the following version control systems: CVS, RCS, SCCS, SVN, Arch, Bazaar, Mercurial, and Darcs.
As of version 1.32, the following files are excluded:
CVS/, and everything under it
RCS/, and everything under it
SCCS/, and everything under it
.git/, and everything under it
.gitignore
.gitmodules
.gitattributes
.cvsignore
.svn/, and everything under it
.arch-ids/, and everything under it
{arch}/, and everything under it
=RELEASE-ID
=meta-update
=update
.bzr
.bzrignore
.bzrtags
.hg
.hgignore
.hgtags
_darcs
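To see the effect for yourself, here is a minimal sketch, assuming GNU tar is installed as `tar`. The scratch directory and the file names in it (`project`, `src/main.c`) are made up for illustration.

```shell
# Build a throwaway project that contains VCS metadata (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/project/.git" "$work/project/src"
echo 'ref: refs/heads/main' > "$work/project/.git/HEAD"
echo '*.o' > "$work/project/.gitignore"
echo 'int main(void) { return 0; }' > "$work/project/src/main.c"

# Archive it with --exclude-vcs (placed before the file arguments),
# then list what actually ended up in the archive.
tar -czf "$work/project.tar.gz" --exclude-vcs -C "$work" project
listing=$(tar -tzf "$work/project.tar.gz")
echo "$listing"

# Clean up the scratch directory.
rm -rf "$work"
```

The listing should contain `project/src/main.c` but nothing from `.git/`.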
--exclude-vcs-ignores
When archiving directories that are under some version control system (VCS), it is often convenient to read exclusion patterns from this VCS's ignore files (e.g. .cvsignore, .gitignore, etc.). This option provides that possibility.
Before archiving a directory, tar checks whether it contains any of the following files: .cvsignore, .gitignore, .bzrignore, or .hgignore. If so, it reads ignore patterns from these files.
The patterns are treated much as the corresponding VCS would treat them, i.e.:
.cvsignore
Contains shell-style globbing patterns that apply only to the directory where this file resides. No comments are allowed in the file. Empty lines are ignored.
.gitignore
Contains shell-style globbing patterns. Applies to the directory where .gitignore is located and all its subdirectories. Any line beginning with a # is a comment. Backslash escapes the comment character.
.bzrignore
Contains shell globbing patterns and regular expressions (if prefixed with RE:). Patterns affect the directory and all its subdirectories. Any line beginning with a # is a comment.
.hgignore
Contains POSIX regular expressions. The line "syntax: glob" switches to shell globbing patterns. The line "syntax: regexp" switches back. Comments begin with a #. Patterns affect the directory and all its subdirectories.
tar -czv --exclude-vcs --exclude-vcs-ignores -f path/to/my-tar-file.tar.gz path/to/my/project/
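A quick way to check how the ignore files are honoured is to try the options on a scratch tree. This is only a sketch: it assumes a GNU tar recent enough to support --exclude-vcs-ignores, and the directory and file names (`project`, `debug.log`, `build/trace.log`) are invented for the example.

```shell
# Scratch project with a .gitignore that ignores *.log (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/project/build"
echo '*.log' > "$work/project/.gitignore"
echo 'keep me' > "$work/project/README"
echo 'noise' > "$work/project/debug.log"
# Going by the .gitignore description above, this one should be skipped too.
echo 'noise' > "$work/project/build/trace.log"

# Both options come before the directory they should apply to.
tar -czf "$work/project.tar.gz" --exclude-vcs --exclude-vcs-ignores \
    -C "$work" project
listing=$(tar -tzf "$work/project.tar.gz")
echo "$listing"

rm -rf "$work"
```

If the listing still shows `debug.log`, your tar is probably too old or is not GNU tar.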
I want to have a fresh front-end version (the angular folder) on localhost. The .git folder is huge in my case, so I want to exclude it. I need to download the archive from the server and unpack it in order to run the application.
Compress the angular folder from /var/lib/tomcat7/webapps and write the archive to /tmp as angular.23.12.19.tar.gz.
Command:
tar --exclude='.git' -zcvf /tmp/angular.23.12.19.tar.gz /var/lib/tomcat7/webapps/angular/
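Before downloading, it may be worth confirming that .git really was left out. The sketch below does that on a scratch copy rather than the real /var/lib/tomcat7/webapps. The use of -C (so that member names are stored relative to webapps) is my own variation and not part of the command above.

```shell
# Scratch stand-in for the webapps folder (hypothetical contents).
work=$(mktemp -d)
mkdir -p "$work/webapps/angular/.git/objects" "$work/webapps/angular/dist"
echo 'blob' > "$work/webapps/angular/.git/objects/pack"
echo '<html></html>' > "$work/webapps/angular/dist/index.html"

# Same --exclude as above; -C makes the archive paths start at "angular/".
tar --exclude='.git' -zcf "$work/angular.tar.gz" -C "$work/webapps" angular

# Count archive members that live under a .git directory.
# grep -c exits non-zero when the count is 0, hence the "|| true".
git_entries=$(tar -tzf "$work/angular.tar.gz" | grep -c '/\.git' || true)
echo "entries under .git: $git_entries"

rm -rf "$work"
```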
To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ... .
# archive a given directory, but exclude various files & directories
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
-or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 |
gnutar --null --no-recursion -czf archive.tar.gz --files-from -
#bsdtar --null -n -czf archive.tar.gz -T -
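Here is a small self-contained run of the same idea that you can try safely. It is a sketch under a few assumptions: `tar` is GNU tar, the paths are relative to a scratch directory instead of absolute, and the names `skipdir` and `skipfile.txt` are placeholders.

```shell
# Scratch tree with one directory and one file to leave out.
work=$(mktemp -d)
mkdir -p "$work/src/keep" "$work/src/skipdir"
echo a > "$work/src/keep/a.txt"
echo b > "$work/src/skipdir/b.txt"
echo c > "$work/src/skipfile.txt"

(
  cd "$work/src" &&
  # Prune skipdir entirely; print every other path except skipfile.txt,
  # NUL-terminated so unusual file names survive the pipe.
  find . -type d -path './skipdir' -prune \
    -o ! -path './skipfile.txt' -print0 |
  # --no-recursion: find already lists every entry, so tar must not descend.
  tar --null --no-recursion -czf "$work/archive.tar.gz" --files-from -
)

listing=$(tar -tzf "$work/archive.tar.gz")
echo "$listing"

rm -rf "$work"
```

The listing should include `./keep/a.txt` and omit both `skipdir` and `skipfile.txt`.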
The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
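#!/bin/bash
# Prompt for the archive name and the directory to archive.
echo -n "Please enter the name of the tar file you wish to create, without extension: "
read -r nam
echo -n "Please enter the path to the directory to tar: "
read -r pathin
# Build one --exclude '<path>' argument per file matching *.CC (case-insensitive).
# xargs joins them onto one line and strips the backslashes before the quotes.
excludes=$(find "$pathin" -iname "*.CC" -exec echo "--exclude \'{}\'" \; | xargs)
# Print the full command so it can be copied and pasted back into the shell.
echo tar -czvf "$nam.tar.gz" $excludes "$pathin"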
This will print out the command you need, and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.
Just change *.CC to any other extension, file name or glob pattern you want to exclude and this should still work.
EDIT
Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC). For each entry, echo prints --exclude 'one entry from the list', and xargs joins the results onto a single line. The backslashes (\) are escape characters for the ' marks.
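As for a more direct way, one possibility (a sketch, not the method from the answer above) is to hand the pattern to tar's own --exclude and skip find altogether. The bracket expressions stand in for the case-insensitive match that -iname gives; the directory and file names are invented, and GNU tar is assumed.

```shell
# Scratch tree with .CC/.cc files to leave out (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/code/lib"
echo x > "$work/code/main.CC"
echo y > "$work/code/lib/util.cc"
echo z > "$work/code/lib/util.h"

# *.[cC][cC] matches .cc, .CC, .cC and .Cc at any depth.
tar -czf "$work/code.tar.gz" --exclude='*.[cC][cC]' -C "$work" code
listing=$(tar -tzf "$work/code.tar.gz")
echo "$listing"

rm -rf "$work"
```

This avoids generating one --exclude per file, so it also sidesteps very long command lines.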