How can I convert tabs to spaces in every file of a directory (possibly recursively)?
Also, is there a way of setting the number of spaces per tab?
Warning: This will break your repo.
This will corrupt binary files, including those under
svn
,.git
! Read the comments before using!
find . -iname '*.java' -type f -exec sed -i.orig 's/\t/ /g' {} +
The original file is saved as [filename].orig
.
Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.
Downsides:
No body mentioned rpl
? Using rpl you can replace any string.
To convert tabs to spaces,
rpl -R -e "\t" " " .
very simple.
Collecting the best comments from Gene's answer, the best solution by far, is by using sponge
from moreutils.
sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;
Explanation:
./
is recursively searching from current directory-iname
is a case insensitive match (for both *.java
and *.JAVA
likes)type -f
finds only regular files (no directories, binaries or symlinks)-exec bash -c
execute following commands in a subshell for each file name, {}
expand -t 4
expands all TABs to 4 spacessponge
soak up standard input (from expand
) and write to a file (the same one)*. NOTE: * A simple file redirection (> "$0"
) won't work here because it would overwrite the file too soon.
Advantage: All original file permissions are retained and no intermediate tmp
files are used.
How can I convert tabs to spaces in every file of a directory (possibly recursively)?
This is usually not what you want.
Do you want to do this for png images? PDF files? The .git directory? Your
Makefile
(which requires tabs)? A 5GB SQL dump?
You could, in theory, pass a whole lot of exlude options to find
or whatever
else you're using; but this is fragile, and will break as soon as you add other
binary files.
What you want, is at least:
expand
does this, sed
doesn't).As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.
A while ago I created a little script called
sanitize_files which does exactly
that. It also fixes some other common stuff like replacing \r\n
with \n
,
adding a trailing \n
, etc.
You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updated than this post.
I would also like to point out, in response to some of the other answers here,
that using shell globbing is not a robust way of doing this, because sooner
or later you'll end up with more files than will fit in ARG_MAX
(on modern
Linux systems it's 128k, which may seem a lot, but sooner or later it's not
enough).
#!/usr/bin/env python
#
# http://code.arp242.net/sanitize_files
#
import os, re, sys
def is_binary(data):
return data.find(b'\000') >= 0
def should_ignore(path):
keep = [
# VCS systems
'.git/', '.hg/' '.svn/' 'CVS/',
# These files have significant whitespace/tabs, and cannot be edited
# safely
# TODO: there are probably more of these files..
'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
]
for k in keep:
if '/%s' % k in path:
return True
return False
def run(files):
indent_find = b'\t'
indent_replace = b' ' * indent_width
for f in files:
if should_ignore(f):
print('Ignoring %s' % f)
continue
try:
size = os.stat(f).st_size
# Unresolvable symlink, just ignore those
except FileNotFoundError as exc:
print('%s is unresolvable, skipping (%s)' % (f, exc))
continue
if size == 0: continue
if size > 1024 ** 2:
print("Skipping `%s' because it's over 1MiB" % f)
continue
try:
data = open(f, 'rb').read()
except (OSError, PermissionError) as exc:
print("Error: Unable to read `%s': %s" % (f, exc))
continue
if is_binary(data):
print("Skipping `%s' because it looks binary" % f)
continue
data = data.split(b'\n')
fixed_indent = False
for i, line in enumerate(data):
# Fix indentation
repl_count = 0
while line.startswith(indent_find):
fixed_indent = True
repl_count += 1
line = line.replace(indent_find, b'', 1)
if repl_count > 0:
line = indent_replace * repl_count + line
data = list(filter(lambda x: x is not None, data))
try:
open(f, 'wb').write(b'\n'.join(data))
except (OSError, PermissionError) as exc:
print("Error: Unable to write to `%s': %s" % (f, exc))
if __name__ == '__main__':
allfiles = []
for root, dirs, files in os.walk(os.getcwd()):
for f in files:
p = '%s/%s' % (root, f)
if do_add:
allfiles.append(p)
run(allfiles)
Converting tabs to space in just in ".lua" files [tabs -> 2 spaces]
find . -iname "*.lua" -exec sed -i "s#\t# #g" '{}' \;
Try the command line tool expand.
expand -i -t 4 input | sponge output
where
-i
is used to expand only leading tabs on each line;-t 4
means that each tab will be converted to 4 whitespace chars (8 by default).Finally, you can use gexpand
on OSX, after installing coreutils
with Homebrew (brew install coreutils
).