How can I convert tabs to spaces in every file of a directory?

既然无缘 2020-12-02 03:48

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

Also, is there a way of setting the number of spaces per tab?

19 answers
  • 2020-12-02 03:59

    Warning: This will break your repo.

    This will corrupt binary files, including those under svn, .git! Read the comments before using!

    find . -iname '*.java' -type f -exec sed -i.orig 's/\t/    /g' {} +

    The original file is saved as [filename].orig.

    Replace '*.java' with the file ending of the file type you are looking for. This way you can prevent accidental corruption of binary files.

    Downsides:

    • Will replace tabs everywhere in a file.
    • Will take a long time if you happen to have a 5GB SQL dump in this directory.
  • 2020-12-02 04:01

    Nobody mentioned rpl? Using rpl you can replace any string. To convert tabs to spaces:

    rpl -R -e "\t" "    "  .
    

    Very simple.

  • 2020-12-02 04:05

    Collecting the best comments from Gene's answer: the best solution by far is to use sponge from moreutils.

    sudo apt-get install moreutils
    # The complete one-liner:
    find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;
    

    Explanation:

    • ./ starts the recursive search from the current directory
    • -iname is a case-insensitive match (so both *.java and *.JAVA match)
    • -type f matches only regular files (no directories or symlinks); binary files are regular files too, so it is the -iname pattern that keeps them out
    • -exec bash -c runs the quoted command in a separate bash process for each file name; {} is passed in as $0
    • expand -t 4 expands all TABs to spaces, with tab stops every 4 columns
    • sponge soaks up standard input (from expand) and writes it to a file (the same one)*.

    NOTE: * A simple file redirection (> "$0") won't work here because the shell truncates the file before expand has read it.

    Advantage: All original file permissions are retained and there are no intermediate tmp files to manage.
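
    To make the sponge idea concrete, here is a minimal Python sketch of the same "read everything first, then write back" pattern. The function name expand_in_place and the UTF-8 assumption are mine, not part of the answer above; treat it as an illustration rather than a replacement for expand and sponge.

```python
def expand_in_place(path, tabsize=4):
    """Expand tabs in `path`; return True if the file was changed.

    Assumes the file is UTF-8 text (an assumption of this sketch).
    """
    # Soak up the whole file before touching it, as sponge does.
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as src:
        original = src.read()

    # str.expandtabs resets its column counter after every \n or \r,
    # so tab stops are computed per line, like `expand -t`.
    expanded = original.expandtabs(tabsize)
    changed = expanded != original

    if changed:
        # Only now is the file reopened for writing (and truncated).
        # Writing into the existing file keeps its permission bits.
        with open(path, "w", encoding="utf-8", newline="") as dst:
            dst.write(expanded)
    return changed
```

    Opening the file for writing only after the read has finished is the whole trick: a plain shell redirection performs those two steps in the opposite order.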

  • 2020-12-02 04:06

    How can I convert tabs to spaces in every file of a directory (possibly recursively)?

    This is usually not what you want.

    Do you want to do this for png images? PDF files? The .git directory? Your Makefile (which requires tabs)? A 5GB SQL dump?

    You could, in theory, pass a whole lot of exclude options to find or whatever else you're using; but this is fragile, and will break as soon as you add other binary files.

    What you want is, at least:

    1. Skip files over a certain size.
    2. Detect if a file is binary by checking for the presence of a NUL byte.
    3. Only replace tabs at the start of a line (expand -i does this, a plain sed substitution doesn't).

    As far as I know, there is no "standard" Unix utility that can do this, and it's not very easy to do with a shell one-liner, so a script is needed.

    A while ago I created a little script called sanitize_files which does exactly that. It also fixes some other common stuff like replacing \r\n with \n, adding a trailing \n, etc.

    You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the above script as it's more likely to receive bugfixes and other updates than this post.

    I would also like to point out, in response to some of the other answers here, that using shell globbing is not a robust way of doing this, because sooner or later you'll end up with more files than will fit in ARG_MAX (on older Linux kernels it was 128 KiB; on current ones it is typically about 2 MiB, a quarter of the stack size limit, which may seem a lot, but it is still a hard ceiling).


    #!/usr/bin/env python3
    #
    # http://code.arp242.net/sanitize_files
    #
    
    import os
    
    # Number of spaces that replace each leading tab
    indent_width = 4
    
    
    def is_binary(data):
        # Treat any file that contains a NUL byte as binary
        return data.find(b'\000') >= 0
    
    
    def should_ignore(path):
        keep = [
            # VCS systems
            '.git/', '.hg/', '.svn/', 'CVS/',
    
            # These files have significant whitespace/tabs, and cannot be edited
            # safely
            # TODO: there are probably more of these files..
            'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
        ]
    
        for k in keep:
            if '/%s' % k in path:
                return True
        return False
    
    
    def run(files):
        indent_find = b'\t'
        indent_replace = b' ' * indent_width
    
        for f in files:
            if should_ignore(f):
                print('Ignoring %s' % f)
                continue
    
            try:
                size = os.stat(f).st_size
            # Unresolvable symlink, just ignore those
            except FileNotFoundError as exc:
                print('%s is unresolvable, skipping (%s)' % (f, exc))
                continue
    
            if size == 0: continue
            if size > 1024 ** 2:
                print("Skipping `%s' because it's over 1MiB" % f)
                continue
    
            try:
                with open(f, 'rb') as fp:
                    data = fp.read()
            except OSError as exc:
                print("Error: Unable to read `%s': %s" % (f, exc))
                continue
    
            if is_binary(data):
                print("Skipping `%s' because it looks binary" % f)
                continue
    
            data = data.split(b'\n')
    
            fixed_indent = False
            for i, line in enumerate(data):
                # Fix indentation: strip leading tabs one at a time and count them
                repl_count = 0
                while line.startswith(indent_find):
                    fixed_indent = True
                    repl_count += 1
                    line = line.replace(indent_find, b'', 1)
    
                if repl_count > 0:
                    # Store the result back; rebinding `line` alone changes nothing
                    data[i] = indent_replace * repl_count + line
    
            # No leading tabs found, so leave the file untouched
            if not fixed_indent:
                continue
    
            try:
                with open(f, 'wb') as fp:
                    fp.write(b'\n'.join(data))
            except OSError as exc:
                print("Error: Unable to write to `%s': %s" % (f, exc))
    
    
    if __name__ == '__main__':
        allfiles = []
        for root, dirs, files in os.walk(os.getcwd()):
            for f in files:
                allfiles.append('%s/%s' % (root, f))
    
        run(allfiles)
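
    As an aside on the ARG_MAX point made before the script: if you do need to hand file names to an external command, the usual fix is to split them into batches that each fit under the limit. The sketch below is a separate illustration, not part of sanitize_files; the name batches and its limit and overhead parameters are hypothetical, and the byte count is only an approximation of how the kernel measures arguments.

```python
import os


def batches(paths, limit, overhead=0):
    """Yield lists of paths whose combined size stays within `limit` bytes.

    Each path costs its encoded length plus one byte for the terminating
    NUL. `overhead` reserves room for the command name and environment.
    A single path larger than `limit` is still yielded on its own.
    """
    batch, used = [], overhead
    for p in paths:
        cost = len(os.fsencode(p)) + 1
        # Start a new batch when this path would push us over the limit.
        if batch and used + cost > limit:
            yield batch
            batch, used = [], overhead
        batch.append(p)
        used += cost
    if batch:
        yield batch
```

    On Unix a starting value for limit can be read with os.sysconf("SC_ARG_MAX"). This batching is also what find -exec ... {} + and xargs do for you, which is why they are more robust than a shell glob.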
    
  • 2020-12-02 04:08

    Converting tabs to spaces in just ".lua" files (tabs -> 2 spaces):

    find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
    
  • 2020-12-02 04:09

    Try the command line tool expand.

    expand -i -t 4 input | sponge output
    

    where

    • -i expands only the leading tabs on each line;
    • -t 4 sets tab stops every 4 columns, so each tab advances to the next multiple of 4 (the default is 8);
    • sponge is from the moreutils package, and avoids truncating the input file when input and output are the same file.

    Finally, you can use gexpand on OSX, after installing coreutils with Homebrew (brew install coreutils).
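
    The difference between tab stops and a fixed replacement, and what "leading tabs only" means, can be seen in a few lines of Python. This is only a sketch of the behaviour of -t and -i; the helper name expand_leading is made up for the example.

```python
def expand_leading(line, tabsize=4):
    """Expand tabs in the leading blanks of `line` only, like `expand -i`."""
    body = line.lstrip(" \t")                 # text after the indentation
    prefix = line[: len(line) - len(body)]    # the indentation itself
    # The prefix starts at column 0, so expandtabs() lands on real tab stops.
    return prefix.expandtabs(tabsize) + body


# A tab advances to the next tab stop; it is not always `tabsize` spaces.
print(repr("a\tb".expandtabs(4)))              # 'a   b' (three spaces)

# Leading tabs are expanded, the tab between the words is left alone.
print(repr(expand_leading("\tfoo\tbar", 4)))   # '    foo\tbar'

# Two spaces followed by a tab still end at column 4.
print(repr(expand_leading("  \tx", 4)))        # '    x'
```

    A sed substitution such as s/\t/    /g always inserts exactly four spaces per tab, so it matches expand only when every tab already sits on a tab stop.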
