Line endings messed up in Git - how to track changes from another branch after a huge line ending fix?

遇见更好的自我 2020-12-02 08:02

We are working with a 3rd party PHP engine that gets regular updates. The releases are kept on a separate branch in git, and our fork is the master branch.

This way

5 Answers
  • 2020-12-02 08:34

    I finally managed to solve it.

    The answer is:

    git filter-branch --tree-filter '~/Scripts/fix-line-endings.sh' -- --all
    

    fix-line-endings.sh contains:

    #!/bin/sh
    find . -type f \( -name '*.tpl' -o -name '*.php' -o -name '*.js' -o -name '*.css' -o -name '*.sh' -o -name '*.txt' -o -iname '*.html' \) -exec fromdos {} +
    

    After all line endings were fixed in all trees in all commits, I did an interactive rebase and removed all commits that were fixing line endings.

    Now my repo is clean and fresh, ready to be pushed :)

    Note to visitors: do not do this if your repo has been pushed / cloned because it will mess things up badly!
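The `fromdos` utility used above ships in a separate package and may not be installed everywhere. As a rough sketch, assuming only `awk` and `mktemp` are available, a stand-in could look like the following. The function name `strip_cr` is hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical stand-in for fromdos: convert CRLF line endings to LF in place.
# strip_cr is an illustrative name, not an existing tool.
strip_cr() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Drop one trailing CR from every line, then copy the cleaned text
        # back into the original file so its permissions stay untouched.
        awk '{ sub(/\r$/, ""); print }' "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

Note that `awk` always terminates the last line with a newline, so a file that lacked a final newline gains one.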

  • 2020-12-02 08:34

    We avoid this problem going forward as follows:

    1) Everyone uses an editor that strips trailing whitespace, and we save all files with LF line endings.

    2) If 1) fails (someone may accidentally save a file with CRLF endings), a pre-commit script checks for CR characters and rejects the commit:

    #!/bin/bash
    # (bash rather than sh: the $'\r$' quoting used below is not POSIX sh)
    #
    # An example hook script to verify what is about to be committed.
    # Called by git-commit with no arguments.  The hook should
    # exit with non-zero status after issuing an appropriate message if
    # it wants to stop the commit.
    #
    # To enable this hook, rename this file to "pre-commit" and set executable bit
    
    # original by Junio C Hamano
    
    # modified by Barnabas Debreceni to disallow CR characters in commits
    
    
    if git rev-parse --verify HEAD >/dev/null 2>&1
    then
        against=HEAD
    else
        # Initial commit: diff against an empty tree object
        against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
    fi
    
    crlf=0
    
    IFS="
    "
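    # Only added, copied, modified or renamed entries: deleted files have no blob to inspect
    for FILE in `git diff-index --cached --diff-filter=ACMR $against`
    do
        # Each line looks like ":<mode> <mode> <old sha> <new sha> <status><TAB><path>"
        fhash=`echo "$FILE" | cut -d' ' -f4`
        fname=`echo "$FILE" | cut -f2`
    
        if git show "$fhash" | grep -EUIlq $'\r$'
        then
            echo "$fname contains CRLF characters"
            crlf=1
        fi
    done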
    
    if [ $crlf -eq 1 ]
    then
        echo "Some files have CRLF line endings. Please convert them to LF and try committing again."
        exit 1
    fi
    
    exec git diff-index --check --cached $against --
    

    This script relies on GNU grep and works on Mac OS X. Test it before using it on other platforms (we had problems with Cygwin and with BSD grep).
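Two likely sources of such portability trouble are the `$'\r$'` quoting, which plain POSIX `sh` does not understand, and the GNU-specific `-U` flag. A minimal sketch of a more portable check, assuming only POSIX `printf` and `grep`, builds the carriage return with `printf` instead. The helper name `has_crlf` is made up for illustration.

```shell
#!/bin/sh
# Sketch: detect CRLF line endings on stdin using only POSIX features.
# has_crlf is a hypothetical helper name.
cr=$(printf '\r')   # a literal carriage return, without $'...' quoting

has_crlf() {
    # Succeeds (exit status 0) when at least one input line ends in CR.
    grep -q "${cr}\$"
}
```

In the hook, the test would then read `git show "$fhash" | has_crlf`. Unlike the `-I` flag in the original, this sketch does not skip binary files, so those would need to be filtered separately.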

    3) In case we find any whitespace errors, we use the following script on erroneous files:

    #!/usr/bin/env php
    <?php
    
        // Remove various whitespace errors and convert to LF from CRLF line endings
        // written by Barnabas Debreceni
        // licensed under the terms of WTFPL (http://en.wikipedia.org/wiki/WTFPL)
    
        // handle no args
        if( $argc <2 ) die( "nothing to do" );
    
    
        // blacklist
    
        $bl = array( 'smarty' . DIRECTORY_SEPARATOR . 'templates_c' . DIRECTORY_SEPARATOR . '.*' );
    
        // whitelist
    
        $wl = array(    '\.tpl', '\.php', '\.inc', '\.js', '\.css', '\.sh', '\.html', '\.txt', '\.htc', '\.afm',
                        '\.cfm', '\.cfc', '\.asp', '\.aspx', '\.ascx' ,'\.lasso', '\.py', '\.afp', '\.xml',
                        '\.htm', '\.sql', '\.as', '\.mxml', '\.ini', '\.yaml', '\.yml'  );
    
        // remove $argv[0]
        array_shift( $argv );
    
        // make file list
        $files = getFileList( $argv );
    
        // sort files
        sort( $files );
    
        // filter them for blacklist and whitelist entries
    
        $filtered = preg_grep( '#(' . implode( '|', $wl ) . ')$#', $files );
        $filtered = preg_grep( '#(' . implode( '|', $bl ) . ')$#', $filtered, PREG_GREP_INVERT );
    
        // fix whitespace errors
        fix_whitespace_errors( $filtered );
    
    
    
    
    
        ///////////////////////////////////////////////////////////////////////////////////////////////
        ///////////////////////////////////////////////////////////////////////////////////////////////
    
    
        // whitespace error fixer
        function fix_whitespace_errors( $files ) {
            foreach( $files as $file ) {
    
                // read in file
                $rawlines = file_get_contents( $file );
    
                // remove \r
                $lines = preg_replace( "/(\r\n)|(\n\r)/m", "\n", $rawlines );
                $lines = preg_replace( "/\r/m", "\n", $lines );
    
                // remove spaces from before tabs
                $lines = preg_replace( "/\040+\t/m", "\t", $lines );
    
                // remove spaces from line endings
                $lines = preg_replace( "/[\040\t]+$/m", "", $lines );
    
                // remove tabs from line endings
                $lines = preg_replace( "/\t+$/m", "", $lines );
    
                // remove EOF newlines
                $lines = preg_replace( "/\n+$/", "", $lines );
    
                // write file if changed and set old permissions
                // compare contents, not lengths: replacing a lone \r with \n keeps the length
                if( $lines !== $rawlines ){
    
                    $perms = fileperms( $file );
    
                    // Uncomment to save original files
    
                    //rename( $file, $file.".old" );
                    file_put_contents( $file, $lines);
                    chmod( $file, $perms );
                    echo "{$file}: FIXED\n";
                } else {
                    echo "{$file}: unchanged\n";
                }
    
            }
        }
    
        // get file list from argument array
        function getFileList( $argv ) {
            $files = array();
            foreach( $argv as $arg ) {
              // is a directory
                if( is_dir( $arg ) )  {
                    $files = array_merge( $files, getDirectoryTree( $arg ) );
                }
                // is a file
                if( is_file( $arg ) ) {
                    $files[] = $arg;
                }
            }
            return $files;
        }
    
        // recursively scan directory
        function getDirectoryTree( $outerDir ){
            // strip trailing separators (a backslash separator would break the old regex)
            $outerDir = rtrim( $outerDir, DIRECTORY_SEPARATOR );
            $dirs = array_diff( scandir( $outerDir ), array( ".", ".." ) );
            $dir_array = array();
            foreach( $dirs as $d ){
                if( is_dir( $outerDir . DIRECTORY_SEPARATOR . $d ) ) {
                    $otherdir = getDirectoryTree( $outerDir . DIRECTORY_SEPARATOR . $d );
                    $dir_array = array_merge( $dir_array, $otherdir );
                }
                else $dir_array[] = $outerDir . DIRECTORY_SEPARATOR . $d;
            }
            return $dir_array;
        }
    ?>
    
  • 2020-12-02 08:41

    One solution (not necessarily the best one) would be to use git-filter-branch to rewrite the history so that every commit uses the correct line endings. This should be a better solution than an interactive rebase, at least for a larger number of commits; merges may also be easier to deal with using git-filter-branch.

    That is, of course, assuming that the history was not published (the repository was not cloned).
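One rough way to check that assumption before rewriting is to ask Git whether any remote-tracking branch already contains the commit in question. The sketch below uses a hypothetical helper, `is_published`. It only reflects what the last fetch or push recorded, and it cannot see clones that other people made directly from your repository.

```shell
#!/bin/sh
# Sketch: report whether a commit is reachable from a remote-tracking branch.
# is_published is a hypothetical helper; run it inside the repository.
is_published() {
    commit=${1:-HEAD}
    # "git branch -r --contains" lists the remote-tracking branches that
    # contain the commit; empty output means none of them do.
    [ -n "$(git branch -r --contains "$commit" 2>/dev/null)" ]
}
```

For example, `is_published HEAD || echo "HEAD exists only locally"` prints the message when no remote-tracking branch contains the current commit.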

  • 2020-12-02 08:48

    Going forward, avoid this problem with the core.autocrlf setting, documented in git config --help:

    core.autocrlf

    If true, makes git convert CRLF at the end of lines in text files to LF when reading from the filesystem, and convert in reverse when writing to the filesystem. The variable can be set to input, in which case the conversion happens only while reading from the filesystem but files are written out with LF at the end of lines. A file is considered "text" (i.e. be subjected to the autocrlf mechanism) based on the file's crlf attribute, or if crlf is unspecified, based on the file's contents. See gitattributes.
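As a hedged illustration, the setting can be applied per repository, and a `.gitattributes` file can pin the policy for everyone who clones it. The sketch below assumes a Git version that understands `git -C` and the `text` and `eol` attributes (which superseded the `crlf` attribute mentioned in the quote). The function name `setup_lf_policy` and the choice of file patterns are illustrative only.

```shell
#!/bin/sh
# Sketch: set up LF normalization for the repository in directory $1.
# setup_lf_policy is a hypothetical name; adjust the patterns to your project.
# Note: this overwrites any existing .gitattributes file.
setup_lf_policy() {
    repo=${1:-.}
    # Convert CRLF to LF when committing, never convert on checkout
    # (this affects the local clone only).
    git -C "$repo" config core.autocrlf input || return 1
    # Shared policy that is versioned together with the code.
    printf '%s\n' \
        '* text=auto' \
        '*.php text eol=lf' \
        '*.tpl text eol=lf' \
        '*.js text eol=lf' \
        '*.css text eol=lf' > "$repo/.gitattributes"
}
```

Calling `setup_lf_policy .` from the top of a working tree applies both pieces; the `.gitattributes` file still has to be committed to take effect for other clones.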

  • 2020-12-02 08:52

    Did you look at git rebase?

    You will need to re-base the history of your repository, as follows:

    • commit the line terminator fixes
    • start the rebase
    • leave the third-party import commit first
    • apply the line terminator fixes
    • apply your other patches

    What you do need to understand though is that this will break all downstream repositories - those that are cloned from your parent repo. Ideally you will start from scratch with those.


    Update: sample usage:

    target=`git rev-list --max-count=4 HEAD | tail -n1`
    git rebase -i $target
    

    This starts a rebase session for the last 3 commits on a linear history; the commit stored in $target serves as the base and is not itself rewritten.
