I have a large CVS repository containing files encoded in ISO-8859-1 and want to convert it to git.
I could configure git to use ISO-8859-1 as the encoding, but I would like to have the files in UTF-8.
With tools such as iconv or recode I can convert the encoding of the files in my working tree, and I could commit the result with a message like "converted encoding".
My question is whether it is possible to convert the complete history, either during the conversion from CVS to git or afterwards. My idea would be to write a script that reads each commit in the git repository, converts its files to UTF-8, and commits the result to a new git repository.
Is this possible? (I am unsure about the hash codes and about how to walk through the commits, branches and tags.) Or is there a tool that can handle something like this?
You can do this with git filter-branch. The idea is to change the encoding of the files in every commit, rewriting each commit as you go. Because the file contents change, every rewritten commit gets a new hash.
First, write a script that changes the encoding of every file in the repository. It could look like this:
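#!/bin/sh
# Recode every file in the working tree from ISO-8859-1 to UTF-8.
# Caution: binary files are rewritten too, so narrow the find if the repository has any.
find . -type f -print | while IFS= read -r f; do
    # Convert into a temporary file, then overwrite in place so the file mode survives.
    if iconv -f iso-8859-1 -t utf-8 < "$f" > "$f.recode.$$"; then
        cat "$f.recode.$$" > "$f"
    fi
    rm -f "$f.recode.$$"
done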
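If the repository also holds binary files, or files that are already UTF-8, a more cautious variant may be safer. The following is only a sketch under stated assumptions: the function names (is_binary, is_utf8, recode_tree) are made up for illustration, "binary" is guessed from the presence of a NUL byte, and a file that already decodes as UTF-8 is left alone, which also makes the script safe to run twice. It assumes mktemp is available.

```shell
#!/bin/sh
# Sketch: recode ISO-8859-1 text files to UTF-8, skipping likely-binary
# files and files that already decode as UTF-8. All names are hypothetical.

# is_binary FILE: heuristic, true if FILE contains a NUL byte.
is_binary() {
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# is_utf8 FILE: true if FILE already decodes cleanly as UTF-8
# (pure ASCII files count as UTF-8 too).
is_utf8() {
    iconv -f utf-8 -t utf-8 < "$1" > /dev/null 2>&1
}

# recode_tree DIR: convert every eligible file below DIR in place.
recode_tree() {
    tmp=$(mktemp) || return 1   # scratch file kept outside the tree
    find "$1" -type f -print | while IFS= read -r f; do
        if is_binary "$f" || is_utf8 "$f"; then
            continue
        fi
        if iconv -f iso-8859-1 -t utf-8 < "$f" > "$tmp"; then
            cat "$tmp" > "$f"   # overwrite in place to keep the file mode
        fi
    done
    rm -f "$tmp"
}

# When called with a directory argument, convert that directory.
if [ "$#" -gt 0 ]; then
    recode_tree "$1"
fi
```

Saved as, say, /tmp/recode-text-files (a hypothetical name) and made executable, it would be passed to the tree filter together with its argument, for example --tree-filter '/tmp/recode-text-files .'. The UTF-8 check can misjudge an ISO-8859-1 file whose bytes happen to form valid UTF-8, so it is worth spot-checking the result.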
Then use git filter-branch to run this script once per commit:
git filter-branch --tree-filter /tmp/recode-all-files HEAD
where /tmp/recode-all-files is the above script, saved with the executable bit set.
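After the rewrite it can be worth confirming that nothing was missed. A minimal sketch, assuming a checked-out working tree: the helper name list_non_utf8 is hypothetical, and it simply prints every file that iconv cannot decode as UTF-8 while skipping the .git directory. Binary files will show up in the list as well, so the output needs a human look.

```shell
#!/bin/sh
# Sketch: report files that are not valid UTF-8. The name is hypothetical.

# list_non_utf8 DIR: print each file below DIR that fails to decode as UTF-8.
list_non_utf8() {
    find "$1" -name .git -prune -o -type f -print | while IFS= read -r f; do
        iconv -f utf-8 -t utf-8 < "$f" > /dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

Running list_non_utf8 . at the top of the checked-out tree should print nothing for a repository that contains only text files and was fully converted.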
Right after the repository is freshly converted from CVS, you probably have just one branch in git with a linear history back to the beginning. If you have several branches, you need to extend the git filter-branch command so that it edits the commits on all of them, for example by passing -- --all instead of HEAD.
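For a repository with several branches and tags, the invocation might look like the sketch below. The wrapper name rewrite_all_refs is hypothetical; the options themselves are standard git filter-branch options: -- --all selects every ref, and --tag-name-filter cat keeps each tag name while pointing it at the rewritten commit. It assumes you run it from the top level of the repository and that the script path contains no spaces, since the filter is evaluated by the shell.

```shell
#!/bin/sh
# Sketch: apply a tree filter to every branch and tag. The name is hypothetical.

# rewrite_all_refs SCRIPT: run SCRIPT once per commit across all refs.
rewrite_all_refs() {
    git filter-branch --tree-filter "$1" --tag-name-filter cat -- --all
}
```

For example, rewrite_all_refs /tmp/recode-all-files. git keeps the pre-rewrite refs under refs/original/, so the old history remains reachable until you delete those.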
Source: https://stackoverflow.com/questions/11052199/convert-git-repository-file-encoding