How to perform better document version control on Excel files and SQL schema files

执念已碎 2020-11-27 09:29

I am in charge of several Excel files and SQL schema files. How should I perform better document version control on these files?

I need to know the part modified (di

9 answers
  • 2020-11-27 09:59

    Since you've tagged your question with git I assume you are asking about Git usage for this.

    SQL dumps are plain text files, so it makes perfect sense to track them with Git. Create a repository and store them in it. When you get a new version of a file, overwrite the old one and commit. Git records the change, so you can see when each modification was made, check out specific versions of the file, and compare any two versions.
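
    As a rough illustration of that overwrite-and-commit workflow, here is a minimal Python sketch that drives the `git` command line. It assumes `git` is installed and on the PATH. The helper names, the file name `schema.sql`, and the placeholder identity are all made up for this example.

```python
import subprocess
from pathlib import Path

# Placeholder identity (an assumption for this sketch) so the example
# does not depend on any global Git configuration.
IDENTITY = ["-c", "user.name=Example", "-c", "user.email=example@example.com"]


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *IDENTITY, *args],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


def commit_new_version(repo: Path, filename: str, content: str, message: str) -> None:
    """Overwrite `filename` with the new dump and record it as a commit."""
    (repo / filename).write_text(content, encoding="utf-8")
    git(repo, "add", "--", filename)
    git(repo, "commit", "-m", message)


def diff_last_two_versions(repo: Path, filename: str) -> str:
    """Return the textual diff of `filename` between the two latest commits."""
    return git(repo, "diff", "HEAD~1", "HEAD", "--", filename)
```

    After `git(repo, "init")` and two calls to `commit_new_version`, `diff_last_two_versions(repo, "schema.sql")` returns the changed lines, and `git(repo, "log", "--", "schema.sql")` lists the modification history.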

    .xlsx files can be tracked the same way if you decompress them first. An .xlsx file is a zipped directory of XML files (see How to properly assemble a valid xlsx file from its internal sub-components?). Git treats the archive as binary unless it is decompressed, but you can unzip the .xlsx and track the changes to the individual XML files inside it.
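
    One way to do the unzipping step is sketched below in Python, using only the standard library. The function name `unpack_xlsx` is made up for this example. The pretty-printing step is an addition of this sketch: it assumes the XML parts contain few line breaks, which would otherwise make line-based diffs hard to read. The re-indented files are meant for diffing only, not for zipping back into a workbook.

```python
import zipfile
from pathlib import Path
from xml.dom import minidom


def unpack_xlsx(xlsx_path, dest_dir):
    """Extract an .xlsx archive into dest_dir and re-indent its XML parts.

    Returns the sorted list of extracted member names.
    """
    names = []
    with zipfile.ZipFile(xlsx_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # extract() sanitizes the member path and returns where it wrote.
            target = Path(archive.extract(info, dest_dir))
            names.append(info.filename)
            if target.suffix in {".xml", ".rels"}:
                # One element per line gives Git something useful to diff.
                pretty = minidom.parseString(target.read_bytes()).toprettyxml(indent="  ")
                target.write_text(pretty, encoding="utf-8")
    return sorted(names)
```

    Running this on each new version of a workbook and committing the resulting directory lets `git diff` show which XML parts changed.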

    You can also track .xls files in Git, but the .xls format is binary, so you can't get meaningful diffs from it. You will still be able to see the modification history and check out specific versions.

  • 2020-11-27 10:07

    My approach with Excel files is similar to Jon's, but instead of working with the raw Excel text data I export to more friendly formats.

    Here is the tool that I use: https://github.com/stenci/ExcelToGit/tree/master

    All you need to do is download the .xlsm file (click the View Raw link on this page). Don't forget to check the Excel setting described in the readme. You can also add code to export SQL data to text files.

    The workbook is both a converter from binary Excel files to text files and a launcher for the Windows Git tools, and it can also be used with projects that have nothing to do with Excel.

    My working version is configured with dozens of Excel workbooks. I also use the file to open Git GUI for non-Excel projects, simply adding the Git folder by hand.

  • 2020-11-27 10:08

    The answer I have written here can be applied in this case. A tool called xls2txt can provide human-readable output from .xls files. In short, put this in your .gitattributes file:

    *.xls diff=xls
    

    And in the .git/config:

    [diff "xls"]
        binary = true
        textconv = /path/to/xls2txt
    

    You can find similar tools for other file types as well, which makes git diff a very useful tool for office documents. This is what I currently have in my global .gitconfig:

    [diff "xls"]
        binary = true
        textconv = /usr/bin/py_xls2txt
    [diff "pdf"]
        binary = true
        textconv = /usr/bin/pdf2txt
    [diff "doc"]
        binary = true
        textconv = /usr/bin/catdoc
    [diff "docx"]
        binary = true
        textconv = /usr/bin/docx2txt
    

    The Pro Git book has a good chapter on the subject: 8.2 Customizing Git - Git Attributes
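
    If no ready-made converter is at hand for a format, a textconv program is not hard to write: Git runs it with the file's path as its only argument and diffs whatever text it prints to standard output. The Python sketch below does this for .xlsx files with the standard library only. It is an illustration, not a replacement for xls2txt. The script name `xlsx2txt.py` and the driver name `xlsx` are made up, it labels sheets by their internal file names rather than their tab names, and it skips inline strings.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the main SpreadsheetML parts of an .xlsx file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def shared_strings(archive):
    """Return the workbook's shared string table as a list (may be empty)."""
    try:
        root = ET.fromstring(archive.read("xl/sharedStrings.xml"))
    except KeyError:  # workbook without any shared strings
        return []
    return [
        "".join(t.text or "" for t in si.iterfind(".//m:t", NS))
        for si in root.iterfind("m:si", NS)
    ]


def xlsx_to_text(path):
    """Render each non-empty cell as 'sheet-part<TAB>cell<TAB>content'."""
    lines = []
    with zipfile.ZipFile(path) as archive:
        strings = shared_strings(archive)
        sheets = sorted(
            name for name in archive.namelist()
            if name.startswith("xl/worksheets/") and name.endswith(".xml")
        )
        for sheet in sheets:
            root = ET.fromstring(archive.read(sheet))
            for cell in root.iterfind(".//m:c", NS):
                formula = cell.find("m:f", NS)
                value = cell.find("m:v", NS)
                if formula is not None and formula.text:
                    text = "=" + formula.text  # show the formula, not its cached result
                elif value is not None and value.text is not None:
                    text = value.text
                    if cell.get("t") == "s":  # value is an index into the string table
                        text = strings[int(text)]
                else:
                    continue  # empty cell or inline string (not handled here)
                lines.append(f"{sheet}\t{cell.get('r')}\t{text}")
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) == 2:
    print(xlsx_to_text(sys.argv[1]))
```

    To wire it up in the same way as the drivers above, add a line such as `*.xlsx diff=xlsx` to .gitattributes and a `[diff "xlsx"]` section whose textconv is `python /path/to/xlsx2txt.py`; both names are placeholders.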
