How do I create a readable diff of two spreadsheets using git diff?

醉话见心 2020-12-04 04:27

We have a lot of spreadsheets (xls) in our source code repository. These are usually edited with Gnumeric or OpenOffice.org, and are mostly used to populate databases for u

21 Answers
  • 2020-12-04 05:11

    I would use the SYLK file format if performing diffs is important. It is a text-based format, which should make the comparisons easier and more compact than with a binary format. It is also compatible with Excel, Gnumeric, and OpenOffice.org, so all three tools should be able to work well together. See the SYLK Wikipedia article.
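To illustrate why a text-based format diffs well, here is a minimal sketch that compares two small SYLK documents line by line with Python's standard `difflib`. The sample content and the `diff_text` helper are made up for illustration; real exports from Excel, Gnumeric, or OpenOffice.org usually contain additional formatting records.

```python
import difflib

# Hypothetical SYLK sample: one record per line, C records hold cell values.
OLD = '''ID;P
C;Y1;X1;K"Item"
C;Y1;X2;K"Qty"
C;Y2;X1;K"Widget"
C;Y2;X2;K10
E
'''
# Same sheet with the quantity in row 2, column 2 changed.
NEW = OLD.replace("K10", "K12")


def diff_text(old, new):
    """Return unified-diff lines between two text documents."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        "old.slk", "new.slk", lineterm=""))


if __name__ == "__main__":
    print("\n".join(diff_text(OLD, NEW)))
```

Because each cell is a separate line, a single changed value shows up as one removed line and one added line, which is the same kind of output `git diff` would produce for such a file.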

  • 2020-12-04 05:13

    I know several responses have suggested exporting the file to CSV or some other text format and then comparing the results. I haven't seen it mentioned specifically, but Beyond Compare 3 supports a number of additional file formats (see Additional File Formats). Using one of the Microsoft Excel file formats, you can easily compare two Excel files without first exporting them to another format.

  • 2020-12-04 05:14

    I'm the co-author of a free, open-source Git extension:

    https://github.com/ZoomerAnalytics/git-xltrail

    It makes Git work with any Excel workbook file format without any workarounds.

  • 2020-12-04 05:14

    I don't know of any tools, but two roll-your-own solutions come to mind, both of which require Excel:

    1. You could write some VBA code that steps through each Worksheet, Row, Column and Cell of the two Workbooks, reporting differences.

    2. If you use Excel 2007, you could save the Workbooks as Open-XML (*.xlsx) format, extract the XML and diff that. The Open-XML file is essentially just a .zip file of .xml files and manifests.

    You'll end up with a lot of "noise" in either case if your spreadsheets aren't structurally "close" to begin with.
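As a rough sketch of the second option, the following Python code opens two .xlsx files as zip archives, pretty-prints each XML part, and produces a unified diff per part. It uses only the standard library; the function names `xml_parts` and `diff_xlsx` are hypothetical, and this is one possible approach rather than an established tool.

```python
import difflib
import zipfile
from xml.dom import minidom


def xml_parts(path):
    """Return {member name: pretty-printed XML lines} for each .xml part."""
    parts = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.endswith(".xml"):
                # Pretty-print so that a change affects few lines of the diff.
                document = minidom.parseString(archive.read(name))
                parts[name] = document.toprettyxml(indent="  ").splitlines()
    return parts


def diff_xlsx(old_path, new_path):
    """Return unified-diff lines covering every XML part in either workbook."""
    old, new = xml_parts(old_path), xml_parts(new_path)
    lines = []
    for name in sorted(set(old) | set(new)):
        lines.extend(difflib.unified_diff(
            old.get(name, []), new.get(name, []),
            fromfile=f"{old_path}:{name}", tofile=f"{new_path}:{name}",
            lineterm=""))
    return lines
```

Calling `diff_xlsx("old.xlsx", "new.xlsx")` (file names are placeholders) returns a list of diff lines that can be printed or written to a file. Expect the noise mentioned above: the XML stores cell references, styles, and shared strings in separate parts, so a small edit can touch several of them.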

  • 2020-12-04 05:16

    We faced the exact same issue at our company. Our tests output Excel workbooks, and a binary diff was not an option, so we wrote our own simple command-line tool. Check out the ExcelCompare project. In fact, it allows us to automate our tests quite nicely. Patches and feature requests are welcome!

  • 2020-12-04 05:17

    I've done a lot of comparing of Excel workbooks in the past. My technique works very well for workbooks with many worksheets, but it only compares cell contents, not cell formatting, macros, etc. Also, there's some coding involved but it's well worth it if you have to compare a lot of large files repeatedly. Here's how it works:

    A) Write a simple dump program that steps through all worksheets and saves all data to tab-separated files. Create one file per worksheet (use the worksheet name as the filename, e.g. "MyWorksheet.tsv"), and create a new folder for these files each time you run the program. Name the folder after the excel filename and add a timestamp, e.g. "20080922-065412-MyExcelFile". I did this in Java using a library called JExcelAPI. It's really quite easy.

    B) Add a Windows shell extension to run your new Java program from step A when right-clicking on an Excel file. This makes it very easy to run this program. You need to Google how to do this, but it's as easy as writing a *.reg file.

    C) Get BeyondCompare. It has a very cool feature that compares delimited data by showing it in a nice table.

    D) You're now ready to compare Excel files with ease. Right-click on Excel file 1 and run your dump program. It will create a folder with one file per worksheet. Right-click on Excel file 2 and run your dump program. It will create a second folder with one file per worksheet. Now use BeyondCompare (BC) to compare the folders. Each file represents a worksheet, so if there are differences in a worksheet BC will show this and you can drill down and do a file comparison. BC will show the comparison in a nice table layout, and you can hide rows and columns you're not interested in.
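Steps A and D can be sketched in Python with the standard library. This is not the Java/JExcelAPI program described above: reading the workbook is left out, and `dump_workbook` assumes the worksheets have already been loaded into a mapping of worksheet name to rows (a hypothetical input shape). `compare_dumps` stands in for the folder comparison when BeyondCompare is not available. All function names are illustrative.

```python
import csv
import difflib
from datetime import datetime
from pathlib import Path


def dump_workbook(sheets, excel_name, out_root, now=None):
    """Write one <sheet name>.tsv per worksheet into a timestamped folder.

    `sheets` maps worksheet names to rows of cell values (assumed input).
    Returns the folder, named like "20080922-065412-MyExcelFile".
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    folder = Path(out_root) / f"{stamp}-{excel_name}"
    folder.mkdir(parents=True)
    for name, rows in sheets.items():
        # Sheet names containing path separators would need sanitizing first.
        with open(folder / f"{name}.tsv", "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerows(rows)
    return folder


def read_lines(path):
    """Return the file's lines, or an empty list if the worksheet is missing."""
    return path.read_text(encoding="utf-8").splitlines() if path.exists() else []


def compare_dumps(old_folder, new_folder):
    """Return {tsv file name: unified-diff lines} for worksheets that differ."""
    old_folder, new_folder = Path(old_folder), Path(new_folder)
    names = {p.name for p in old_folder.glob("*.tsv")}
    names |= {p.name for p in new_folder.glob("*.tsv")}
    changed = {}
    for name in sorted(names):
        diff = list(difflib.unified_diff(
            read_lines(old_folder / name), read_lines(new_folder / name),
            f"old/{name}", f"new/{name}", lineterm=""))
        if diff:
            changed[name] = diff
    return changed
```

Because every worksheet becomes a plain tab-separated file, the two dump folders can also be compared with any text diff tool, not just the function above.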
