We have a lot of spreadsheets (xls) in our source code repository. These are usually edited with gnumeric or openoffice.org, and are mostly used to populate databases for u
I found an openoffice macro here that will invoke openoffice's compare documents function on two files. Unfortunately, openoffice's spreadsheet compare seems a little flaky; I just had the 'Reject All' button insert a superfluous column in my document.
I had the same problem, so I decided to write a small tool to help me out. Please check ExcelDiff_Tools. It comes with several key points:
If you have TortoiseSVN, you can Ctrl-click the two files to select them in Windows Explorer and then right-click, TortoiseSVN->Diff.
This works particularly well if you are looking for a small change in a large data set.
Do you use TortoiseSVN for your commits and updates in Subversion? It has a diff tool, although comparing Excel files with it is still not really user friendly. In my environment (Win XP, Office 2007), it opens the two Excel files for side-by-side comparison.
Right-click the document > TortoiseSVN > Show Log > select a revision > right-click for "Compare with working copy".
Use Altova DiffDog
Use DiffDog's XML diff mode and Grid View to review the differences in an easy-to-read tabular format. Text diffing is MUCH HARDER for spreadsheets of any complexity. With this tool, at least two methods are viable, depending on the circumstances.
Save As .xml
To detect the differences of a simple, one sheet spreadsheet, save the Excel spreadsheets to compare as XML Spreadsheet 2003 with a .xml extension.
Save As .xlsx
To detect the differences of most spreadsheets in a modularized document model, save the Excel spreadsheets to compare as an Excel Workbook in .xlsx form. Open the files to diff with DiffDog. It informs you that the file is a ZIP archive, and asks if you want to open it for directory comparison. Once you agree to directory comparison, it becomes a relatively simple matter of double-clicking logical parts of the document to diff them (with the XML diff mode). Most parts of the .xlsx document are XML-formatted data. The Grid View is extremely useful. It is trivial to diff individual sheets to focus the analysis on areas that are known to have changed.
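The same "directory comparison" idea can be roughed out without DiffDog, since an .xlsx file is an ordinary ZIP archive. The following is only a minimal sketch using Python's standard zipfile module; the function name changed_parts is made up for illustration, and it only reports which parts differ, not how.

```python
import zipfile


def changed_parts(old_xlsx, new_xlsx):
    """Return the sorted names of archive parts that differ between two .xlsx files.

    Each argument may be a path or a file-like object, as accepted by
    zipfile.ZipFile. (changed_parts is a hypothetical helper name.)
    """
    with zipfile.ZipFile(old_xlsx) as old, zipfile.ZipFile(new_xlsx) as new:
        old_names = set(old.namelist())
        new_names = set(new.namelist())
        changed = []
        for name in sorted(old_names | new_names):
            if name not in old_names or name not in new_names:
                changed.append(name)  # part was added or removed
            elif old.read(name) != new.read(name):
                changed.append(name)  # part exists in both, bytes differ
        return changed
```

Something like changed_parts("old.xlsx", "new.xlsx") would then narrow the review down to the sheet parts worth opening in an XML diff tool.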
Excel's propensity to tweak certain attribute names with every save is annoying, but DiffDog's XML diffing capabilities include the ability to filter certain kinds of differences. For example, Excel spreadsheets in XML form contain 'row' and 'c' elements that have 's' attributes (style) that rename with every save. Setting up a filter like 'c:s' makes it much easier to view only content changes.
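For readers without DiffDog, the effect of that filter can be approximated in a few lines. This is a rough sketch, not DiffDog's behaviour: it assumes the sheet XML comes from an .xlsx part (SpreadsheetML main namespace), and the helper names cell_values and content_changes are invented here. It compares only the raw value of each cell and never looks at the 's' attribute.

```python
import xml.etree.ElementTree as ET

# Namespace used by worksheet parts inside an .xlsx archive.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def cell_values(sheet_xml):
    """Map each cell reference (e.g. 'A1') to the raw text of its <v> child.

    The 's' (style) attribute on <c> is simply never read, so style
    churn between saves cannot show up as a difference.
    """
    root = ET.fromstring(sheet_xml)
    cells = {}
    for c in root.iter(NS + "c"):
        v = c.find(NS + "v")
        cells[c.get("r")] = v.text if v is not None else None
    return cells


def content_changes(old_xml, new_xml):
    """Return {cell_ref: (old_value, new_value)} for cells whose value changed."""
    old, new = cell_values(old_xml), cell_values(new_xml)
    return {
        ref: (old.get(ref), new.get(ref))
        for ref in sorted(old.keys() | new.keys())
        if old.get(ref) != new.get(ref)
    }
```

Note that for string cells the raw value is an index into the shared strings part, so this sketch is most useful for numeric data or as a first pass.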
DiffDog has a lot of diffing capability. I've listed only the XML diff modes, simply because I haven't used another tool that I liked better for differencing Excel documents.
Convert to CSV, check the result into a version control system, and then diff with an advanced version control diff tool. When I used Perforce it had a great diff tool, but I forget the name of it.
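Once the sheets are exported as CSV (from the spreadsheet application itself, which is assumed here), any text diff will do. As a minimal sketch, Python's standard difflib can produce a unified diff of two CSV exports; the function name csv_diff and the default file labels are placeholders.

```python
import difflib


def csv_diff(old_text, new_text, old_name="old.csv", new_name="new.csv"):
    """Return a unified diff (as one string) between two CSV exports.

    The inputs are the full text of each CSV file; rows are compared
    line by line, so a changed cell shows up as a removed and an added row.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_name, tofile=new_name)
    )
```

Because CSV keeps one record per line, the diff stays readable as long as rows are not reordered between saves.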