Easiest way to compare two Excel files in Java?

面向向阳花 2021-02-02 14:23

I'm writing a JUnit test for some code that produces an Excel file (which is binary). I have another Excel file that contains my expected output. What's the easiest way to com

11 Answers
  • 2021-02-02 14:28

    The easiest way I find is to use Tika. I use it like this:

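    import static org.junit.Assert.assertEquals; // JUnit 4
    import java.io.File;
    import java.io.IOException;
    import org.apache.tika.Tika;
    import org.apache.tika.exception.TikaException;
    // Extracts the text content of both workbooks and compares it as strings.
    private void compareXlsx(File expected, File result) throws IOException, TikaException {
         Tika tika = new Tika();
         String expectedText = tika.parseToString(expected);
         String resultText = tika.parseToString(result);
         assertEquals(expectedText, resultText);
    }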
    
    
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-parsers</artifactId>
        <version>1.13</version>
        <scope>test</scope>
    </dependency>
    
  • 2021-02-02 14:29

    You could use javaxdelta to check whether the two files are the same. It's available from here:

    http://javaxdelta.sourceforge.net/

  • 2021-02-02 14:31

    You can use Beyond Compare 3, which can be started from the command line and supports several ways of comparing Excel files, including:

    • Comparing Excel sheets as database tables
    • Checking all textual content
    • Checking textual content with some formatting
  • 2021-02-02 14:32

    Here's what I ended up doing (with the heavy lifting being done by DBUnit):

    /**
     * Compares the data in the two Excel files represented by the given input
     * streams, closing them on completion
     * 
     * @param expected can't be <code>null</code>
     * @param actual can't be <code>null</code>
     * @throws Exception
     */
    private void compareExcelFiles(InputStream expected, InputStream actual)
      throws Exception
    {
      try {
        Assertion.assertEquals(new XlsDataSet(expected), new XlsDataSet(actual));
      }
      finally {
        IOUtils.closeQuietly(expected);
        IOUtils.closeQuietly(actual);
      }
    }
    

    This compares the data in the two files, with no risk of false negatives from any irrelevant metadata that might be different. Hope this helps someone.

  • 2021-02-02 14:33

    For comparing binary files, take a look at this thread: http://www.velocityreviews.com/forums/t123770-re-java-code-for-determining-binary-file-equality.html

    Tiger

  • 2021-02-02 14:35

    A simple file comparison can easily be done using some checksumming (like MD5) or just reading both files.
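
    A minimal sketch of such a plain binary comparison, using only the JDK. The class and method names (BinaryFileCompare, sameBytes, md5Hex) are made up for illustration, and both variants read the whole file into memory, which assumes test-sized files:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper class, not part of any library.
public class BinaryFileCompare {

    /** Returns true if both files contain exactly the same bytes. */
    public static boolean sameBytes(Path a, Path b) throws IOException {
        // Cheap early exit: files of different size can never be identical.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    /** Returns the MD5 digest of a file as a lowercase hex string. */
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

    In a JUnit test this could be used as assertTrue(BinaryFileCompare.sameBytes(expected.toPath(), actual.toPath())).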

    However, as Excel files contain loads of metadata, the files will probably never be identical byte-for-byte, as James Burgess pointed out. So you'll need another kind of comparison for your test.

    I'd recommend somehow generating a "canonical" form from the Excel file, i.e. reading the generated Excel file and converting it to a simpler format (CSV or something similar), which will only retain the information you want to check. Then you can use the "canonical form" to compare with your expected result (also in canonical form, of course).

    Apache POI might be useful for reading the file.
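
    To illustrate the canonical-form idea, here is a rough sketch that turns rows of cell text into a CSV string. It assumes the cell values have already been read into lists of strings (for example with POI's DataFormatter.formatCellValue, which is not shown here). CanonicalSheet and toCsv are hypothetical names, and trimming whitespace is just one possible normalization choice:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of POI or any other library.
public class CanonicalSheet {

    /** Converts rows of cell text into one CSV string (the "canonical form"). */
    public static String toCsv(List<List<String>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String cell : row) {
                // Treat missing cells as empty and ignore surrounding whitespace.
                cells.add(escape(cell == null ? "" : cell.trim()));
            }
            out.append(String.join(",", cells)).append('\n');
        }
        return out.toString();
    }

    /** Quotes a value if it contains a comma, a double quote, or a line break. */
    private static String escape(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }
}
```

    The test then reduces to assertEquals(CanonicalSheet.toCsv(expectedRows), CanonicalSheet.toCsv(actualRows)), and a failure message shows readable text instead of binary noise.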

    BTW: Reading a whole file to check its correctness would generally not be considered a unit test. That's an integration test...
