Java: How to check that 2 binary files are the same?

旧时难觅i 2021-02-07 17:27

What is the easiest way to check (in a unit test) whether binary files A and B are equal?

7 answers
  • 2021-02-07 18:09

    I had to do the same in a unit test, so I compared SHA-1 hashes. To avoid computing the hashes unnecessarily, I check that the file sizes are equal first. Here is my attempt:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import org.junit.Assert; // assumes JUnit 4

    public class SHA1Compare {
        private static final int CHUNK_SIZE = 4096;
    
        public void assertEqualsSHA1(String expectedPath, String actualPath) throws IOException, NoSuchAlgorithmException {
            File expectedFile = new File(expectedPath);
            File actualFile = new File(actualPath);
            // Cheap check first: files of different lengths cannot be equal.
            Assert.assertEquals(expectedFile.length(), actualFile.length());
            try (FileInputStream fisExpected = new FileInputStream(expectedFile);
                    FileInputStream fisActual = new FileInputStream(actualFile)) {
                Assert.assertEquals(makeMessageDigest(fisExpected),
                        makeMessageDigest(fisActual));
            }
        }
    
        // Reads the stream in chunks and returns its SHA-1 digest as a hex string.
        public String makeMessageDigest(InputStream is) throws NoSuchAlgorithmException, IOException {
            byte[] data = new byte[CHUNK_SIZE];
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            int bytesRead;
            while (-1 != (bytesRead = is.read(data, 0, CHUNK_SIZE))) {
                md.update(data, 0, bytesRead);
            }
            return toHexString(md.digest());
        }
    
        // Formats each digest byte as two lowercase hex digits.
        private String toHexString(byte[] digest) {
            StringBuilder sha1HexString = new StringBuilder();
            for (byte b : digest) {
                sha1HexString.append(String.format("%02x", b));
            }
            return sha1HexString.toString();
        }
    }
    
  • 2021-02-07 18:15

    Are third-party libraries fair game? Guava has Files.equal(File, File). There's no real reason to bother with hashing if you don't have to; it can only be less efficient.

  • 2021-02-07 18:16

    With assertBinaryEquals from the FileAssert class in JUnit-addons:

    public static void assertBinaryEquals(java.io.File expected, java.io.File actual)

    http://junit-addons.sourceforge.net/junitx/framework/FileAssert.html

  • 2021-02-07 18:21

    If you want to avoid dependencies, you can do it quite nicely with Files.readAllBytes and Assert.assertArrayEquals:

    Assert.assertArrayEquals("Binary files differ", 
        Files.readAllBytes(Paths.get(expectedBinaryFile)), 
        Files.readAllBytes(Paths.get(actualBinaryFile)));
    

    Note: This reads both files fully into memory, so it may not be suitable for large files.
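
    For large files, one way to avoid holding both files in memory is to compare fixed-size chunks. The following is a minimal sketch, assuming Java 9 or later (for InputStream.readNBytes and the range overload of Arrays.equals). The class name ChunkedCompare, the method sameContent and the 8192-byte chunk size are illustrative choices, not part of the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class, used only for illustration.
final class ChunkedCompare {
    private static final int CHUNK_SIZE = 8192; // arbitrary buffer size

    private ChunkedCompare() {}

    /** Returns true when both files hold exactly the same bytes. */
    static boolean sameContent(Path expected, Path actual) throws IOException {
        // Files of different sizes cannot be equal, so skip reading them.
        if (Files.size(expected) != Files.size(actual)) {
            return false;
        }
        try (InputStream in1 = Files.newInputStream(expected);
             InputStream in2 = Files.newInputStream(actual)) {
            byte[] buf1 = new byte[CHUNK_SIZE];
            byte[] buf2 = new byte[CHUNK_SIZE];
            while (true) {
                // readNBytes fills the buffer unless the end of the stream is reached.
                int n1 = in1.readNBytes(buf1, 0, CHUNK_SIZE);
                int n2 = in2.readNBytes(buf2, 0, CHUNK_SIZE);
                if (n1 != n2 || !Arrays.equals(buf1, 0, n1, buf2, 0, n2)) {
                    return false;
                }
                if (n1 < CHUNK_SIZE) {
                    return true; // both streams ended without a mismatch
                }
            }
        }
    }
}
```

    In a test this could be wrapped as Assert.assertTrue("Binary files differ", ChunkedCompare.sameContent(expectedPath, actualPath)), where expectedPath and actualPath are Path objects for the two files.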

  • 2021-02-07 18:22

    Since Java 12 you can also use the Files.mismatch method (see its Javadoc). It returns -1L if the files have the same content, and otherwise the position of the first mismatched byte.
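
    A minimal sketch of this approach, assuming Java 12 or later. The class name BinaryFiles and the helper sameContent are hypothetical names chosen for illustration; only Files.mismatch itself comes from the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the JDK.
final class BinaryFiles {
    private BinaryFiles() {}

    /** Returns true when both files hold exactly the same bytes. */
    static boolean sameContent(Path expected, Path actual) throws IOException {
        // Files.mismatch returns -1L when there is no mismatch; otherwise it
        // returns the position of the first differing byte, or the size of the
        // smaller file when one file is a prefix of the other.
        return Files.mismatch(expected, actual) == -1L;
    }
}
```

    In a JUnit 4 test, asserting on the raw result, for example Assert.assertEquals(-1L, Files.mismatch(expectedPath, actualPath)), has the side benefit that a failure message shows the offset of the first difference.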

  • 2021-02-07 18:23

    You can always just read byte by byte from each file and compare as you go. MD5, SHA-1, etc. still have to read all the bytes, so computing the hash is extra work that you don't have to do.

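    // Needs java.io.BufferedInputStream, File, FileInputStream, InputStream, IOException.
    // The method name is illustrative; the original snippet was a bare fragment.
    static boolean haveSameContent(File file1, File file2) throws IOException {
        if (file1.length() != file2.length()) {
            return false;
        }
    
        try (InputStream in1 = new BufferedInputStream(new FileInputStream(file1));
             InputStream in2 = new BufferedInputStream(new FileInputStream(file2))) {
            int value1, value2;
            do {
                // since we're buffered, read() isn't expensive
                value1 = in1.read();
                value2 = in2.read();
                if (value1 != value2) {
                    return false;
                }
            } while (value1 >= 0);
    
            // since we already checked that the file sizes are equal,
            // if we're here we reached the end of both files without a mismatch
            return true;
        }
    }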
    