Can I use file_get_contents() to compare two files?

无人及你 2020-12-03 03:09

I want to synchronize two directories, and I use

file_get_contents($source) === file_get_contents($dest)

to compare two files. Is there an

7 answers
  • 2020-12-03 03:38

    I would rather do something like this:

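    function files_are_equal($a, $b)
    {
      // Different sizes mean different content, so nothing needs to be read
      if (filesize($a) !== filesize($b))
          return false;
    
      // Same size: compare the content chunk by chunk
      $ah = fopen($a, 'rb');
      $bh = fopen($b, 'rb');
      if ($ah === false || $bh === false)
      {
        // A file that cannot be opened cannot be confirmed equal
        if ($ah !== false) fclose($ah);
        if ($bh !== false) fclose($bh);
        return false;
      }
    
      $result = true;
      while (!feof($ah))
      {
        // Strict comparison: != treats numeric strings such as "1e3" and "1000" as equal
        if (fread($ah, 8192) !== fread($bh, 8192))
        {
          $result = false;
          break;
        }
      }
    
      fclose($ah);
      fclose($bh);
    
      return $result;
    }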
    

    This first checks whether the file sizes match. Only if they do does it read both files chunk by chunk, stopping at the first chunk that differs.

    • Comparing modification times can be a quick check in some cases, but it only tells you whether the files were modified at different times. They might still have the same content.
    • Using sha1 or md5 might be a good idea, but creating the hash requires reading the whole file. If the hash can be stored and reused later, that is probably a different story.
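
    The "stored hash" case from the last point could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `cached_sha1()` and the in-memory `$cache` array are hypothetical names, and it assumes a changed file also changes its size or modification time.

    ```php
    <?php
    // Hypothetical helper: returns the SHA-1 of $path and reuses the stored
    // value for as long as the file's size and modification time are unchanged.
    function cached_sha1(string $path, array &$cache): string
    {
        clearstatcache(true, $path);               // avoid stale stat results
        $key = $path . '|' . filesize($path) . '|' . filemtime($path);
        if (!isset($cache[$key])) {
            $hash = sha1_file($path);              // reads the whole file once
            if ($hash === false) {
                throw new RuntimeException("Cannot hash $path");
            }
            $cache[$key] = $hash;
        }
        return $cache[$key];
    }

    // Usage: two temporary files with identical content
    $cache = [];
    $a = tempnam(sys_get_temp_dir(), 'cmp');
    $b = tempnam(sys_get_temp_dir(), 'cmp');
    file_put_contents($a, 'hello');
    file_put_contents($b, 'hello');
    var_dump(cached_sha1($a, $cache) === cached_sha1($b, $cache)); // bool(true)
    unlink($a);
    unlink($b);
    ```

    In a real synchronization job the cache would have to be persisted between runs (for example in a file or database), which this sketch leaves out.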
  • 2020-12-03 03:39

    Check first for the obvious:

    1. Compare size
    2. Compare file type (mime-type).
    3. Compare content.

    (Add date, file name, and other metadata to this list if those are also supposed to match.)

    When comparing content, hashing does not sound very efficient, as @Oli says in his comment. If the files are different, they will most likely differ near the beginning. Calculating a hash of two 50 MB files and then comparing the hashes sounds like a waste of time if the second bit is already different...
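
    To illustrate the early-exit argument, here is a rough sketch that reports where two files first differ and stops reading at that point. `first_difference()` is a hypothetical helper, not part of PHP, and it assumes local regular files, where `fread()` returns full chunks until end of file.

    ```php
    <?php
    // Hypothetical helper: returns the byte offset of the first difference
    // between two files, or null if their contents are identical.
    function first_difference(string $a, string $b, int $chunk = 8192): ?int
    {
        $ah = fopen($a, 'rb');
        $bh = fopen($b, 'rb');
        if ($ah === false || $bh === false) {
            if ($ah !== false) fclose($ah);
            if ($bh !== false) fclose($bh);
            throw new RuntimeException('Cannot open one of the files');
        }

        $offset = 0;
        $found = null;
        while (true) {
            $ca = (string) fread($ah, $chunk);
            $cb = (string) fread($bh, $chunk);
            if ($ca !== $cb) {
                // Locate the exact byte inside the differing chunk
                $n = min(strlen($ca), strlen($cb));
                $i = 0;
                while ($i < $n && $ca[$i] === $cb[$i]) {
                    $i++;
                }
                $found = $offset + $i;
                break;                     // nothing after this point is read
            }
            if ($ca === '') {
                break;                     // both files ended without a difference
            }
            $offset += strlen($ca);
        }

        fclose($ah);
        fclose($bh);
        return $found;
    }
    ```

    A hash-based comparison has to read both files to the end even when the difference sits in the first chunk, whereas this loop returns as soon as it finds one.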

    Check this post on php.net. It looks very similar to @Svish's answer, but it also compares the files' MIME types. A smart addition if you ask me.

  • 2020-12-03 03:43

    There isn't anything wrong with what you are doing here, except that it is a little inefficient. Reading the full contents of each file and comparing them may cause problems, especially with larger files or binary data.

    I would take a look at filemtime() (last modified) and filesize(), and run some tests to see if that works for you. It should be all you need at a fraction of the computing cost.
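
    A minimal sketch of that metadata check is shown below. `probably_unchanged()` is a hypothetical name, and the check is only a heuristic: it assumes the sync step sets the destination's modification time to match the source (for example with `touch()`), since `copy()` on its own does not preserve it.

    ```php
    <?php
    // Hypothetical helper: cheap heuristic that treats two files as unchanged
    // when both their size and their modification time match.
    function probably_unchanged(string $source, string $dest): bool
    {
        clearstatcache();                  // PHP caches stat results per path
        if (!is_file($source) || !is_file($dest)) {
            return false;
        }
        return filesize($source) === filesize($dest)
            && filemtime($source) === filemtime($dest);
    }

    // Usage: copy a file and carry the modification time over
    $src = tempnam(sys_get_temp_dir(), 'cmp');
    $dst = tempnam(sys_get_temp_dir(), 'cmp');
    file_put_contents($src, 'data');
    copy($src, $dst);
    touch($dst, filemtime($src));          // align mtime with the source
    var_dump(probably_unchanged($src, $dst)); // bool(true)
    unlink($src);
    unlink($dst);
    ```

    Files of equal size that were edited within the same second would slip past this check, so it trades certainty for speed.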

  • 2020-12-03 03:49

    Use sha1_file() instead. It's faster and works fine if you just need to see whether the files differ. If the files are large, comparing the whole strings to each other can be very heavy. Because sha1_file() returns a 40-character hexadecimal representation of the file, comparing the results is very fast.

    You can also consider other methods, such as comparing filemtime() or filesize(), but comparing hashes detects a change even if only a single bit differs.

  • 2020-12-03 03:51

    Seems a bit heavy. This will load both files completely as strings and then compare.

    I think you might be better off opening both files manually and ticking through them, perhaps just doing a filesize check first.

  • 2020-12-03 03:58

    This will work, but it is inherently less efficient than calculating a checksum for each file and comparing those. Good candidates for checksum algorithms are SHA1 and MD5.

    http://php.net/sha1_file

    http://php.net/md5_file

    if (sha1_file($source) === sha1_file($dest)) {
        /* ... */
    }
    