Question
I have a file of only a few bytes on Linux, and I need to process it only if it has changed since the last time it was processed.
I currently check whether the file has changed by periodically calling PHP's clearstatcache(); filemtime();
Since the file will always be tiny, would it improve performance to drop the call to filemtime and instead detect a change by comparing the contents with the previous contents?
If not, what is the best-performing way to do this?
Answer 1:
Use filemtime + clearstatcache
To enhance @Ben_D's test:
<?php
$file = 'small_file.html';
$loops = 1000000;

// filesize (fast; served from PHP's stat cache after the first call)
$start_time = microtime(true);
for ($i = 0; $i < $loops; $i++) {
    $file_size = filesize($file);
}
$end_time = microtime(true);
$time_for_file_size = $end_time - $start_time;

// filemtime (fastest; also served from the stat cache)
$start_time = microtime(true);
for ($i = 0; $i < $loops; $i++) {
    $file_mtime = filemtime($file);
}
$end_time = microtime(true);
$time_for_filemtime = $end_time - $start_time;

// filemtime + no cache (fast and reliable)
$start_time = microtime(true);
for ($i = 0; $i < $loops; $i++) {
    clearstatcache();
    $file_mtime_nc = filemtime($file);
}
$end_time = microtime(true);
$time_for_filemtime_nc = $end_time - $start_time;

// file_get_contents (slow and reliable)
$start_time = microtime(true);
for ($i = 0; $i < $loops; $i++) {
    $file_contents = file_get_contents($file);
}
$end_time = microtime(true);
$time_for_file_get_contents = $end_time - $start_time;

// output
echo "
<p>Working on file '$file'</p>
<p>Size: $file_size B</p>
<p>last modified timestamp: $file_mtime</p>
<p>file contents: $file_contents</p>
<h1>Profile</h1>
<p>filesize: $time_for_file_size</p>
<p>filemtime: $time_for_filemtime</p>
<p>filemtime + no cache: $time_for_filemtime_nc</p>
<p>file_get_contents: $time_for_file_get_contents</p>";
/* End of file */
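For the use case in the question, the "filemtime + no cache" variant could be wrapped roughly as follows. This is only a sketch and not part of the original benchmark: the helper name changedMtime and the temp-file demo are hypothetical, and it assumes the caller keeps the last seen timestamp between checks.

```php
<?php
// Hypothetical helper (not from the original answer): returns the new mtime
// if it differs from $lastMtime, or null if the file looks unchanged.
function changedMtime(string $file, ?int $lastMtime): ?int
{
    // Discard PHP's cached stat data for this path so the OS is asked again.
    clearstatcache(true, $file);
    $mtime = @filemtime($file);
    if ($mtime === false) {
        throw new RuntimeException("Cannot stat $file");
    }
    return $mtime !== $lastMtime ? $mtime : null;
}

// Demo on a throwaway temp file.
$demoFile = tempnam(sys_get_temp_dir(), 'mt');
file_put_contents($demoFile, 'hello');
$last  = changedMtime($demoFile, null);   // first check: always reports a change
$again = changedMtime($demoFile, $last);  // nothing touched: null
touch($demoFile, $last + 10);             // simulate a later modification
$after = changedMtime($demoFile, $last);  // mtime moved: new timestamp
unlink($demoFile);
```

Note that filemtime returns whole seconds, so two writes within the same second would look like a single change to this check.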
Answer 2:
I know I'm late to the party, but a little benchmarking never hurt a discussion. Brian Roach's intuition proves sound, even before you take the comparison step into account:
The Test:
<?php
$file = "small_file.html";
$file_size = filesize($file);

// get the filemtime 1,000,000 times
$start_time = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $set_time = filemtime($file);
}
$end_time = microtime(true);
$time_for_filemtime = ($end_time - $start_time);

// get the time for file_get_contents 1,000,000 times
$start_time = microtime(true);
for ($i = 0; $i < 1000000; $i++) {
    $contents = file_get_contents($file);
}
$end_time = microtime(true);
$time_for_file_get_contents = ($end_time - $start_time);

echo "<p>Working on a file that is $file_size B long</p>
<p>filemtime: $time_for_filemtime vs file_get_contents: $time_for_file_get_contents</p>";
The Results
Working on a file that is 41 B long
filemtime: 0.36287999153137 vs file_get_contents: 16.191468000412
No shocker: "asking the file system for some metadata" is faster than "opening the file, reading it in, and comparing the contents."
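To put the two totals in per-call terms, the arithmetic can be sketched as below. The numbers are the ones reported above; the variable names are just for illustration. Keep in mind that this run does not call clearstatcache(), so the filemtime figure reflects PHP's stat cache rather than a fresh stat on every iteration.

```php
<?php
// Timings reported above, in seconds, for 1,000,000 iterations each.
$loops = 1000000;
$filemtimeTotal = 0.36287999153137;
$fileGetContentsTotal = 16.191468000412;

// Average cost of a single call, in microseconds.
$filemtimeMicros = $filemtimeTotal / $loops * 1e6;
$fileGetContentsMicros = $fileGetContentsTotal / $loops * 1e6;

// How many times slower reading the file was in this run.
$ratio = $fileGetContentsTotal / $filemtimeTotal;

printf("filemtime: %.2f us/call\n", $filemtimeMicros);
printf("file_get_contents: %.2f us/call\n", $fileGetContentsMicros);
printf("ratio: %.1fx\n", $ratio);
```

That works out to about 0.36 microseconds per filemtime call versus about 16.19 microseconds per file_get_contents call, a factor of roughly 45 on this machine.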
Answer 3:
To stat the file, you're simply asking the file system for some metadata.
Your second approach involves opening the file, reading it in, and comparing the contents.
Which do you think would be faster? ;)
Answer 4:
I think the best way to be notified about changes to a file is inotify, which is designed for exactly this purpose.
See the inotify extension.
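A rough sketch of what that could look like, assuming the PECL inotify extension is installed (it is Linux-only). The function name waitForModification is hypothetical, and the choice of event mask is an assumption about what counts as "changed" here.

```php
<?php
// Sketch only: requires the PECL inotify extension on Linux.
// waitForModification() is a hypothetical name, not part of the extension.
function waitForModification(string $file): bool
{
    if (!extension_loaded('inotify')) {
        return false; // extension missing: caller should fall back to polling
    }
    $fd = inotify_init();
    // Watch for writes to the file and for a writer closing it.
    $watch = inotify_add_watch($fd, $file, IN_MODIFY | IN_CLOSE_WRITE);
    // inotify_read() blocks until at least one event arrives.
    $events = inotify_read($fd);
    inotify_rm_watch($fd, $watch);
    fclose($fd);
    return $events !== false && count($events) > 0;
}
```

Because the read blocks until the kernel reports an event, this replaces periodic polling altogether rather than making each poll cheaper.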
Source: https://stackoverflow.com/questions/5850180/cost-of-file-modification-time-checks