Question
I'm benchmarking random writes to a file from multiple threads (pthreads). If I comment out the mutex lock, the created file is smaller than expected, as if some writes are getting lost (the shortfall is always a multiple of the chunk size). If I keep the mutex, the file is always exactly the expected size.
Is there a problem elsewhere in my code, so that the mutex is not really required (as suggested by @evan), or is the mutex necessary here?
void *DiskWorker(void *threadarg) {
    FILE *theFile = fopen(fileToWrite, "a+");
    ....
    for (long i = 0; i < noOfWrites; ++i) {
        //pthread_mutex_lock(&mutexsum);
        // For random access
        fseek(theFile, randomArray[i] * chunkSize, SEEK_SET);
        fputs(data, theFile);
        // Or for sequential access (in this case the 2 lines above would not be here)
        fprintf(theFile, "%s", data);
        // sequential access end
        fflush(theFile);
        //pthread_mutex_unlock(&mutexsum);
    }
    .....
}
Answer 1:
You definitely need a mutex because each logical write consists of several separate file calls (fseek, fputs, fflush). The underlying file subsystem cannot know how many calls make up one complete operation, so it cannot stop another thread from running in between them.
So you need the mutex.
In your situation you may get better performance by putting the mutex outside the loop. Otherwise, switching between threads may cause excessive seeking between different parts of the disk. Hard disks take about 10 ms to move the read/write head, so that could slow things down a lot.
So it might be a good idea to benchmark that.
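A minimal sketch of the locked variant, under assumptions that differ from the question: all threads share one FILE opened with "w+" (not append mode, so fseek is honoured), and each thread writes to its own disjoint chunk slots. The names run_locked_writers, locked_worker, N_THREADS, WRITES_PER_THREAD and CHUNK are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define N_THREADS 4
#define WRITES_PER_THREAD 64
#define CHUNK 16

static FILE *shared_file;   /* assumption: one stream shared by all threads */
static pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker owns disjoint chunk slots: slot = i * N_THREADS + id. */
static void *locked_worker(void *arg)
{
    long id = (long)(intptr_t)arg;
    char data[CHUNK];
    memset(data, 'a' + (int)id, sizeof data);

    for (long i = 0; i < WRITES_PER_THREAD; ++i) {
        long slot = i * N_THREADS + id;
        pthread_mutex_lock(&file_lock);
        /* seek + write + flush form one logical operation, so they are
         * kept together under the lock */
        fseek(shared_file, slot * CHUNK, SEEK_SET);
        fwrite(data, 1, sizeof data, shared_file);
        fflush(shared_file);
        pthread_mutex_unlock(&file_lock);
    }
    return NULL;
}

/* Runs the workers against path; returns the final file size, or -1. */
long run_locked_writers(const char *path)
{
    pthread_t tid[N_THREADS];

    shared_file = fopen(path, "w+");
    if (!shared_file)
        return -1;

    for (long t = 0; t < N_THREADS; ++t) {
        int rc = pthread_create(&tid[t], NULL, locked_worker,
                                (void *)(intptr_t)t);
        assert(rc == 0);
    }
    for (int t = 0; t < N_THREADS; ++t)
        pthread_join(tid[t], NULL);

    fseek(shared_file, 0, SEEK_END);
    long size = ftell(shared_file);
    fclose(shared_file);
    return size;
}
```

Moving the lock and unlock calls outside the for loop would give the coarser locking suggested above; which variant is faster is something to measure rather than assume.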
Answer 2:
You are opening the file in append mode. According to C11:
Opening a file with append mode ('a' as the first character in the mode argument) causes all subsequent writes to the file to be forced to the then current end-of-file, regardless of intervening calls to the fseek function.
The C standard does not specify exactly how this should be implemented, but on POSIX systems it is usually implemented with the O_APPEND flag of the open function, while data is flushed with the write function. Note that the fseek call in your code should therefore have no effect on where the data is written.
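The append-mode rule can be checked with a small sketch like the following. The function name append_ignores_seek and the file contents are hypothetical; the behaviour it relies on is the C standard rule quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "AAAA" to a fresh file, reopens it in "a+" mode, seeks to the
 * start and writes "BB". Copies the resulting contents into out
 * (NUL-terminated) and returns the number of bytes read back, or -1. */
long append_ignores_seek(const char *path, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    fputs("AAAA", f);
    fclose(f);

    f = fopen(path, "a+");
    if (!f)
        return -1;
    fseek(f, 0, SEEK_SET);   /* does not change where writes go */
    fputs("BB", f);          /* still forced to end-of-file */
    fclose(f);

    f = fopen(path, "r");
    if (!f)
        return -1;
    size_t n = fread(out, 1, out_size - 1, f);
    out[n] = '\0';
    fclose(f);
    return (long)n;
}
```

If the seek were honoured, the file would contain "BBAA"; in append mode it should contain "AAAABB" instead.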
I think POSIX requires this, as it describes how redirecting output in append mode (>>) is done by the shell:
Appended output redirection shall cause the file whose name results from the expansion of word to be opened for output on the designated file descriptor. The file is opened as if the open() function as defined in the System Interfaces volume of POSIX.1-2008 was called with the O_APPEND flag. If the file does not exist, it shall be created.
And since most programs use the FILE interface to send data to stdout, this probably requires fopen to use open with O_APPEND, and to use write (and not functions like pwrite) when writing data.
So if, on your system, fopen with 'a' mode uses O_APPEND, flushing is done using write, and your kernel and filesystem correctly implement the O_APPEND flag, then using a mutex should have no effect, because no other write can intervene between moving to end-of-file and writing:
If the O_APPEND flag of the file status flags is set, the file offset shall be set to the end of the file prior to each write and no intervening file modification operation shall occur between changing the file offset and the write operation.
Note that not all filesystems support this behavior. Check this answer.
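A rough sketch of unlocked appends through O_APPEND, assuming a local filesystem that implements the flag as POSIX describes and small writes that are not cut short. Each thread opens its own descriptor, much as each thread in the question opens its own FILE. The names run_appenders, append_worker, APPENDERS, APPENDS_EACH and RECORD are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define APPENDERS 4
#define APPENDS_EACH 256
#define RECORD 32

static const char *append_path;

/* Appends RECORD-byte records with write(); no mutex is taken. */
static void *append_worker(void *arg)
{
    (void)arg;
    char rec[RECORD];
    memset(rec, 'x', sizeof rec);

    int fd = open(append_path, O_WRONLY | O_APPEND);
    if (fd < 0)
        return NULL;
    for (int i = 0; i < APPENDS_EACH; ++i) {
        /* with O_APPEND, the move to end-of-file and the write are one step */
        ssize_t n = write(fd, rec, sizeof rec);
        assert(n == (ssize_t)sizeof rec);   /* sketch: no short-write retry */
    }
    close(fd);
    return NULL;
}

/* Runs the appenders against path; returns the final file size, or -1. */
long run_appenders(const char *path)
{
    pthread_t tid[APPENDERS];
    struct stat st;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    append_path = path;

    for (int t = 0; t < APPENDERS; ++t) {
        int rc = pthread_create(&tid[t], NULL, append_worker, NULL);
        assert(rc == 0);
    }
    for (int t = 0; t < APPENDERS; ++t)
        pthread_join(tid[t], NULL);

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_size;
}
```

On a filesystem that honours O_APPEND, the returned size should equal APPENDERS * APPENDS_EACH * RECORD; a smaller value would point to lost writes.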
As for my answer to your previous question, my suggestion was to remove the mutex, as it should have no effect on the size of the file (and it didn't have any effect on my machine).
Personally, I have never really used O_APPEND and would be hesitant to do so, as its behavior might not be supported at some level, and its behavior is weird on Linux (see the "BUGS" section of the pwrite man page).
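For random-access writes without append mode, one possible alternative is pwrite, which takes the target offset as an argument and does not use the shared file offset. The sketch below is not taken from the question or from either answer. It assumes the file is opened without O_APPEND, that threads write to disjoint chunk slots, and that small writes are not cut short. The names run_pwrite_writers, pwrite_worker, PW_THREADS, PW_WRITES and PW_CHUNK are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define PW_THREADS 4
#define PW_WRITES 64
#define PW_CHUNK 16

static int pw_fd;   /* one descriptor shared by all threads */

/* Writes this thread's chunks at explicit offsets; no seek, no mutex. */
static void *pwrite_worker(void *arg)
{
    long id = (long)(intptr_t)arg;
    char data[PW_CHUNK];
    memset(data, 'a' + (int)id, sizeof data);

    for (long i = 0; i < PW_WRITES; ++i) {
        off_t off = (off_t)(i * PW_THREADS + id) * PW_CHUNK;
        ssize_t n = pwrite(pw_fd, data, sizeof data, off);
        assert(n == (ssize_t)sizeof data);   /* sketch: no short-write retry */
    }
    return NULL;
}

/* Runs the workers against path; returns the final file size, or -1. */
long run_pwrite_writers(const char *path)
{
    pthread_t tid[PW_THREADS];
    struct stat st;

    /* deliberately no O_APPEND here */
    pw_fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (pw_fd < 0)
        return -1;

    for (long t = 0; t < PW_THREADS; ++t) {
        int rc = pthread_create(&tid[t], NULL, pwrite_worker,
                                (void *)(intptr_t)t);
        assert(rc == 0);
    }
    for (int t = 0; t < PW_THREADS; ++t)
        pthread_join(tid[t], NULL);

    long size = (fstat(pw_fd, &st) == 0) ? (long)st.st_size : -1;
    close(pw_fd);
    return size;
}
```

Because no thread depends on a shared file position, no lock is needed for the positioning itself. Whether this is faster than the locked stdio version is a question for the benchmark.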
Source: https://stackoverflow.com/questions/49377419/do-we-need-mutex-to-perform-multithreading-file-io