Question
I am trying to create a 1 MB (1048576-byte) file by writing in various chunk sizes with a varying number of threads. When int NUM_THREADS = 2 or int NUM_THREADS = 1, the created file has the expected size, i.e. 1 MB.
However, when I increase the thread count to 4, the created file is around 400 MB. Why this anomaly?
#include <pthread.h>
#include <algorithm>  // fill_n
#include <cstdio>     // FILE, fopen, fprintf, fflush, fclose
#include <cstdlib>    // exit
#include <string>
#include <iostream>
#define TenGBtoByte 1048576  // target file size in bytes (1 MB, despite the name)
#define fileToWrite "/tmp/schatterjee.txt"
using namespace std;
pthread_mutex_t mutexsum;
struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};
void *SPWork(void *threadarg) {
    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    /* Number of chunks this thread has to write */
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");
    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush(f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}
int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 2;
    for (int BLOCKSIZE: blocksize) {
        char *data = new char[BLOCKSIZE];
        fill_n(data, BLOCKSIZE, 'x');
        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;
        /* Initialize the mutex and create the threads as joinable */
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        /* Wait for all writers to finish */
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
}
N.B.
1) It's a benchmarking task, so I am doing what the requirements ask.
2) long noOfWrites = (TenGBtoByte / (threadcount * chunkSize)); computes how many times each thread should write so that the combined size is 1 MB.
3) I tried placing the mutex lock at various positions. All yield the same result.
Suggestions about other changes to the program are also welcome.
Answer 1:
You are allocating and initializing your data array like this:
char *data = new char[BLOCKSIZE];
fill_n(data, BLOCKSIZE, 'x');
Then you are writing it to the file using fprintf:
fprintf(f, "%s", data);
With "%s", fprintf expects data to be a null-terminated string, and this buffer has no terminator, so this is already undefined behavior. If it worked with a low number of threads, it is because the memory after that chunk happened to contain a zero byte.
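To make the point above concrete, here is a minimal sketch (not the original poster's code) of two ways to write a fixed-size buffer without relying on a terminator: fwrite with an explicit length, or fprintf with a "%.*s" precision. The helper names write_block, print_block and demo_file_size are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Writes exactly `len` bytes from `data`. No terminator is needed because
// fwrite takes the length explicitly. Returns the number of bytes written.
std::size_t write_block(std::FILE *f, const char *data, std::size_t len) {
    assert(f != nullptr && data != nullptr);
    return std::fwrite(data, 1, len, f);
}

// Alternative that keeps fprintf: the precision in "%.*s" caps how many
// bytes are read, so the buffer does not have to be null-terminated.
int print_block(std::FILE *f, const char *data, int len) {
    return std::fprintf(f, "%.*s", len, data);
}

// Hypothetical helper: writes `count` blocks of `blockSize` 'x' bytes to a
// temporary file and returns the resulting size in bytes (-1 on error).
long demo_file_size(std::size_t blockSize, int count) {
    std::vector<char> data(blockSize, 'x');  // deliberately not terminated
    std::FILE *f = std::tmpfile();
    if (!f) return -1;
    for (int i = 0; i < count; ++i) {
        if (write_block(f, data.data(), data.size()) != data.size()) {
            std::fclose(f);
            return -1;
        }
    }
    std::fflush(f);
    long size = std::ftell(f);  // position after the last write == file size
    std::fclose(f);
    return size;
}
```

With either helper, the number of bytes written per call is fixed by the length argument rather than by whatever happens to follow the buffer in memory.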
Other than that, the mutex in your program serves no purpose and can be removed. The internal locking of the FILE stream is also redundant, so you can use fwrite_unlocked and fflush_unlocked to write your data, since every thread uses a separate FILE object. Essentially, all synchronization in your program is performed in the kernel, not in userspace.
Even after removing the mutex and using the _unlocked functions, your program reliably creates 1 MB files regardless of the number of threads. So the invalid file writing seems to be the only issue you have.
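The following is a rough sketch of that arrangement, under a few assumptions: a POSIX system where fopen in "a" mode opens the file for appending, the portable fwrite and fflush instead of the glibc-specific _unlocked variants, and hypothetical names (AppendJob, append_worker, write_with_threads) that do not come from the original program. It also truncates the file first, since append mode would otherwise add to whatever an earlier run left behind.

```cpp
#include <cstddef>
#include <cstdio>
#include <pthread.h>
#include <sys/stat.h>
#include <vector>

// Hypothetical per-thread job description (not from the original post).
struct AppendJob {
    const char *path;       // file to append to
    const char *data;       // block to write; need not be null-terminated
    std::size_t blockSize;  // bytes per write
    long writes;            // number of blocks this thread appends
    bool ok;                // set to false by the worker if any write fails
};

void *append_worker(void *arg) {
    AppendJob *job = static_cast<AppendJob *>(arg);
    std::FILE *f = std::fopen(job->path, "a");  // private stream per thread
    if (!f) { job->ok = false; return nullptr; }
    for (long i = 0; i < job->writes; ++i) {
        // Explicit length, then flush so the block is handed to the kernel.
        if (std::fwrite(job->data, 1, job->blockSize, f) != job->blockSize ||
            std::fflush(f) != 0) {
            job->ok = false;
            break;
        }
    }
    std::fclose(f);
    return nullptr;
}

// Truncates `path`, appends `totalBytes` split evenly over `numThreads`
// threads without a user-level mutex, and returns the final file size in
// bytes (-1 on failure).
long write_with_threads(const char *path, long totalBytes,
                        int numThreads, std::size_t blockSize) {
    std::FILE *trunc = std::fopen(path, "w");  // start from an empty file
    if (!trunc) return -1;
    std::fclose(trunc);

    std::vector<char> data(blockSize, 'x');
    long writes = totalBytes / (numThreads * static_cast<long>(blockSize));
    std::vector<AppendJob> jobs(numThreads,
        AppendJob{path, data.data(), blockSize, writes, true});
    std::vector<pthread_t> threads(numThreads);

    int started = 0;
    for (; started < numThreads; ++started)
        if (pthread_create(&threads[started], nullptr, append_worker,
                           &jobs[started]) != 0)
            break;
    for (int i = 0; i < started; ++i)
        pthread_join(threads[i], nullptr);

    if (started != numThreads) return -1;
    for (const AppendJob &job : jobs)
        if (!job.ok) return -1;

    struct stat st;
    return stat(path, &st) == 0 ? static_cast<long>(st.st_size) : -1;
}
```

On such a system, a call like write_with_threads("/tmp/example.txt", 1048576, 4, 1024) would be expected to return 1048576, because each flushed block is appended at the current end of the file by the kernel. Older toolchains may need the -pthread flag when compiling.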
Answer 2:
@Ivan Yes! Yes! Yes! You are absolutely right, my friend, except for one small fact: the mutex is necessary. This is the final code. Try removing the mutex and the file size will be different.
#include <pthread.h>
#include <algorithm>  // fill_n
#include <cstdio>     // FILE, fopen, fprintf, fflush, fclose
#include <cstdlib>    // exit
#include <string>
#include <iostream>
#define TenGBtoByte 1048576  // target file size in bytes (1 MB, despite the name)
#define fileToWrite "/tmp/schatterjee.txt"
using namespace std;
pthread_mutex_t mutexsum;
struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};
void *SPWork(void *threadarg) {
    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    /* Number of chunks this thread has to write */
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");
    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush(f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}
int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 128;
    for (int BLOCKSIZE: blocksize) {
        /* One extra byte for the terminator that "%s" requires */
        char *data = new char[BLOCKSIZE + 1];
        fill_n(data, BLOCKSIZE, 'x');
        data[BLOCKSIZE] = '\0';
        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        /* Wait for all writers to finish */
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
}
Source: https://stackoverflow.com/questions/49248431/multithreading-file-io-program-behaves-unpredictably-when-number-of-thread-is-in