Question
My C program must count the occurrences of a list of words in a text file. I use OpenMP for this, and the logic looks correct to me. When I put some printf calls inside a for loop, the result is correct and always the same. When I remove the printf calls, the result is incorrect and changes with each execution. Given this, I think the cause is related to execution time: with the printf calls the execution takes longer, so there is time for all threads to finish counting and the program works correctly. Without them, the execution time drops sharply (0.000893 ms), leaving no time for all threads to finish their calculations, which is why the program prints a different result on each execution.
The parallelized code is as follows:
#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) private(word, wordExists) shared(keyWordsOcurrences)
for (line = 0; line < NUM_LINES; line++)
{
    // divides the line into words separated by space
    word = strtok(lines[line], " ");
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            #pragma omp critical
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok(NULL, " ");
    }
}
The checkWordOcurrences function is where I put the printf that makes my code work properly on every execution (by increasing the execution time):
int checkWordOcurrences(char *word)
{
    int res = 0;
    int i;
    for (i = 0; i < QTD_WORDS; i++)
    {
        // **this is the almighty Printf that makes everything work properly, and without it things stop working :(**
        printf("palavra %d %s - palavra 2 %s \n", i, keyWords[i], word);
        // compares current word with monitored words
        if (!strcmp(keyWords[i], word))
        {
            // if it's monitored word, returns its index (+1 because the first word has index 0 and the return type is checked as true or false)
            res = i + 1;
        }
    }
    // returns word index or 0, if current word is not monitored
    return res;
}
Can someone explain to me what may be happening and / or how to solve it?
Answer 1:
There is an implicit barrier at the end of the OpenMP for construct and at the end of each parallel region, so the program cannot finish before all the threads have finished their assigned work.
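A minimal sketch of that guarantee, assuming a hypothetical helper count_finished_iterations (it is not part of the original program): every iteration marks its own slot inside the parallel loop, and the slots are counted right after the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: marks every slot of done[] inside a parallel loop
   and counts the marked slots once the loop is over. */
int count_finished_iterations(int *done, int n)
{
    int i, finished = 0;
    assert(done != NULL && n >= 0);

    for (i = 0; i < n; i++)
        done[i] = 0;

    #pragma omp parallel for schedule(dynamic)
    for (i = 0; i < n; i++)
        done[i] = 1;   /* each iteration writes only its own slot */

    /* Implicit barrier: this point is reached only after every thread
       has finished its share of the loop above. */
    for (i = 0; i < n; i++)
        finished += done[i];
    return finished;
}
```

The function returns n for any thread count or schedule, so missing work at the end of the loop cannot be the cause of the wrong counts. The sketch uses only pragmas, so it also builds (and runs serially) when OpenMP is not enabled.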
The problem is most likely caused by the use of strtok. It is not a thread-safe function, since the position of the search point is stored internally in the C library. When one thread is in the middle of tokenising something and another thread calls strtok(lines[line], " ");, this overwrites the pointer to the string being searched, and all other threads calling strtok(NULL, " "); are now tokenising the newly set string instead of the one they were working on. It is a classic data race. The printf calls most likely only mask the race by changing the timing of the threads; they do not remove it.
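The shared search position can be observed even without threads. The sketch below is a hypothetical single-threaded demo (the function name is made up for illustration): it starts tokenising one string, then a second one, and then asks strtok for the "next" token.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: begins tokenising a, then begins tokenising b,
   then requests the next token without naming a string. */
const char *next_token_after_interleaving(char *a, char *b)
{
    char *first_a = strtok(a, " ");   /* strtok now remembers a position in a */
    char *first_b = strtok(b, " ");   /* ...which is replaced by a position in b */
    assert(first_a != NULL && first_b != NULL);
    return strtok(NULL, " ");         /* continues in b, not in a */
}
```

With a holding "one two three" and b holding "uno dos tres" (both as writable char arrays), the call returns "dos" rather than "two". In the parallel loop the same interleaving happens between threads, at unpredictable moments, which is why the counts change from run to run.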
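The solution is to use strtok_r (a POSIX function) instead, which keeps the search position in a caller-supplied pointer (saveptr below) rather than in the library: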
#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) private(word, wordExists) shared(keyWordsOcurrences)
for (line = 0; line < NUM_LINES; line++)
{
    char *saveptr;
    // divides the line into words separated by space
    word = strtok_r(lines[line], " ", &saveptr);
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            #pragma omp critical
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok_r(NULL, " ", &saveptr);
    }
}
On a separate note, critical is a very heavyweight synchronisation construct implemented with locks. Simple increments such as keyWordsOcurrences[wordExists - 1] += 1; can be protected with atomic updates instead, which are much quicker:
if (wordExists)
{
    #pragma omp atomic update
    keyWordsOcurrences[wordExists - 1] += 1;
}
If QTD_WORDS isn't a very large number, you may also use array reduction:
#pragma omp parallel for schedule(dynamic) num_threads(threadNumber) \
        private(word, wordExists) \
        reduction(+:keyWordsOcurrences[0:QTD_WORDS])
for (line = 0; line < NUM_LINES; line++)
{
    char *saveptr;
    // divides the line into words separated by space
    word = strtok_r(lines[line], " ", &saveptr);
    while (word != NULL)
    {
        // checks if the word being read is one of the monitored words
        wordExists = checkWordOcurrences(word);
        if (wordExists)
        {
            keyWordsOcurrences[wordExists - 1] += 1;
        }
        word = strtok_r(NULL, " ", &saveptr);
    }
}
Array reduction for C and C++ is a relatively new OpenMP feature, though, and requires a compiler that supports OpenMP 4.5. It is possible to do it by hand for older compilers, but that goes well beyond the scope of the original question.
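For readers stuck on an older compiler, one common way to do the reduction by hand is to give each thread a private array of counters and merge it into the shared array once per thread. The following is only a sketch under stated assumptions: count_hits, MAX_WORDS and the hits array are hypothetical stand-ins for the word loop, with hits[i] playing the role of wordExists - 1.

```c
#include <assert.h>
#include <string.h>

#define MAX_WORDS 64   /* assumed upper bound on the number of monitored words */

/* Hypothetical helper: counts how often each index in [0, n_words) appears
   in hits[0..n_hits-1]. Every hits[i] must lie in that range. */
void count_hits(const int *hits, int n_hits, int *totals, int n_words)
{
    assert(n_words > 0 && n_words <= MAX_WORDS);
    memset(totals, 0, (size_t)n_words * sizeof *totals);

    #pragma omp parallel
    {
        int local[MAX_WORDS] = {0};   /* private counters, one set per thread */
        int i, k;

        #pragma omp for schedule(dynamic)
        for (i = 0; i < n_hits; i++)
            local[hits[i]] += 1;      /* no synchronisation needed here */

        /* merge once per thread instead of once per word found */
        #pragma omp critical
        for (k = 0; k < n_words; k++)
            totals[k] += local[k];
    }
}
```

In the word-counting loop, the increment of keyWordsOcurrences[wordExists - 1] would become an increment of the private array, and the critical section would run only once per thread at the end. Whether this beats atomic updates depends on how many words are monitored and how often they occur.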
Source: https://stackoverflow.com/questions/64745737/openmp-not-waiting-all-threads-finish-before-end-c-program