I am a newbie in programming with OpenMp. I wrote a simple c program to multiply matrix with a vector. Unfortunately, by comparing executing time I found that the OpenMP is much
I did this in reference to Hristo's comment. I tried using schedule(static, 256). For me it makes it does not help changing the default chunck size. Maybe it even makes it worse. I printed out the thread number and its index with and without setting the schedule and it's clear that OpenMP already chooses the thread indices to be far from one another so that false sharing does not seem to be an issue. For me this code already gives a good boost with OpenMP.
#include "stdio.h"
#include
void loop_parallel(const int *matrix, const int ld, const int*vector, long long* result, const int m_size) {
#pragma omp parallel for schedule(static, 250)
//#pragma omp parallel for
for (int i=0;i