How should I interpret these VTune results?
I'm trying to parallelize this code using OpenMP. OpenCV (built with IPP for best efficiency) is used as an external library. I'm seeing unbalanced CPU usage in the parallel for regions, yet there seems to be no load imbalance. As you will see, this could be caused by KMP_BLOCKTIME=0, but that setting may be necessary because of the external libraries (IPP, TBB, OpenMP, OpenCV). In the rest of the question you will find more details and data that you can download. These are the Google Drive links to my VTune results: c755823 basic KMP_BLOCKTIME=0 30 runs : basic hotspot with environment variable