I am calling dgemm from within my MPI code, and it does not use multiple cores, whereas in sequential code dgemm uses all cores. My machine has 8 cores, and I run the code using 2 ranks