Multi-threaded GEMM slower than single threaded one?

半阙折子戏 2020-12-21 02:45

I wrote some naive multi-threaded GEMM code, and I am wondering why it is much slower than the equivalent single-threaded GEMM code.

With a 200x200 matrix, Single Threaded: 7ms, Mu

3 answers
  •  囚心锁ツ
    2020-12-21 03:46

    In general, multi-threading pays off for tasks that take a long time to run, ideally because they are computationally heavy rather than because they wait on device access. The loop you showed us finishes too quickly for parallelization to be effective.

    Remember that thread creation carries considerable overhead. Synchronization adds some overhead as well, though significantly less.
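
The question's code is not shown here, so the following is only a minimal C++ sketch of how the overhead described above might be kept small. It assumes square row-major matrices stored in flat vectors; the names `Matrix`, `gemm_rows`, `gemm_single`, and `gemm_threaded` are hypothetical and chosen for illustration. Each thread is created once per multiplication and handles a contiguous block of rows, so the creation cost is paid only `num_threads` times, and because the row blocks are disjoint no locking is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumption for this sketch: n x n matrices, row-major, in a flat vector.
using Matrix = std::vector<double>;

// Compute rows [row_begin, row_end) of C = A * B with the naive triple loop.
void gemm_rows(const Matrix& a, const Matrix& b, Matrix& c,
               std::size_t n, std::size_t row_begin, std::size_t row_end) {
    for (std::size_t i = row_begin; i < row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

// Single-threaded baseline: one call covers every row.
void gemm_single(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    gemm_rows(a, b, c, n, 0, n);
}

// Threaded variant: at most num_threads threads, each owning a disjoint
// block of rows. Threads are created once per call, not once per row.
void gemm_threaded(const Matrix& a, const Matrix& b, Matrix& c,
                   std::size_t n, std::size_t num_threads) {
    assert(num_threads > 0);
    const std::size_t rows_per_thread = (n + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(n, t * rows_per_thread);
        const std::size_t end = std::min(n, begin + rows_per_thread);
        if (begin == end) {
            break;  // no rows left; avoid spawning idle threads
        }
        workers.emplace_back([&a, &b, &c, n, begin, end] {
            gemm_rows(a, b, c, n, begin, end);
        });
    }
    // Joining is the only synchronization: each thread writes its own rows.
    for (auto& worker : workers) {
        worker.join();
    }
}
```

A caller might pass `std::thread::hardware_concurrency()` as `num_threads` (falling back to 1 if it returns 0). Even with this structure, a matrix as small as 200x200 may not contain enough work to outweigh the cost of starting the threads, so it is worth timing both variants at several sizes before drawing conclusions.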
