How to use multiple CPU cores to train NNs using caffe and OpenBLAS

Submitted by 独自空忆成欢 on 2019-12-18 13:35:06

Question


I recently started learning deep learning, and a friend recommended caffe. After installing it with OpenBLAS, I followed the MNIST tutorial in the docs. But I found that training was very slow and only one CPU core was working.

The problem is that the servers in my lab don't have GPUs, so I have to use CPUs instead.

I Googled this and found pages like this one. I tried export OPENBLAS_NUM_THREADS=8 and export OMP_NUM_THREADS=8, but caffe still used only one core.

How can I make caffe use multiple CPU cores?

Many thanks.


Answer 1:


@Karthik: that also works for me. One interesting finding is that using 4 threads cuts the forward/backward (f/b) pass time in the caffe timing test by about a factor of 2. However, increasing the thread count to 8 or even 24 gives f/b times that are slower than what I get with OPENBLAS_NUM_THREADS=4. Here are the times for a few thread counts (tested on the NetworkInNetwork model).

#threads    f/b time (ms)
1           223
2           150
4           113
8           125
12          144

For comparison, on a Titan X GPU the f/b pass took 1.87 ms.
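
To make the scaling in the timings above easier to read, here is a small Python sketch that turns them into speedup and parallel efficiency figures. The numbers are copied from the table; the function names are only illustrative and are not part of caffe or OpenBLAS.

```python
# Forward/backward times in ms per thread count, copied from the table above.
FB_TIME_MS = {1: 223, 2: 150, 4: 113, 8: 125, 12: 144}


def speedup(times):
    """Speedup of each thread count relative to the single-thread run."""
    baseline = times[1]
    return {n: baseline / t for n, t in times.items()}


def efficiency(times):
    """Parallel efficiency: speedup divided by the number of threads."""
    return {n: s / n for n, s in speedup(times).items()}


def fastest(times):
    """Thread count with the lowest forward/backward time."""
    return min(times, key=times.get)


if __name__ == "__main__":
    eff = efficiency(FB_TIME_MS)
    for n, s in speedup(FB_TIME_MS).items():
        print(f"{n:>2} threads: speedup {s:.2f}x, efficiency {eff[n]:.0%}")
    print("fastest thread count:", fastest(FB_TIME_MS))
```

Efficiency falls as threads are added, and past 4 threads the absolute time gets worse as well, so it is worth measuring a few thread counts on your own machine rather than simply using every core.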




Answer 2:


When building OpenBLAS, set the flag USE_OPENMP=1 to enable OpenMP support. Next, set Caffe to use OpenBLAS in Makefile.config. Then, at runtime, export the number of threads you want by setting OMP_NUM_THREADS=n, where n is the number of threads.
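
As a rough illustration of the runtime step, the following Python sketch prepares an environment with the thread variables set and builds a `caffe time` command line. The binary path `./build/tools/caffe` and the helper names are assumptions made for this example, so adjust them to your own build. Setting OPENBLAS_NUM_THREADS alongside OMP_NUM_THREADS is also a choice made here so that the sketch covers both OpenMP and non-OpenMP builds of OpenBLAS.

```python
import os
import subprocess


def threaded_env(num_threads, base_env=None):
    """Return a copy of the environment with the thread-count variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return env


def caffe_time_command(model, iterations=10, caffe_bin="./build/tools/caffe"):
    """Build the argument list for a caffe timing run (the path is an assumption)."""
    return [caffe_bin, "time", f"--model={model}", f"--iterations={iterations}"]


def run_timing(model, num_threads):
    """Run the timing test with the given thread count; needs a built caffe binary."""
    return subprocess.run(
        caffe_time_command(model), env=threaded_env(num_threads), check=True
    )
```

With a working build, something like `run_timing("examples/mnist/lenet_train_test.prototxt", 4)` would launch the timing test with 4 threads. The variables have to be in the environment before caffe starts, which is why they are passed to the child process rather than set afterwards.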




Answer 3:


I found that this method works:

When you build caffe, pass the job count to your make command. For 8 cores, use make all -j8 and make pycaffe -j8.

Also, make sure OPENBLAS_NUM_THREADS=8 is set.
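
The two settings in this answer act at different stages: `-j8` tells make how many compile jobs to run in parallel, while OPENBLAS_NUM_THREADS is read when the program runs. A minimal Python sketch of that split, with hypothetical helper names, might look like this:

```python
def build_commands(jobs):
    """Make invocations from the answer above; -jN only parallelizes the build."""
    return [["make", "all", f"-j{jobs}"], ["make", "pycaffe", f"-j{jobs}"]]


def runtime_threads(env):
    """Return OPENBLAS_NUM_THREADS from env as an int, or None if unset or invalid."""
    value = env.get("OPENBLAS_NUM_THREADS", "")
    return int(value) if value.isdigit() else None


if __name__ == "__main__":
    import os

    for cmd in build_commands(8):
        print("build step:", " ".join(cmd))
    print("runtime threads:", runtime_threads(os.environ))
```

If `runtime_threads(os.environ)` returns None in the shell where caffe is launched, the variable was not exported there, which is one possible reason for still seeing a single busy core.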

This question has a full script for the same setup.



Source: https://stackoverflow.com/questions/30195837/how-to-use-multi-cpu-cores-to-train-nns-using-caffe-and-openblas
