1. Docker run commands:
docker run --runtime=nvidia -dit --name=my-develop --publish=39822:22 --volume=/home/my/remote_develop:/remote_develop --restart=always euleros-cuda-py373:0.1.0 /bin/bash
docker exec -it my-develop /bin/bash
# Press Ctrl+P, then Ctrl+Q to detach without stopping the container (mnemonic: p, q, as in "pi qiu")
2. Testing the GPU:
# Start a container from the image
nvidia-docker run -it mxnet/python:gpu bash
# Start Python
python
import mxnet as mx
a = mx.nd.ones((2, 3), mx.gpu())
b = a * 2 + 1
b.asnumpy()
References:
https://zhuanlan.zhihu.com/p/27114995
https://blog.51cto.com/5249302/2359420
Deep learning environment setup with docker + pycharm + GPU
https://blog.csdn.net/Ryanpinwei/article/details/78806052
https://cloud.tencent.com/developer/article/1422566
Error:
nvidia-docker run -it mxnet/python:gpu bash
Description:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request\\\\n\\\"\"": unknown.
ERRO[0001] error waiting for container: context canceled
Fix:
# As root, register the driver library directory with the dynamic linker
sudo su
echo /var/IEF/nvidia/lib64/ > /etc/ld.so.conf.d/nvidia_driver.conf
ldconfig -v
exit
# Open up /var, /var/IEF, and /var/IEF/nvidia
sudo chmod 777 /var
sudo chmod 777 /var/IEF
sudo chmod 777 /var/IEF/nvidia
# Dry run: print the chmod commands for the first-level directories without executing them
sudo ls /var/IEF/* -ld|grep -E "^d"|awk '{print "sudo chmod 777 "$9}'
# Apply chmod 777 to the directories one to four levels below /var/IEF
sudo ls /var/IEF/* -ld|grep -E "^d"|awk '{print "sudo chmod 777 "$9}'|sh
sudo ls /var/IEF/*/* -ld|grep -E "^d"|awk '{print "sudo chmod 777 "$9}'|sh
sudo ls /var/IEF/*/*/* -ld|grep -E "^d"|awk '{print "sudo chmod 777 "$9}'|sh
sudo ls /var/IEF/*/*/*/* -ld|grep -E "^d"|awk '{print "sudo chmod 777 "$9}'|sh
# Set the entries whose mode string starts with -rwxr-x to 755
sudo ls -ld /var/IEF/nvidia/* |grep "\-rwxr\-x"|awk '{print "sudo chmod 755 " $9}'|sh
sudo ls -ld /var/IEF/nvidia/drivers/* |grep "\-rwxr\-x"|awk '{print "sudo chmod 755 " $9}'|sh
sudo ls -ld /var/IEF/nvidia/bin/* |grep "\-rwxr\-x"|awk '{print "sudo chmod 755 " $9}'|sh
sudo ls -ld /var/IEF/nvidia/lib64/* |grep "\-rwxr\-x"|awk '{print "sudo chmod 755 " $9}'|sh
sudo ls -ld /var/IEF/nvidia/lib64/vdpau/* |grep "\-rwxr\-x"|awk '{print "sudo chmod 755 " $9}'|sh
# Check the result: -k loads the kernel modules, -d /dev/tty sends debug output to the terminal
nvidia-container-cli -k -d /dev/tty info
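The level-by-level chmod 777 commands above could probably be collapsed into a single find call. The following is only a sketch: open_dirs is a hypothetical helper name, it assumes a find that supports -maxdepth (GNU and BSD find do), and unlike the ls globs it also touches hidden directories.

```shell
#!/bin/sh
# open_dirs ROOT [DEPTH]: chmod 777 on ROOT and on every directory up to
# DEPTH levels below it (default 4, the depth covered by the notes above).
# Hypothetical helper, not part of the original notes.
open_dirs() {
    root=$1
    depth=${2:-4}
    # -type d skips regular files and symlinks, like the grep "^d" filter
    find "$root" -maxdepth "$depth" -type d -exec chmod 777 {} +
}

# Example, from a root shell on the host:
# open_dirs /var/IEF
```

Passing the depth as a second argument keeps the sketch usable if the driver tree turns out to be deeper than four levels.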
Description:
docker run --runtime=nvidia nvidia/cuda:9.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.
ERRO[0001] error waiting for container: context canceled
Fix:
# Mount the host driver directory into the container at /usr/local/nvidia
docker run --runtime=nvidia -v /var/IEF/nvidia:/usr/local/nvidia nvidia/cuda:9.0-base nvidia-smi
Description:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:346: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: stat failed: /dev/nvidia-modeset: no such file or directory\\\\n\\\"\"": unknown.
ERRO[0001] error waiting for container: context canceled
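Fix:
When calling docker run, set the NVIDIA_DRIVER_CAPABILITIES parameter, and make sure its value does not include display.
NVIDIA_DRIVER_CAPABILITIES
This option controls which driver libraries/binaries will be mounted inside the container.
Possible values
compute,video,graphics,utility…: a comma-separated list of the driver capabilities the container needs.
all: enable all available driver capabilities.
empty or unset: use the default driver capability: utility.
Supported driver capabilities
compute: required for CUDA and OpenCL applications.
compat32: required for running 32-bit applications.
graphics: required for running OpenGL and Vulkan applications.
utility: required for using nvidia-smi and NVML.
video: required for using the Video Codec SDK.
display: required for using an X11 display.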
docker run --env NVIDIA_DRIVER_CAPABILITIES=compute,graphics,utility,video
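If the capability list comes from somewhere else (a script variable, for instance), display can be filtered out before the value is passed to docker run. A minimal sketch, assuming a POSIX shell; strip_display is a hypothetical helper name, and the image and command in the commented docker run line are the ones used earlier in these notes.

```shell
#!/bin/sh
# strip_display LIST: print the comma-separated capability LIST with any
# "display" entry removed. Hypothetical helper, not part of the original notes.
strip_display() {
    printf '%s' "$1" | tr ',' '\n' | grep -vx 'display' | paste -sd ',' -
}

caps=$(strip_display "compute,graphics,utility,video,display")
echo "$caps"   # prints: compute,graphics,utility,video

# The filtered value can then be handed to docker run, for example:
# docker run --runtime=nvidia --env NVIDIA_DRIVER_CAPABILITIES="$caps" nvidia/cuda:9.0-base nvidia-smi
```

grep -x matches whole lines only, so an entry that merely contains the word display would be left alone.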
Source: oschina
Link: https://my.oschina.net/mengyoufengyu/blog/3159287