1 Install Ubuntu 18.04.3 LTS
spt@spt-ts:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
Codename:       bionic
spt@spt-ts:~$ df -ah
Filesystem      Size  Used Avail Use% Mounted on
udev            3.9G     0  3.9G   0% /dev
tmpfs           794M  1.9M  792M   1% /run
/dev/sda6       111G  5.5G  100G   6% /
/dev/sda1       454M  112M  315M  27% /boot
/dev/sdb1       916G  142M  870G   1% /home
# swap was set to 6 GB
I took a desktop machine, wiped the whole disk, and did a fresh install of Ubuntu 18.04.3 LTS.
2 Install the NVIDIA graphics driver
spt@spt-ts:~$ lspci | grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 950] (rev a1)
GPU: GTX 950. The driver version has to meet the requirements of the CUDA version.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
ubuntu-drivers devices
sudo apt install xserver-xorg-core
sudo ubuntu-drivers autoinstall
This installed the latest graphics driver.
Check the result of the driver installation:
spt@spt-ts:~$ nvidia-smi
Fri Sep  6 10:50:46 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950     Off  | 00000000:01:00.0  On |                  N/A |
| 32%   41C    P8    10W / 105W |    207MiB /  2000MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       974      G   /usr/lib/xorg/Xorg                            13MiB |
|    0      1036      G   /usr/bin/gnome-shell                          48MiB |
|    0      1382      G   /usr/lib/xorg/Xorg                            70MiB |
|    0      1509      G   /usr/bin/gnome-shell                          71MiB |
+-----------------------------------------------------------------------------+
spt@spt-ts:~$
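When this check has to be repeated on several machines, the two version numbers in the nvidia-smi banner can be pulled out with a small script. The sketch below is only an illustration: the function name `parse_smi_header` and the sample string are my own, and the regular expression assumes the banner layout shown above. Note that the "CUDA Version" in this banner is the highest CUDA version the driver supports, not the version of any installed CUDA Toolkit.

```python
import re

# Banner line copied from the nvidia-smi output above.
SAMPLE_BANNER = "| NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |"


def parse_smi_header(text):
    """Return (driver_version, cuda_version) as strings, or None if not found.

    Hypothetical helper: it assumes the banner layout printed by the
    nvidia-smi release shown above.
    """
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(parse_smi_header(SAMPLE_BANNER))
```

On a real machine the text could come from `subprocess.run(["nvidia-smi"], ...)` instead of the sample string.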
3 Install vim and the SSH service
Not needed for the project itself; I mainly want to connect to this machine over SSH.
sudo apt install vim openssh-server
4 Install CUDA v10.0
First, following the official TensorFlow guide, check version compatibility.
https://tensorflow.google.cn/install/source The latest TensorFlow release, 1.14.0, corresponds to CUDA 10.0 and cuDNN 7.4.
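Since different projects will need different version combinations later, it can help to keep the tested combinations in one place and check an installation against them. The following is a minimal sketch, not an official tool: the table holds only the single row quoted above (other rows would have to be copied from the TensorFlow page), and the names `TESTED_BUILDS`, `major_minor`, and `matches_tested_build` are hypothetical.

```python
# Only the combination quoted above is recorded here; extend the table by
# hand from the TensorFlow "build from source" page if more rows are needed.
TESTED_BUILDS = {
    "1.14.0": {"cuda": "10.0", "cudnn": "7.4"},
}


def major_minor(version):
    """Reduce a version such as '10.0.130' to its 'major.minor' part."""
    return ".".join(version.split(".")[:2])


def matches_tested_build(tf_version, cuda_version, cudnn_version):
    """Return True if CUDA and cuDNN match the recorded row for tf_version.

    Only major.minor is compared, so CUDA 10.0.130 counts as 10.0.
    Raises KeyError when the TensorFlow version is not in the table.
    """
    row = TESTED_BUILDS[tf_version]
    return (
        major_minor(cuda_version) == row["cuda"]
        and major_minor(cudnn_version) == row["cudnn"]
    )


if __name__ == "__main__":
    # The versions installed in this post.
    print(matches_tested_build("1.14.0", "10.0.130", "7.4.2"))
```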
1. Download and Run `sudo sh cuda_10.0.130_410.48_linux.run`
2. Download and Run Patch 1 (Released May 10, 2019)
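Also take note of how to uninstall. Later tests of different projects will need different versions, so it is quite likely that I will have to uninstall this one and install another.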
..............................................
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
uninstall_cuda_10.0.pl
After installation:
spt@spt-ts:~$ df -ah
Filesystem      Size  Used Avail Use% Mounted on
sysfs              0     0     0    - /sys
proc               0     0     0    - /proc
udev            3.9G     0  3.9G   0% /dev
devpts             0     0     0    - /dev/pts
tmpfs           794M  2.0M  792M   1% /run
/dev/sda6       111G   12G   94G  11% /
Set the environment variables by appending the following to the end of /etc/profile or ~/.bashrc:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
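The `${VAR:+:${VAR}}` part of these lines is easy to misread. It expands to a colon followed by the old value only when the variable is already set and non-empty, so an empty variable does not leave a trailing colon (an empty entry that the shell and the loader treat as the current directory). The Python sketch below mimics that rule purely to illustrate it; `prepend_path` is a hypothetical name, not part of any tool used here.

```python
def prepend_path(new_dir, current):
    """Mimic the shell expansion NEW_DIR${VAR:+:${VAR}}.

    `current` is the old value of the variable, or None/'' when it is unset
    or empty. The colon is added only when there is an old value to keep.
    """
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    # LD_LIBRARY_PATH is often unset on a fresh system: no trailing colon.
    print(prepend_path("/usr/local/cuda/lib64", None))
    # PATH normally has a value: the new directory goes in front of it.
    print(prepend_path("/usr/local/cuda/bin", "/usr/bin:/bin"))
```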
5 Install cuDNN v7.4.2
Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0
The version must match the CUDA version installed above.
# Download the following files, listed under "Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0"
#cuDNN Library for Linux ---> cudnn-10.0-linux-x64-v7.4.2.24.tgz
#cuDNN Runtime Library for Ubuntu18.04 (Deb)
#cuDNN Developer Library for Ubuntu18.04 (Deb)
#cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb)
Extract and install cuDNN:
spt@spt-ts:~/work/tensorflow$ pwd
/home/spt/work/tensorflow
spt@spt-ts:~/work/tensorflow$ tar xvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
spt@spt-ts:~/work/tensorflow$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
spt@spt-ts:~/work/tensorflow$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
spt@spt-ts:~/work/tensorflow$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
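One way to confirm which cuDNN version actually ended up under /usr/local/cuda is to read the version macros from the copied header. This is a sketch under assumptions: for cuDNN 7.x the macros `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` are defined in cudnn.h, the sample text below stands in for the real file, and `cudnn_version` is a hypothetical helper name.

```python
import re

# Stand-in for the relevant lines of /usr/local/cuda/include/cudnn.h
# (assumed content for the cuDNN 7.4.2 release installed above).
SAMPLE_CUDNN_H = """\
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
"""


def cudnn_version(header_text):
    """Return 'major.minor.patch' from cudnn.h text, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    print(cudnn_version(SAMPLE_CUDNN_H))
```

To check the real installation, pass the contents of /usr/local/cuda/include/cudnn.h instead of the sample text.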
6 Install pip3 and virtualenv
# The system ships with Python 3.6, the latest supported version, by default
sudo apt install python3-pip python3-dev python-virtualenv
7 Install TensorFlow-GPU v1.14.0
spt@spt-ts:~/work/tensorflow$ pwd
/home/spt/work/tensorflow
spt@spt-ts:~/work/tensorflow$ mkdir tsenv
spt@spt-ts:~/work/tensorflow$ virtualenv -p python3 tsenv
spt@spt-ts:~/work/tensorflow$ cd tsenv/
spt@spt-ts:~/work/tensorflow/tsenv$ source bin/activate
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv$ pip3 install --index-url http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com --upgrade tensorflow-gpu
# The command above downloads tensorflow-gpu from the Aliyun (Alibaba) mirror in China
# Alternatively, use the Douban mirror
pip3 install --index-url http://pypi.douban.com/simple --trusted-host pypi.douban.com --upgrade tensorflow-gpu
# Check the installation
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv$ pip3 show tensorflow-gpu
Name: tensorflow-gpu
Version: 1.14.0
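`pip3 show` only confirms that the package is installed. Whether TensorFlow can actually see the GPU can be checked from Python inside the activated virtualenv. The sketch below is one possible check, not something run in this post: it uses `tf.test.gpu_device_name()`, which returns an empty string when no GPU is visible, and the wrapper `visible_gpu_name` is a hypothetical name that also tolerates TensorFlow not being importable.

```python
def visible_gpu_name():
    """Return the GPU device name TensorFlow reports.

    Returns '' when TensorFlow finds no GPU, and None when TensorFlow
    itself cannot be imported (for example outside the virtualenv).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.test.gpu_device_name()


if __name__ == "__main__":
    name = visible_gpu_name()
    if name is None:
        print("TensorFlow is not importable in this environment")
    elif name == "":
        print("TensorFlow imported, but no GPU device is visible")
    else:
        print("TensorFlow sees GPU device:", name)
```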
# Test
(tsenv) spt@spt-ts:~/work/tensorflow/tsenv/src$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ sudo make
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 803
-> system has unsupported display driver / cuda driver combination
Result = FAIL
# Conclusion: after installing the driver and CUDA, the machine needs a reboot into the desktop environment. Test again:
(tsenv) spt@spt-ts:/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 950"
  CUDA Driver Version / Runtime Version          10.1 / 10.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 2001 MBytes (2098069504 bytes)
  ( 6) Multiprocessors, (128) CUDA Cores/MP:     768 CUDA Cores
  GPU Max Clock rate:                            1304 MHz (1.30 GHz)
  Memory Clock rate:                             3305 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 1048576 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.1, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS
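The line worth reading in this output is "CUDA Driver Version / Runtime Version 10.1 / 10.0": the CUDA version supported by the driver has to be at least the version of the installed runtime, which holds here (10.1 vs 10.0). The sketch below shows how that comparison could be automated. It is an illustration only; `driver_supports_runtime` is a hypothetical helper and the regular expression assumes the deviceQuery line format shown above.

```python
import re

# Line copied from the deviceQuery output above.
SAMPLE_LINE = "CUDA Driver Version / Runtime Version          10.1 / 10.0"


def driver_supports_runtime(text):
    """Return True if the driver's CUDA version >= the runtime version.

    Returns None when the deviceQuery version line is not found in `text`.
    """
    match = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+)\.(\d+)\s*/\s*(\d+)\.(\d+)",
        text,
    )
    if match is None:
        return None
    driver = (int(match.group(1)), int(match.group(2)))
    runtime = (int(match.group(3)), int(match.group(4)))
    # Tuples compare element by element, so (10, 1) >= (10, 0) is True.
    return driver >= runtime


if __name__ == "__main__":
    print(driver_supports_runtime(SAMPLE_LINE))
```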
8 The environment setup is now complete
Further tests are still to be done.