System environment: Ubuntu 16.04.4
Installation steps
- Pull the container image nvidia/cuda:10.1-cudnn7-devel from Docker Hub:
$ docker pull nvidia/cuda:10.1-cudnn7-devel
10.1-cudnn7-devel: Pulling from nvidia/cuda
7ddbc47eeb70: Already exists
c1bbdc448b72: Already exists
8c3b70e39044: Already exists
45d437916d57: Already exists
d8f1569ddae6: Pull complete
85386706b020: Pull complete
ee9b457b77d0: Pull complete
be4f3343ecd3: Pull complete
30b4effda4fd: Pull complete
b398e882f414: Pull complete
Digest: sha256:557de4ba2cb674029ffb602bed8f748d44d59bb7db9daa746ea72a102406d3ec
Status: Downloaded newer image for nvidia/cuda:10.1-cudnn7-devel
docker.io/nvidia/cuda:10.1-cudnn7-devel

# Create a new Dockerfile
$ vi Dockerfile
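(Optional) Before writing the Dockerfile, it may be worth confirming that GPU passthrough works at all with the freshly pulled base image. This quick check is a suggestion, not part of the original walkthrough; it assumes Docker 19.03+ with the NVIDIA Container Toolkit installed on the host:

# Run nvidia-smi inside the base image; it should list the host GPUs.
$ docker run --gpus all --rm nvidia/cuda:10.1-cudnn7-devel nvidia-smi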
- Create the Dockerfile with the following content:
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive

RUN apt-get update && apt-get install -y \
    python3-opencv ca-certificates python3-dev git wget sudo && \
    rm -rf /var/lib/apt/lists/*

# create a non-root user
ARG USER_ID=1000
RUN useradd -m --no-log-init --system --uid ${USER_ID} leaf -g sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
USER leaf
WORKDIR /home/leaf

ENV PATH="/home/leaf/.local/bin:${PATH}"
RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py --user && \
    rm get-pip.py

# install dependencies
# See https://pytorch.org/ for other options if you use a different version of CUDA
RUN pip install --user torch torchvision tensorboard cython -i https://pypi.tuna.tsinghua.edu.cn/simple
RUN pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
RUN pip install --user 'git+https://github.com/facebookresearch/fvcore'

# install detectron2
RUN git clone https://github.com/facebookresearch/detectron2 detectron2_repo
ENV FORCE_CUDA="1"
# This will build detectron2 for all common cuda architectures and take a lot more time,
# because inside `docker build`, there is no way to tell which architecture will be used.
ENV TORCH_CUDA_ARCH_LIST="Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing"
RUN pip install --user -e detectron2_repo

# Set a fixed model cache directory.
ENV FVCORE_CACHE="/tmp"
WORKDIR /home/leaf/detectron2_repo
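The TORCH_CUDA_ARCH_LIST value above makes detectron2 compile CUDA kernels for every common architecture, which is what makes the build slow. If the image will only ever run on known hardware, a narrower value is a possible shortcut; the line below is a suggested substitution for the Pascal-generation GTX 1080 Ti cards used later in this post, not part of the original Dockerfile:

# Compile only for Pascal GPUs (e.g. GTX 1080 Ti); adjust to match your own hardware.
ENV TORCH_CUDA_ARCH_LIST="Pascal"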
- Build the detectron2 image from the Dockerfile:
# The -t option sets the name of the image to build; "." means the Dockerfile is in the
# current directory (an absolute path to the Dockerfile can also be given).
$ docker build -t detectron2 .
...
Successfully installed Pillow-6.2.2 cloudpickle-1.2.2 detectron2 tabulate-0.8.6
Removing intermediate container 34080cfc4186
 ---> 1f7f046d0540
Step 18/19 : ENV FVCORE_CACHE="/tmp"
 ---> Running in 1e604b777530
Removing intermediate container 1e604b777530
 ---> 5b5496be4934
Step 19/19 : WORKDIR /home/leaf/detectron2_repo
 ---> Running in 9a38d8cb57d4
Removing intermediate container 9a38d8cb57d4
 ---> 22e3d06ee1a9
Successfully built 22e3d06ee1a9
Successfully tagged detectron2:latest

# Check that the custom image was built successfully
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
detectron2          latest              22e3d06ee1a9        6 minutes ago       6.74GB
...
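If the container is expected to write files into a directory shared with the host, it can help to give the container user the same UID as your host account. The Dockerfile above already declares ARG USER_ID=1000, so this is just a suggested way of overriding it at build time:

# Build with the container user "leaf" mapped to the current host UID.
$ docker build --build-arg USER_ID=$(id -u) -t detectron2 .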
- Test whether the environment was set up successfully:
# Server 164: nvcc reports CUDA 10.1 while nvidia-smi reports CUDA 10.2;
# whether this mismatch causes any problems is not yet known.
s164@ml_2:~/shared_dir$ docker run --gpus all -it detectron2 /bin/bash
leaf@a81cf98eac7e:~/detectron2_repo$ nvidia-smi
Thu Jan  9 03:32:14 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 25%   39C    P0    59W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 19%   32C    P0    53W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
leaf@a81cf98eac7e:~/detectron2_repo$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
leaf@a81cf98eac7e:~/detectron2_repo$ ls
GETTING_STARTED.md  README.md  demo                 docker     setup.py
INSTALL.md          build      detectron2           docs       tests
LICENSE             configs    detectron2.egg-info  projects   tools
MODEL_ZOO.md        datasets   dev                  setup.cfg

# Download a test image
leaf@a81cf98eac7e:~/detectron2_repo$ wget http://images.cocodataset.org/val2017/000000439715.jpg -O input.jpg

# Run the test
leaf@a81cf98eac7e:~/detectron2_repo$ python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch, torchvision
>>> import detectron2
Failed to load OpenCL runtime
>>> from detectron2.utils.logger import setup_logger
>>> setup_logger()
<Logger detectron2 (DEBUG)>
>>>
>>> # import some common libraries
... import numpy as np
>>> import cv2
>>> import random
>>> from detectron2 import model_zoo
>>> from detectron2.engine import DefaultPredictor
>>> from detectron2.config import get_cfg
>>> from detectron2.utils.visualizer import Visualizer
>>> from detectron2.data import MetadataCatalog
>>> im = cv2.imread("./input.jpg")
>>> im.shape
(480, 640, 3)
>>> cfg = get_cfg()
>>> # add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
... cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
>>> cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
>>> # Find a model from detectron2's model zoo. You can either use the https://dl.fbaipublicfiles.... url, or use the detectron2:// shorthand
... cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
>>> predictor = DefaultPredictor(cfg)
>>> outputs = predictor(im)
>>> outputs["instances"].pred_classes
tensor([17,  0,  0,  0,  0,  0,  0,  0, 25,  0, 25, 25,  0,  0, 24], device='cuda:0')
>>> outputs["instances"].pred_boxes
Boxes(tensor([[126.6035, 244.8977, 459.8291, 480.0000],
        [251.1083, 157.8127, 338.9731, 413.6379],
        [114.8496, 268.6864, 148.2352, 398.8111],
        [  0.8217, 281.0327,  78.6072, 478.4210],
        [ 49.3954, 274.1229,  80.1545, 342.9808],
        [561.2248, 271.5816, 596.2755, 385.2552],
        [385.9072, 270.3125, 413.7130, 304.0397],
        [515.9295, 278.3744, 562.2792, 389.3802],
        [335.2409, 251.9167, 414.7491, 275.9375],
        [350.9300, 269.2060, 386.0984, 297.9081],
        [331.6292, 230.9996, 393.2759, 257.2009],
        [510.7349, 263.2656, 570.9865, 295.9194],
        [409.0841, 271.8646, 460.5582, 356.8722],
        [506.8767, 283.3257, 529.9403, 324.0392],
        [594.5663, 283.4820, 609.0577, 311.4124]], device='cuda:0'))
>>> v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
>>> v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
>>> v.save('pred_result.png')
>>> exit()  # exit Python
leaf@a81cf98eac7e:~/detectron2_repo$
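For repeatability, the interactive session above can be condensed into a small standalone script. The sketch below follows the same API calls as the session; the file name run_demo.py is my own choice and is not part of the original post:

# run_demo.py -- minimal, non-interactive version of the test session above.
import cv2

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

# Test image downloaded earlier with wget.
im = cv2.imread("./input.jpg")

# Same config and weights as in the interactive session.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"

# Run inference and print the predicted classes.
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
print(outputs["instances"].pred_classes)

# Draw the predictions and save them, as in the session.
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
v.save("pred_result.png")

Run it inside the container with python3 run_demo.py.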
Changes made inside a Docker container (downloaded files, installed software) are not saved to the image and are lost once the container is removed. To preserve them, use the docker commit command:
# Open a second terminal for this; do not exit the terminal in which the container is running, or the changes will be lost. The container ID of the running container above is a81cf98eac7e.
s164@ml_2:~$ docker commit --help
Usage: docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]
Create a new image from a container's changes
Options:
  -a, --author string    Author (e.g., "John Hannibal Smith <hannibal@a-team.com>")
  -c, --change list      Apply Dockerfile instruction to the created image
  -m, --message string   Commit message
  -p, --pause            Pause container during commit (default true)
s164@ml_2:~$ docker commit -m="vim installed" -a="s164" a81cf98eac7e detectron2
sha256:7ff9a53a9f656fc30cf50f9b5b04ddadebfc9e57c81021ad07c0a354880e4b83
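A new container started from the committed detectron2 image will then contain the preserved changes. An alternative worth knowing about (not part of the original walkthrough) is to bind-mount a host directory into the container, so results survive without docker commit; the path below reuses the host's shared_dir from the session above:

# Start a fresh container from the committed image.
$ docker run --gpus all -it detectron2 /bin/bash

# Alternative: bind-mount a host directory so outputs persist without docker commit.
$ docker run --gpus all -it -v ~/shared_dir:/home/leaf/shared_dir detectron2 /bin/bash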
Since images cannot be viewed directly on the server, copy the result image from the Docker container to the host to view it.
# Copy a file from the Docker container to the host
s164@ml_2:~/shared_dir$ docker cp a81cf98eac7e:/home/leaf/detectron2_repo/pred_result.png Documents
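docker cp also works in the opposite direction, which is convenient for feeding your own test images into the running container; the file name below is only an illustration:

# Copy a file from the host into the running container (my_image.jpg is a hypothetical file).
$ docker cp Documents/my_image.jpg a81cf98eac7e:/home/leaf/detectron2_repo/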
Common errors
(1) While building the image, network problems can cause errors such as:
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
The command '/bin/sh -c pip install --user 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'' returned a non-zero code: 2
Solution: re-run the docker build command.
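Docker caches layers that completed successfully, so re-running the same command normally resumes from the step that failed rather than starting from scratch:

# Re-run the build; previously completed layers are taken from the build cache.
$ docker build -t detectron2 .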
(2) OpenCL warning:
Failed to load OpenCL runtime
This is an issue within OpenCL itself and is harmless; with opencv >= 3.4 the message no longer appears.
(3) Caused by starting the container without the GPU option:
AssertionError: Found no NVIDIA driver on your system.
Solution: docker run --gpus all -it [IMAGE]
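To confirm from inside the container that the GPUs are actually visible to PyTorch (a suggested check, not part of the original text):

# Should print "True" and the number of visible GPUs.
$ python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"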