TensorFlow won't build with CUDA support

為{幸葍}努か submitted on 2019-12-07 22:59:57

Question


I've tried building TensorFlow from source as described in the installation guide. The build succeeds with CPU-only support and with the SIMD instruction sets enabled, but it fails when I try to build with CUDA support.

System information:

  • Mint 18 Sarah
  • 4.4.0-21-generic
  • gcc 5.4.0
  • clang 3.8.0
  • Python 3.6.1
  • Nvidia GeForce GTX 1060 6GB (Compute capability 6.1)
  • CUDA 8.0.61
  • cuDNN 6.0

Here's my attempt at building with CUDA, gcc, and SIMD:

kevin@yeti-mint ~/src/tensorflow $ bazel clean
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
kevin@yeti-mint ~/src/tensorflow $ ./configure 
You have bazel 0.5.2 installed.
Please specify the location of python. [Default is /home/kevin/.pyenv/shims/python]: 
Found possible Python library paths:
  /home/kevin/.pyenv/versions/tensorflow/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is [/home/kevin/.pyenv/versions/tensorflow/lib/python3.6/site-packages]
/home/kevin/.pyenv/versions/3.6.1/lib/python3.6
Do you wish to build TensorFlow with MKL support? [y/N] 
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Do you wish to use jemalloc as the malloc implementation? [Y/n] 
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] 
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] 
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] 
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N] 
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N] 
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N] 
nvcc will be used as CUDA compiler
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA  toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN  library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "6.1"]: 
Do you wish to build TensorFlow with MPI support? [y/N] 
MPI support will not be enabled for TensorFlow
Configuration finished
kevin@yeti-mint ~/src/tensorflow $ bazel build --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --verbose_failures //tensorflow/tools/pip_package:build_pip_package
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': Use SavedModel Builder instead.
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': Use SavedModel instead.
INFO: Found 1 target...
ERROR: /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/external/protobuf/BUILD:244:1: C++ compilation of rule '@protobuf//:js_embed' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command 
  (cd /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/kevin/.pyenv/shims:/home/kevin/.pyenv/shims:/home/kevin/.pyenv/bin:/home/kevin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/kevin/.local/bin \
    PWD=/proc/self/cwd \
  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -g0 '-std=c++11' -g0 -MD -MF bazel-out/host/bin/external/protobuf/_objs/js_embed/external/protobuf/src/google/protobuf/compiler/js/embed.d '-frandom-seed=bazel-out/host/bin/external/protobuf/_objs/js_embed/external/protobuf/src/google/protobuf/compiler/js/embed.o' -iquote external/protobuf -iquote bazel-out/host/genfiles/external/protobuf -iquote external/bazel_tools -iquote bazel-out/host/genfiles/external/bazel_tools -isystem external/bazel_tools/tools/cpp/gcc3 -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fno-canonical-system-headers -c external/protobuf/src/google/protobuf/compiler/js/embed.cc -o bazel-out/host/bin/external/protobuf/_objs/js_embed/external/protobuf/src/google/protobuf/compiler/js/embed.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 2.
python: can't open file 'external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc': [Errno 2] No such file or directory
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 5.578s, Critical Path: 0.06s
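
The last error line says the Python interpreter could not open the wrapper script at a path relative to the execroot. One way to narrow this down is to check whether that file was generated at all. The following is only a sketch: it assumes that `bazel info execution_root` reports the same execroot that appears in the log, and the helper name `check_wrapper` is made up for illustration (it is not part of TensorFlow or bazel).

```shell
#!/usr/bin/env bash
# Relative path of the wrapper, copied from the error message above.
WRAPPER=external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc

# check_wrapper EXECROOT: report whether the wrapper exists under EXECROOT.
# Hypothetical helper for illustration only.
check_wrapper() {
  local execroot="$1"
  if [ -f "$execroot/$WRAPPER" ]; then
    echo "found: $execroot/$WRAPPER"
  else
    echo "missing: $execroot/$WRAPPER"
    return 1
  fi
}

# Usage, from the TensorFlow source tree (commented out so the snippet
# can be sourced without invoking bazel):
#   check_wrapper "$(bazel info execution_root)"
```

If the file is reported as found, the failure is more likely related to how the relative path is resolved when the script is launched than to a missing file. If it is missing, the CUDA crosstool was probably never generated by `./configure`.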

Turning off all extra flags:

kevin@yeti-mint ~/src/tensorflow $ bazel build --config=opt --verbose_failures //tensorflow/tools/pip_package:build_pip_package
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': Use SavedModel Builder instead.
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': Use SavedModel instead.
INFO: Found 1 target...
ERROR: /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/external/fft2d/BUILD.bazel:21:1: C++ compilation of rule '@fft2d//:fft2d' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command 
  (cd /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/kevin/.pyenv/shims:/home/kevin/.pyenv/shims:/home/kevin/.pyenv/bin:/home/kevin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/kevin/.local/bin \
    PWD=/proc/self/cwd \
  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 -DNDEBUG -ffunction-sections -fdata-sections -g0 -MD -MF bazel-out/host/bin/external/fft2d/_objs/fft2d/external/fft2d/fft/fftsg.d -iquote external/fft2d -iquote bazel-out/host/genfiles/external/fft2d -iquote external/bazel_tools -iquote bazel-out/host/genfiles/external/bazel_tools -isystem external/bazel_tools/tools/cpp/gcc3 -no-canonical-prefixes -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fno-canonical-system-headers -c external/fft2d/fft/fftsg.c -o bazel-out/host/bin/external/fft2d/_objs/fft2d/external/fft2d/fft/fftsg.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 2.
python: can't open file 'external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc': [Errno 2] No such file or directory
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 3.522s, Critical Path: 2.42s

Trying with clang instead:

kevin@yeti-mint ~/src/tensorflow $ ./configure 
You have bazel 0.5.2 installed.
Please specify the location of python. [Default is /home/kevin/.pyenv/shims/python]: 
Found possible Python library paths:
  /home/kevin/.pyenv/versions/tensorflow/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is [/home/kevin/.pyenv/versions/tensorflow/lib/python3.6/site-packages]
/home/kevin/.pyenv/versions/3.6.1/lib/python3.6
Do you wish to build TensorFlow with MKL support? [y/N] 
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: 
Do you wish to use jemalloc as the malloc implementation? [Y/n] 
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] 
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] 
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] 
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N] 
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N] 
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N] y
Clang will be used as CUDA compiler
Please specify which clang should be used as device and host compiler. [Default is /usr/bin/clang]: 
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 
Please specify the location where CUDA  toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN  library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "6.1"]: 
Do you wish to build TensorFlow with MPI support? [y/N] 
MPI support will not be enabled for TensorFlow
Configuration finished
kevin@yeti-mint ~/src/tensorflow $ bazel build --config=opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.2 --verbose_failures //tensorflow/tools/pip_package:build_pip_package
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': Use SavedModel Builder instead.
WARNING: /home/kevin/src/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': Use SavedModel instead.
INFO: Found 1 target...

~1300 lines of build warnings and info...

ERROR: /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/external/nccl_archive/BUILD:33:1: C++ compilation of rule '@nccl_archive//:nccl' failed: clang failed: error executing command 
  (cd /home/kevin/.cache/bazel/_bazel_kevin/b937ae7b9a1087aeb7862ab37155238c/execroot/org_tensorflow && \
  exec env - \
    CLANG_CUDA_COMPILER_PATH=/usr/bin/clang \
    CUDA_TOOLKIT_PATH=/usr/local/cuda \
    CUDNN_INSTALL_PATH=/usr/local/cuda-8.0 \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/home/kevin/.pyenv/shims/python \
    PYTHON_LIB_PATH=/home/kevin/.pyenv/versions/3.6.1/lib/python3.6 \
    TF_CUDA_CLANG=1 \
    TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
    TF_CUDA_VERSION=8.0 \
    TF_CUDNN_VERSION=6 \
    TF_NEED_CUDA=1 \
    TF_NEED_OPENCL=0 \
  /usr/bin/clang '-march=native' -mavx -mavx2 -mfma -msse4.2 '-march=native' -MD -MF bazel-out/local_linux-py3-opt/bin/external/nccl_archive/_objs/nccl/external/nccl_archive/src/reduce.cu.pic.d '-frandom-seed=bazel-out/local_linux-py3-opt/bin/external/nccl_archive/_objs/nccl/external/nccl_archive/src/reduce.cu.pic.o' -iquote external/nccl_archive -iquote bazel-out/local_linux-py3-opt/genfiles/external/nccl_archive -iquote external/local_config_cuda -iquote bazel-out/local_linux-py3-opt/genfiles/external/local_config_cuda -iquote external/bazel_tools -iquote bazel-out/local_linux-py3-opt/genfiles/external/bazel_tools -isystem external/local_config_cuda/cuda -isystem bazel-out/local_linux-py3-opt/genfiles/external/local_config_cuda/cuda -isystem external/local_config_cuda/cuda/include -isystem bazel-out/local_linux-py3-opt/genfiles/external/local_config_cuda/cuda/include -isystem external/bazel_tools/tools/cpp/gcc3 '-std=c++11' -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wno-invalid-partial-specialization -fno-omit-frame-pointer -no-canonical-prefixes -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections '-DCUDA_MAJOR=0' '-DCUDA_MINOR=0' '-DNCCL_MAJOR=0' '-DNCCL_MINOR=0' '-DNCCL_PATCH=0' -Iexternal/nccl_archive/src -O3 -x cuda '-DGOOGLE_CUDA=1' '--cuda-gpu-arch=sm_61' -c bazel-out/local_linux-py3-opt/genfiles/external/nccl_archive/src/reduce.cu.cc -o bazel-out/local_linux-py3-opt/bin/external/nccl_archive/_objs/nccl/external/nccl_archive/src/reduce.cu.pic.o): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
clang: error: Unsupported CUDA gpu architecture: sm_61
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 25.030s, Critical Path: 12.66s
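
With roughly 1300 lines of output before the failure, the decisive clang message is easy to miss. A small sketch for pulling it out of a saved build log follows. The log file name `build.log` and the helper name `unsupported_archs` are assumptions for illustration; the pattern is copied from the message shown above.

```shell
#!/usr/bin/env bash
# unsupported_archs LOGFILE: list each GPU architecture that clang rejected,
# based on the "Unsupported CUDA gpu architecture" message seen in the log.
# Hypothetical helper for illustration only.
unsupported_archs() {
  sed -n 's/^clang: error: Unsupported CUDA gpu architecture: \(sm_[0-9]*\).*/\1/p' "$1" | sort -u
}

# Usage (commented out so the snippet can be sourced without running a build):
#   bazel build --config=opt --verbose_failures \
#     //tensorflow/tools/pip_package:build_pip_package 2>&1 | tee build.log
#   unsupported_archs build.log
```

The `sm_61` value corresponds to the compute capability 6.1 chosen in `./configure` (it shows up as `TF_CUDA_COMPUTE_CAPABILITIES=6.1` and `--cuda-gpu-arch=sm_61` in the failing command). Whether a given clang release accepts that architecture depends on its version, so comparing against a newer clang is one thing worth trying.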

This behavior is consistent on the current master branch (31aa360), r1.2 (5d8c0a6), and r1.1 (8ddd727). I've seen many GitHub issues (8790, 9651, 10367) and a Stack Overflow post or two (here, I tried using gcc/g++ 4.8), but they all seem to be solved and/or slightly unrelated to my problem.

Source: https://stackoverflow.com/questions/44830934/tensorflow-wont-build-with-cuda-support
