could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

故里飘歌 2020-12-01 16:01

I installed the GPU version of TensorFlow 1.0.1 on my MacBook Pro with a GeForce GT 750M, along with CUDA 8.0.71 and cuDNN 5.1. I am running TensorFlow code that works fine with non C

19 Answers
  • 2020-12-01 16:26

    As strange as this may sound, try restarting your computer and rerunning your model. If the model then runs fine, the issue is with your GPU memory allocation and TensorFlow's management of the available memory. On Windows 10 I had two terminals open, and closing one solved my problem. There could be leftover (zombie) processes or threads that are still holding GPU memory.
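
    Before rebooting, it can help to see which processes are actually holding GPU memory. The following is a minimal sketch, assuming `nvidia-smi` is installed and on the PATH (it is not available on every platform); the helper names `parse_compute_apps` and `gpu_processes` are made up for this illustration.

```python
import shutil
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name, used_memory' CSV rows into a list of dicts."""
    apps = []
    for line in csv_text.strip().splitlines():
        try:
            # The pid is the first field and the memory the last one, so a
            # comma inside the process name does not break the split.
            pid, rest = line.split(",", 1)
            name, mem = rest.rsplit(",", 1)
            apps.append({"pid": int(pid), "name": name.strip(),
                         "used_memory": mem.strip()})
        except ValueError:
            continue  # skip rows that do not have the expected shape
    return apps


def gpu_processes():
    """Return processes using the GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_compute_apps(result.stdout)
```

    Calling `gpu_processes()` and printing each entry shows the PIDs to close or kill; if a second terminal or notebook appears in the list, that is probably the one holding the memory.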

  • 2020-12-01 16:26

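    Try this:

    import tensorflow as tf

    # List the physical GPUs that TensorFlow can see (TF 2.x API)
    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)  # allocate GPU memory on demand
    # Cap the first GPU at 1024 MB
    tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])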
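
    A more defensive variant of the same idea, as a sketch: it assumes a TensorFlow 2.x installation (it does not apply to the TF 1.0.1 setup in the question), and the function name `enable_memory_growth` is hypothetical. It also sets the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which TensorFlow reads as another way to request on-demand allocation.

```python
import os

# Must be set before TensorFlow initializes the GPU to have any effect.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


def enable_memory_growth():
    """Enable on-demand GPU memory allocation; return the number of GPUs seen."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0  # TensorFlow is not installed, nothing to configure
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except (RuntimeError, ValueError):
            # TensorFlow refuses the change once the GPU has been
            # initialized, or when it conflicts with another device setting.
            pass
    return len(gpus)
```

    Call `enable_memory_growth()` at the very top of the script or notebook, before building any model, since the setting cannot be changed after the GPU is in use.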
    
    
  • For me, re-running the CUDA installation as described here solved the problem:

    # Add NVIDIA package repository
    sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
    wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
    sudo apt install ./cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
    wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    sudo apt install ./nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
    sudo apt update
    
    # Install CUDA and tools. Include optional NCCL 2.x
    sudo apt install cuda9.0 cuda-cublas-9-0 cuda-cufft-9-0 cuda-curand-9-0 \
        cuda-cusolver-9-0 cuda-cusparse-9-0 libcudnn7=7.2.1.38-1+cuda9.0 \
        libnccl2=2.2.13-1+cuda9.0 cuda-command-line-tools-9-0
    
    

    During the installation, apt-get downgraded cudnn7, which I think was the culprit. It had probably been upgraded accidentally by apt-get upgrade to a version that is incompatible with some other part of the system.
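
    To confirm which cuDNN version is actually being loaded, one option is to ask the library itself through `cudnnGetVersion()`. This is only a sketch: the library file name `libcudnn.so.7` is an assumption that matches the cuDNN 7 packages above, and the decoding assumes the `major*1000 + minor*100 + patch` encoding used by cuDNN releases of that era. The function names are hypothetical.

```python
import ctypes


def decode_cudnn_version(version):
    """Split cuDNN's integer version (major*1000 + minor*100 + patch)."""
    return version // 1000, (version % 1000) // 100, version % 100


def loaded_cudnn_version(library="libcudnn.so.7"):
    """Return (major, minor, patch) of the cuDNN the linker finds, or None."""
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None  # library not installed or not on the linker path
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())
```

    If the reported version differs from the one pinned in the install command, an upgrade has probably replaced it; holding the package with `sudo apt-mark hold libcudnn7` is one way to keep it from drifting again.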

  • 2020-12-01 16:27

    I had the same problem (Ubuntu 18.04). I was using:

    • tensorflow 2.1
    • cuda 10.1
    • cudnn 7.6.5

    I solved it by uninstalling CUDA, deleting its folder, and reinstalling it via apt following the instructions on the TensorFlow page: https://www.tensorflow.org/install/gpu?hl=fr#ubuntu_1804_cuda_101

  • 2020-12-01 16:30

    For anyone getting this issue in Jupyter notebook:

    I was running two Jupyter notebooks. After closing one of them, the issue was solved.

  • 2020-12-01 16:32

    I also got the same error and resolved it. My system configuration was as follows:

    • Operating System: Ubuntu 14.04
    • GPU: GTX 1050Ti
    • Nvidia Driver: 375.66
    • Tensorflow: 1.3.0
    • Cudnn: 6.0.21 (cudnn-8.0-linux-x64-v6.0.deb)
    • Cuda: 8.0.61
    • Keras: 2.0.8

    I solved the issue as follows:

    1. I copied the cuDNN files to the appropriate locations (/usr/local/cuda/include and /usr/local/cuda/lib64)
    2. I set the environment variables as:

      * export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
      * export CUDA_HOME=/usr/local/cuda
      
    3. I also ran the sudo ldconfig -v command to refresh the shared-library cache for the runtime linker.

    I hope those steps will also help someone who is about to go crazy.
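
    The steps above can be verified with a short script. This is a sketch under assumptions: it uses the same /usr/local/cuda prefix as the steps, and the helper names `path_listed` and `check_cuda_env` are made up for illustration.

```python
import ctypes.util
import os


def path_listed(directory, search_path):
    """True if directory appears in a colon-separated search path."""
    wanted = os.path.normpath(directory)
    return any(os.path.normpath(p) == wanted
               for p in search_path.split(":") if p)


def check_cuda_env(cuda_home="/usr/local/cuda"):
    """Report whether each of the three setup steps appears to have taken effect."""
    lib_dir = os.path.join(cuda_home, "lib64")
    return {
        # Step 1: header copied into the CUDA include directory
        "cudnn.h present": os.path.isfile(
            os.path.join(cuda_home, "include", "cudnn.h")),
        # Step 2: environment variables exported in this shell
        "CUDA_HOME set": os.environ.get("CUDA_HOME") == cuda_home,
        "lib64 on LD_LIBRARY_PATH": path_listed(
            lib_dir, os.environ.get("LD_LIBRARY_PATH", "")),
        # Step 3: the linker can locate libcudnn
        "libcudnn found by linker": ctypes.util.find_library("cudnn") is not None,
    }
```

    Printing the result of `check_cuda_env()` in the same shell that launches TensorFlow shows which step, if any, did not take effect; a `False` entry points to the step to repeat.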
