I installed the TensorFlow 1.0.1 GPU version on my MacBook Pro with a GeForce GT 750M, and also installed CUDA 8.0.71 and cuDNN 5.1. I am running a tf code that works fine with non C
This is a cuDNN compatibility issue. Check what you installed that is using the GPU (for instance, tensorflow-gpu). What is its version? Is that version compatible with your cuDNN version, and is your cuDNN the right version for your CUDA?
I have observed that:
cuDNN v7.0.3 for CUDA 7.*
cuDNN v7.1.2 for CUDA 9.0
cuDNN v7.3.1 for CUDA 9.1
and so on.
So also check that you have the correct TensorFlow version for your CUDA configuration. For instance, using tensorflow-gpu:
TF v1.4 for cuDNN 7.0.*
TF v1.7 and above for CUDA 9.0.*
etc.
So all you need to do is reinstall the appropriate cuDNN version. Hope it helps!
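The version-matching advice above can be sketched as a small lookup helper. This is a hypothetical illustration, not an official API; the table entries are a small subset drawn from TensorFlow's published tested build configurations, so always confirm against the official table for your exact release.

```python
# Hypothetical helper: check a (tensorflow-gpu, CUDA, cuDNN) combination
# against a small compatibility table. The entries are an illustrative
# subset of TensorFlow's "tested build configurations" page.
TESTED_CONFIGS = {
    "1.4": ("8.0", "6"),     # TF 1.4  -> CUDA 8.0,  cuDNN 6
    "1.12": ("9.0", "7"),    # TF 1.12 -> CUDA 9.0,  cuDNN 7
    "2.1": ("10.1", "7.6"),  # TF 2.1  -> CUDA 10.1, cuDNN 7.6
}

def is_compatible(tf_version, cuda_version, cudnn_version):
    """Return True if the CUDA/cuDNN pair matches the table entry for tf_version."""
    expected = TESTED_CONFIGS.get(tf_version)
    if expected is None:
        raise KeyError(f"no table entry for TF {tf_version}")
    exp_cuda, exp_cudnn = expected
    return (cuda_version.startswith(exp_cuda)
            and cudnn_version.startswith(exp_cudnn))
```

For example, `is_compatible("1.12", "9.0", "7.0")` passes, while pairing TF 1.12 with CUDA 10.0 does not.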
In TensorFlow 2.0, my issue was resolved by setting memory growth. ConfigProto is deprecated in TF 2.0, so I used tf.config.experimental instead. The code I used was:
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
For me, the 4th option below nicely solved the problem. https://blog.csdn.net/comway_Li/article/details/102953634?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-2
1.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 1.0
session = tf.Session(config=config, ...)
2.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
3.
sudo rm -rf ~/.nv
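Note that ~/.nv is a directory (NVIDIA's compute cache, rebuilt automatically on the next run), so the recursive -r flag is what actually removes it; plain rm -f on a directory silently does nothing. A quick sanity check on a throwaway directory:

```shell
# Demonstrate on a scratch directory that rm needs -r for directories;
# the real fix targets ~/.nv.
tmp=$(mktemp -d)
mkdir -p "$tmp/.nv/ComputeCache"
rm -f "$tmp/.nv" 2>/dev/null    # rm -f does not remove directories
test -d "$tmp/.nv" && echo "still there"
rm -rf "$tmp/.nv"               # recursive removal succeeds
test -d "$tmp/.nv" || echo "removed"
rmdir "$tmp"
```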
4.
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
#from tensorflow import ConfigProto
#from tensorflow import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
Rebooting the machine worked for me. Try this:
sudo reboot
Then, re-run the code.
I too encountered the same problem:
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1050
major: 6 minor: 1 memoryClockRate (GHz) 1.493 pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.60GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
But in my case, running the command with sudo worked perfectly fine.
I ran into the same problem because my GPU memory was being held by some background zombie/terminated processes. Killing those processes worked for me:
ps aux | grep 'Z' # Zombie
ps aux | grep 'T' # Terminated
kill -9 your_zombie_or_terminated_process_id
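The grep-based steps above can also be done programmatically. Below is a minimal sketch (the function name is my own, not from the answer) that parses `ps` output and returns processes whose state code starts with Z (zombie) or T (stopped/terminated), so you can inspect them before killing anything:

```python
import subprocess

def find_stuck_processes():
    """Return (pid, stat, command) tuples for processes in Z or T state."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],  # pid, state code, command name
        capture_output=True, text=True, check=True,
    ).stdout
    stuck = []
    for line in out.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1][0] in ("Z", "T"):
            stuck.append((int(parts[0]), parts[1], parts[2]))
    return stuck

if __name__ == "__main__":
    for pid, stat, cmd in find_stuck_processes():
        print(pid, stat, cmd)  # candidates for: kill -9 <pid>
```

Listing first and killing manually is safer than piping grep straight into kill, since `grep 'Z'` can also match unrelated lines.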