I installed the TensorFlow 1.0.1 GPU version on my MacBook Pro with a GeForce GT 750M, and also installed CUDA 8.0.71 and cuDNN 5.1. I am running a tf code that works fine with non C
This is a cuDNN compatibility issue. Check what you installed that is using the GPU (for instance, tensorflow-gpu). What is its version? Is that version compatible with your cuDNN version, and is your cuDNN the right version for your CUDA?
I have observed that:
cuDNN v7.0.3 for CUDA 7.*
cuDNN v7.1.2 for CUDA 9.0
cuDNN v7.3.1 for CUDA 9.1
and so on.
So also check that you have the correct TensorFlow version for your CUDA configuration. For instance, using tensorflow-gpu:
TF v1.4 for cuDNN 7.0.*
TF v1.7 and above for CUDA 9.0.*
etc.
So all you need to do is reinstall the appropriate cuDNN version. Hope it helps!
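The version-matching advice above can be sketched as a small lookup helper. This is a hypothetical illustration, not an official API; the table entries are a small subset drawn from TensorFlow's published tested build configurations, so always confirm against the official table for your exact release.

```python
# Hypothetical helper: check a (tensorflow-gpu, CUDA, cuDNN) combination
# against a small compatibility table. The entries are an illustrative
# subset of TensorFlow's "tested build configurations" page.
TESTED_CONFIGS = {
    "1.4": ("8.0", "6"),     # TF 1.4  -> CUDA 8.0,  cuDNN 6
    "1.12": ("9.0", "7"),    # TF 1.12 -> CUDA 9.0,  cuDNN 7
    "2.1": ("10.1", "7.6"),  # TF 2.1  -> CUDA 10.1, cuDNN 7.6
}

def is_compatible(tf_version, cuda_version, cudnn_version):
    """Return True if the CUDA/cuDNN pair matches the table entry for tf_version."""
    expected = TESTED_CONFIGS.get(tf_version)
    if expected is None:
        raise KeyError(f"no table entry for TF {tf_version}")
    exp_cuda, exp_cudnn = expected
    return (cuda_version.startswith(exp_cuda)
            and cudnn_version.startswith(exp_cudnn))
```

For example, `is_compatible("1.12", "9.0", "7.0")` passes, while pairing TF 1.12 with CUDA 10.0 does not.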
In TensorFlow 2.0, my issue was resolved by setting memory growth. ConfigProto is deprecated in TF 2.0, so I used tf.config.experimental instead. The code I used was:
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
For me, the 4th option below nicely solved the problem. https://blog.csdn.net/comway_Li/article/details/102953634?utm_medium=distribute.pc_relevant.none-task-blog-baidujs-2
1.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 1.0
session = tf.Session(config=config, ...)
2.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
3.
sudo rm -rf ~/.nv
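Note that ~/.nv is a directory (NVIDIA's compute cache, rebuilt automatically on the next run), so the recursive -r flag is what actually removes it; plain rm -f on a directory silently does nothing. A quick sanity check on a throwaway directory:

```shell
# Demonstrate on a scratch directory that rm needs -r for directories;
# the real fix targets ~/.nv.
tmp=$(mktemp -d)
mkdir -p "$tmp/.nv/ComputeCache"
rm -f "$tmp/.nv" 2>/dev/null    # rm -f does not remove directories
test -d "$tmp/.nv" && echo "still there"
rm -rf "$tmp/.nv"               # recursive removal succeeds
test -d "$tmp/.nv" || echo "removed"
rmdir "$tmp"
```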
4.
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
#from tensorflow import ConfigProto
#from tensorflow import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
Rebooting the machine worked for me. Try this:
sudo reboot
Then, re-run the code.
I too encountered the same problem:
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 1050
major: 6 minor: 1 memoryClockRate (GHz) 1.493 pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.60GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_dnn.cc:385] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:352] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F tensorflow/core/kernels/conv_ops.cc:532] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
But in my case, running the command with sudo worked perfectly fine.
I ran into the same problem because my GPU memory was being held by some background zombie/terminated processes. Killing those processes worked for me:
ps aux | grep 'Z' # Zombie
ps aux | grep 'T' # Terminated
kill -9 your_zombie_or_terminated_process_id
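The grep-based steps above can also be done programmatically. Below is a minimal sketch (the function name is my own, not from the answer) that parses `ps` output and returns processes whose state code starts with Z (zombie) or T (stopped/terminated), so you can inspect them before killing anything:

```python
import subprocess

def find_stuck_processes():
    """Return (pid, stat, command) tuples for processes in Z or T state."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],  # pid, state code, command name
        capture_output=True, text=True, check=True,
    ).stdout
    stuck = []
    for line in out.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1][0] in ("Z", "T"):
            stuck.append((int(parts[0]), parts[1], parts[2]))
    return stuck

if __name__ == "__main__":
    for pid, stat, cmd in find_stuck_processes():
        print(pid, stat, cmd)  # candidates for: kill -9 <pid>
```

Listing first and killing manually is safer than piping grep straight into kill, since `grep 'Z'` can also match unrelated lines.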