Tensorflow CUDA - CUPTI error: CUPTI could not be loaded or symbol could not be found

Posted by 和自甴很熟 on 2020-01-31 07:56:08

Question


I use TensorFlow v1.14.0 on Windows 10. Here are the relevant entries in my PATH environment variable:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Users\sinthes\AppData\Local\Programs\Python\Python37
C:\Users\sinthes\AppData\Local\Programs\Python\Python37\Scripts
C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\cuda\bin

It may also be worth mentioning, in case it is relevant, that I use Sublime Text 3 for development and do not use Anaconda. I find it cumbersome to update TensorFlow inside a conda environment, so I just use Sublime Text for now. (I previously used Anaconda with Spyder, but I have uninstalled it from my computer.)

Things seem to work fine apart from some occasional strange warnings. One message, however, appears consistently whenever I run the fit function:

E tensorflow/core/platform/default/device_tracer.cc:68] CUPTI error: CUPTI could not be loaded or symbol could not be found.

And here is how I call the fit function:

history = model.fit(x=train_x,
                    y=train_y,
                    batch_size=BATCH_SIZE,
                    epochs=110,
                    verbose=2,
                    callbacks=[tensorboard, checkpoint, reduce_lr_on_plateau],
                    validation_data=(dev_x, dev_y),
                    shuffle=True,
                    class_weight=class_weight,
                    steps_per_epoch=None,
                    validation_steps=None)

Why do I see this CUPTI error message at run time? It is printed only once. Is it something I need to fix, or can it be ignored? The message does not tell me anything concrete enough to act on.


Answer 1:


Add this directory to your PATH on Windows:

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\libx64
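If you would rather not edit the system-wide PATH, the same directory can be prepended for the current process only, before TensorFlow is imported. The following is a minimal sketch, not a verified fix: it assumes the Python 3.7 setup from the question and the install location given above, and the helper name prepend_to_path is made up for illustration.

```python
import os

# Assumed install location, taken from the answer above; adjust to your setup.
CUPTI_DIR = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\libx64"


def prepend_to_path(directory, path_value, sep=os.pathsep):
    """Return path_value with directory in front, unless it is already listed."""
    entries = [e for e in path_value.split(sep) if e]
    # Windows paths are case-insensitive, so compare in lower case.
    if any(e.lower() == directory.lower() for e in entries):
        return path_value
    return sep.join([directory] + entries)


# Affects the current process only; run this before `import tensorflow`.
os.environ["PATH"] = prepend_to_path(CUPTI_DIR, os.environ.get("PATH", ""))
```

The change disappears when the process exits, which makes it easy to try out before touching the permanent environment settings.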



Answer 2:


I had a similar error when trying to get the TensorBoard graph. I think it only affects you if you plan to use TensorBoard.

I found a solution in this post, but it is for Linux: https://gist.github.com/Brainiarc7/6d6c3f23ea057775b72c52817759b25c. I think you need to create a library configuration file for CUPTI.




Answer 3:


The NVIDIA® CUDA Profiling Tools Interface (CUPTI) is a dynamic library that enables the creation of profiling and tracing tools that target CUDA applications.

CUPTI seems to have been added by the TensorFlow developers to allow profiling. You can simply ignore the error if you don't mind the message, or adapt your PATH so that the dynamic-link library (DLL) can be found during execution.

Inside your CUDA installation directory, there is an extras\CUPTI\lib64 directory containing cupti64_101.dll, the library that TensorFlow is trying to load. Adding that directory to your PATH should resolve the issue, e.g.,

SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\CUPTI\lib64;%PATH%

N.B. If you get an INSUFFICIENT_PRIVILEGES error next, try running your program as administrator.
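Because the directory name and the DLL version suffix differ between CUDA releases (compare the path in Answer 1 with the one here), it can help to check where the CUPTI DLL actually lives before editing PATH. Below is a small sketch for doing so; the function name find_cupti_dlls and the default root are assumptions for illustration, not part of any NVIDIA or TensorFlow tooling.

```python
import glob
import os


def find_cupti_dlls(cuda_root):
    """Return sorted paths of cupti*.dll files under <cuda_root>/extras/CUPTI."""
    pattern = os.path.join(cuda_root, "extras", "CUPTI", "**", "cupti*.dll")
    return sorted(glob.glob(pattern, recursive=True))


if __name__ == "__main__":
    # Assumed default install root for CUDA 10.1; change it to match your system.
    root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1"
    for dll in find_cupti_dlls(root):
        print(dll)
```

The directory that contains the printed file is the one to add to PATH. An empty result suggests that the root is wrong or that CUPTI was not installed with the toolkit.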




Answer 4:


This answer is for Ubuntu-16.04.

I had this issue when I upgraded to TensorFlow 1.14 with Python 2.7 and Python 3.6. I had to add /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH with export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH and then log out and log back in. Running source ~/.bashrc did not help. Note that my cuda folder was pointing to cuda-10.0.
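To confirm that the export actually reached the shell you launch training from, you can inspect LD_LIBRARY_PATH from Python. This is only a diagnostic sketch with a made-up helper name (on_library_path). It reports what the process sees and does not try to change the variable, since the dynamic loader reads it when the process starts.

```python
import os


def on_library_path(directory, ld_library_path):
    """Return True if directory is one of the entries in ld_library_path."""
    wanted = os.path.normpath(directory)
    entries = [e for e in ld_library_path.split(":") if e]
    # normpath makes "/a/b/" and "/a/b" compare equal.
    return any(os.path.normpath(e) == wanted for e in entries)


if __name__ == "__main__":
    cupti_dir = "/usr/local/cuda/extras/CUPTI/lib64"  # path from the answer above
    current = os.environ.get("LD_LIBRARY_PATH", "")
    print("CUPTI directory on LD_LIBRARY_PATH:", on_library_path(cupti_dir, current))
```

If this prints False in a fresh terminal, the export line is probably not being picked up by your login session.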




Answer 5:


Here is what solved "my" problem:

I replaced TensorFlow v1.14 with TensorFlow v1.13.1, and the CUPTI error messages went away, along with some other strange warnings and problems. Every issue presumably has a specific cause, but TensorFlow often does not provide error or warning messages clear enough to point toward a fix. I end up spending hours (even days) on such problems, which reduces my productivity significantly.

One general lesson for me (which might be worth sharing here) is not to rush to upgrade TensorFlow to the latest version. In my experience the latest release is rarely stable: whenever I tried it, I ended up spending a significant amount of time on problems caused by TensorFlow itself. Poor documentation and unclear error messages make it very difficult to work with.

If anyone has a better answer, they are more than welcome to share their insights on the issue described in this question.



Source: https://stackoverflow.com/questions/56860180/tensorflow-cuda-cupti-error-cupti-could-not-be-loaded-or-symbol-could-not-be
