nvidia

How does one have TensorFlow not run the script unless the GPU was loaded successfully?

心不动则不痛 submitted on 2019-12-24 07:44:24
Question: I have been trying to run some TensorFlow training on a machine with GPUs; however, whenever I try to do so I get some type of error that seems to say it wasn't able to use the GPU for some reason (usually a memory issue, or a CUDA or cuDNN issue, etc.). However, since TensorFlow automatically falls back to running on the CPU if it can't use the GPU, it has been hard for me to tell whether it was actually able to leverage the GPU or not. Thus, I wanted to have my script just fail/halt unless the GPU

Information/example on Unified Virtual Addressing (UVA) in CUDA

痞子三分冷 submitted on 2019-12-24 01:43:30
Question: I'm trying to understand the concept of Unified Virtual Addressing (UVA) in CUDA. I have two questions: Is there any sample (pseudo)code available that demonstrates this concept? I read in the CUDA C Programming Guide that UVA can be used only with 64-bit operating systems. Why is that so? Answer 1: A unified virtual address space combines the pointer (values) and allocation mappings used in device code with the pointer (values) and allocation mappings used in host code into a single unified space. 1
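As a rough illustration (not part of the answer above), here is a minimal sketch assuming CUDA 4.0 or later, a 64-bit OS, and a UVA-capable device: under UVA, a pinned host buffer returned by cudaHostAlloc with cudaHostAllocMapped can be passed straight to a kernel, and cudaPointerGetAttributes can report which address space any pointer belongs to. The buffer size and kernel are arbitrary placeholders. The usual explanation for the 64-bit requirement is that host memory and every device's memory must share one virtual address range, and a 32-bit address space is too small for that.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1024;
    float *h_buf = nullptr;

    // Pinned, mapped host allocation. With UVA this pointer is valid in both
    // host and device code; no separate cudaHostGetDevicePointer call is needed.
    cudaHostAlloc(&h_buf, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i) h_buf[i] = 1.0f;

    // The runtime can report which address space a pointer lives in.
    cudaPointerAttributes attr;
    cudaPointerGetAttributes(&attr, h_buf);

    scale<<<(n + 255) / 256, 256>>>(h_buf, n, 2.0f);  // host pointer passed directly
    cudaDeviceSynchronize();

    printf("h_buf[0] = %f\n", h_buf[0]);  // expect 2.0
    cudaFreeHost(h_buf);
    return 0;
}
```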

dlib not using CUDA

房东的猫 submitted on 2019-12-24 00:23:36
Question: I installed dlib using pip. My graphics card supports CUDA, but when running dlib, it is not using the GPU. I'm working on Ubuntu 18.04. Python 3.6.5 (default, Apr 1 2018, 05:46:30) [GCC 7.3.0] on linux >>> import dlib >>> dlib.DLIB_USE_CUDA False I have also installed the NVIDIA CUDA compiler driver but it is still not working. nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2017 NVIDIA Corporation Built on Fri_Nov__3_21:07:56_CDT_2017 Cuda compilation tools, release 9.1, V9

Cholesky decomposition with CUDA

让人想犯罪 __ submitted on 2019-12-23 19:32:16
Question: I am trying to implement Cholesky decomposition using the cuSOLVER library. I am a beginner CUDA programmer and I have always specified block sizes and grid sizes, but I am not able to find out how this can be set explicitly by the programmer with the cuSOLVER functions. Here is the documentation: http://docs.nvidia.com/cuda/cusolver/index.html#introduction The QR decomposition is implemented using the cuSOLVER library (see the example here: http://docs.nvidia.com/cuda/cusolver/index.html#ormqr
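On the block-size/grid-size part of the question: the dense cuSOLVER routines pick their launch configurations internally, so the caller only supplies the matrix, its size, and a scratch workspace. Below is a minimal sketch (an assumed workflow, not the linked documentation example) of a single-precision Cholesky factorization with cusolverDnSpotrf; the 3x3 matrix is an arbitrary positive-definite placeholder and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cusolverDn.h>

int main() {
    // 3x3 symmetric positive-definite matrix, column-major storage.
    const int n = 3, lda = 3;
    float A[9] = {4, 2, 2,
                  2, 3, 1,
                  2, 1, 3};

    float *d_A = nullptr, *d_work = nullptr;
    int *d_info = nullptr, lwork = 0;
    cudaMalloc(&d_A, sizeof(A));
    cudaMalloc(&d_info, sizeof(int));
    cudaMemcpy(d_A, A, sizeof(A), cudaMemcpyHostToDevice);

    cusolverDnHandle_t handle;
    cusolverDnCreate(&handle);

    // No block or grid sizes anywhere: the library configures its own kernels.
    cusolverDnSpotrf_bufferSize(handle, CUBLAS_FILL_MODE_LOWER, n, d_A, lda, &lwork);
    cudaMalloc(&d_work, lwork * sizeof(float));
    cusolverDnSpotrf(handle, CUBLAS_FILL_MODE_LOWER, n, d_A, lda, d_work, lwork, d_info);

    int info = 0;
    cudaMemcpy(&info, d_info, sizeof(int), cudaMemcpyDeviceToHost);
    printf("potrf info = %d (0 means success)\n", info);  // d_A now holds the factor L

    cusolverDnDestroy(handle);
    cudaFree(d_A); cudaFree(d_work); cudaFree(d_info);
    return 0;
}
```

Build with something like nvcc chol.cu -lcusolver.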

Number of total threads, blocks, and grids on my GPU.

点点圈 submitted on 2019-12-23 16:34:32
Question: For the NVIDIA GeForce 940MX GPU, deviceQuery shows it has 3 multiprocessors and 128 cores per MP. The number of threads per multiprocessor is 2048, so 3*2048 = 6144, i.e. 6144 threads in total on the GPU. 6144/1024 = 6, i.e. 6 blocks in total. And the warp size is 32. But from this video https://www.youtube.com/watch?v=kzXjRFL-gjo I found that each GPU has a limit on threads, but no limit on the number of blocks, so I got confused by this. I would like to know: how many total threads are in my GPU? Can we use all
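One way to see the distinction is to query the device directly. The illustrative sketch below (not part of the original question) multiplies multiProcessorCount by maxThreadsPerMultiProcessor to get the number of threads that can be resident at once (3 * 2048 = 6144 on a 940MX), while maxGridSize shows that a single launch may contain far more blocks than can run concurrently; the extra blocks simply wait and are scheduled in waves.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    // Threads that can be resident on the whole GPU at one time.
    int resident = prop.multiProcessorCount * prop.maxThreadsPerMultiProcessor;

    printf("Device: %s\n", prop.name);
    printf("SMs: %d, max threads per SM: %d -> max resident threads: %d\n",
           prop.multiProcessorCount, prop.maxThreadsPerMultiProcessor, resident);
    printf("Max threads per block: %d, warp size: %d\n",
           prop.maxThreadsPerBlock, prop.warpSize);
    printf("Max grid size: %d x %d x %d blocks per launch\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);
    return 0;
}
```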

Nvidia visual studio Nsight CPU and GPU debugging

假如想象 submitted on 2019-12-23 12:53:44
Question: The NVIDIA Nsight Visual Studio Edition does not seem to be capable of debugging CPU (host code) and GPU (CUDA code) at the same time. With the Nsight Eclipse Edition (or cuda-gdb) this is quite simple; for example, you can "step in" to a CUDA kernel from the host execution. How does one do the same with Visual Studio? Answer 1: From the Nsight manual, it says: Use a separate Visual Studio instance to debug the host portion of a target application. If you wish to debug the host portion of your CUDA

sampler1D not supported in nVidia GLSL?

守給你的承諾、 submitted on 2019-12-23 09:18:33
Question: In the GLSL spec, and other sources about GLSL, sampler types are available in 3 dimensions: sampler1D, sampler2D, and sampler3D. However, when I try to compile GLSL programs using WebGL in Chrome (both regular and Canary), sampler2D and sampler3D are accepted but sampler1D gives a syntax error. Code: uniform sampler1D tex1; Error: FS ERROR: ERROR: 0:9: 'sampler1D' : syntax error This error occurs even if I give Canary the command-line argument --use-gl=desktop. I am running

Is it possible to call cufft library calls in device function?

こ雲淡風輕ζ submitted on 2019-12-23 07:47:32
Question: I use the cuFFT library calls in host code and they work fine, but I want to call the cuFFT library from a kernel. Earlier versions of CUDA didn't have this kind of support, but with dynamic parallelism, is this possible? It would be great if there were any examples of how to achieve this. Answer 1: Despite the introduction of dynamic parallelism on Kepler (cc 3.5) cards, cuFFT remains a host API and there is currently no way of creating or executing FFT operations in device code using cuFFT.
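Since cuFFT cannot be launched from device code, the usual workaround for many small transforms is to batch them into a single host-side call. A minimal sketch of that pattern follows (an assumption about the typical alternative, not text quoted from the answer); the transform length and batch count are placeholders.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    const int nx = 256;    // length of each transform
    const int batch = 64;  // number of independent transforms in one call

    cufftComplex *d_data = nullptr;
    cudaMalloc(&d_data, sizeof(cufftComplex) * nx * batch);
    cudaMemset(d_data, 0, sizeof(cufftComplex) * nx * batch);  // placeholder input

    // One plan and one host-side call cover all 64 transforms.
    cufftHandle plan;
    cufftPlan1d(&plan, nx, CUFFT_C2C, batch);
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);  // in-place forward FFTs
    cudaDeviceSynchronize();

    cufftDestroy(plan);
    cudaFree(d_data);
    printf("Ran %d FFTs of length %d from the host.\n", batch, nx);
    return 0;
}
```

Link against cuFFT when building, e.g. nvcc fft.cu -lcufft.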

CUDA - Memory Limit - Vector Summation

旧时模样 submitted on 2019-12-23 05:18:17
Question: I'm trying to learn CUDA and the following code works OK for values N <= 16384, but fails for greater values (the summation check at the end of the code fails; c values are always 0 for index values i >= 16384). #include<iostream> #include"cuda_runtime.h" #include"../cuda_be/book.h" #define N (16384) __global__ void add(int *a,int *b,int *c) { int tid = threadIdx.x + blockIdx.x * blockDim.x; if(tid<N) { c[tid] = a[tid] + b[tid]; tid += blockDim.x * gridDim.x; } } int main() { int a[N],b
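For reference, a hedged sketch of what the posted kernel appears to be aiming at: a grid-stride loop (a for/while instead of a single if, so the added stride actually takes effect) together with heap-allocated host arrays so that large N does not overflow the stack. This is an assumption about the intended fix, not the accepted answer; N, the grid dimensions, and the final check are placeholders.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N (1 << 20)  // deliberately larger than 16384

__global__ void add(const int *a, const int *b, int *c) {
    // Grid-stride loop: each thread handles several elements when the grid
    // contains fewer threads than N.
    for (int tid = threadIdx.x + blockIdx.x * blockDim.x; tid < N;
         tid += blockDim.x * gridDim.x) {
        c[tid] = a[tid] + b[tid];
    }
}

int main() {
    // Heap allocation instead of int a[N] on the stack.
    int *a = (int *)malloc(N * sizeof(int));
    int *b = (int *)malloc(N * sizeof(int));
    int *c = (int *)malloc(N * sizeof(int));
    for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2 * i; }

    int *d_a, *d_b, *d_c;
    cudaMalloc(&d_a, N * sizeof(int));
    cudaMalloc(&d_b, N * sizeof(int));
    cudaMalloc(&d_c, N * sizeof(int));
    cudaMemcpy(d_a, a, N * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b, N * sizeof(int), cudaMemcpyHostToDevice);

    add<<<128, 256>>>(d_a, d_b, d_c);  // fixed-size grid; the loop covers the rest
    cudaMemcpy(c, d_c, N * sizeof(int), cudaMemcpyDeviceToHost);

    int bad = 0;
    for (int i = 0; i < N; ++i) if (c[i] != a[i] + b[i]) ++bad;
    printf("%d mismatches out of %d\n", bad, N);

    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
    free(a); free(b); free(c);
    return 0;
}
```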