nvidia

How to find the active SMs?

Submitted by 风流意气都作罢 on 2019-12-31 05:05:46
Question: Is there any way I can find the number of free/active SMs? Or at least read the voltage/power or temperature values of each SM, so I can tell whether it is working or not (in real time, while a job is executing on the GPU device)? %smid helped me learn the ID of each SM; something similar would be helpful. Thanks and regards, Rakesh

Answer 1: The CUDA Profiling Tools Interface (CUPTI) contains an Events API that enables run-time sampling of GPU PM counters. The CUPTI
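The %smid trick mentioned in the question can be sketched as follows. This is a minimal, hypothetical example: it reads the special register %smid via inline PTX, so each block can report which SM it was resident on at launch. Note this is a snapshot of block placement, not a free/busy query.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Read the special register %smid: each thread learns which SM it is
// currently resident on.
__device__ unsigned int smid() {
    unsigned int id;
    asm volatile("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

__global__ void report_smid(unsigned int *out) {
    if (threadIdx.x == 0)              // one report per block
        out[blockIdx.x] = smid();
}

int main() {
    const int blocks = 8;
    unsigned int *d_out, h_out[blocks];
    cudaMalloc(&d_out, blocks * sizeof(unsigned int));
    report_smid<<<blocks, 32>>>(d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    for (int b = 0; b < blocks; ++b)
        printf("block %d ran on SM %u\n", b, h_out[b]);
    cudaFree(d_out);
    return 0;
}
```

For true per-SM utilization sampling, the CUPTI Events API the answer points at is the supported route; %smid only tells you where your own blocks landed.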

How to allocate all available global memory on the GeForce GTX 690 device?

Submitted by 丶灬走出姿态 on 2019-12-31 03:48:26
Question: I need to allocate all available global memory using CUDA. I do this on a Tesla C2050, a Quadro 600 and a GeForce GTX 560 Ti as follows: first, I allocate 0 bytes of global memory on the device; second, I query the available device memory with the cudaMemGetInfo function and allocate that amount. This works for the devices listed above, but the mechanism does not work with a GeForce GTX 690. Could somebody tell me what mechanism I can use to allocate memory on the GeForce GTX 690 device, or
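Two things are worth noting here. First, the GTX 690 is a dual-GPU board: it enumerates as two CUDA devices, so cudaMemGetInfo reports per-device figures and you must cudaSetDevice to each one. Second, cudaMemGetInfo reports total free memory, but fragmentation can make a single allocation of that exact size fail. A minimal sketch of a back-off strategy (the 5% shrink step is an arbitrary choice for illustration):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t freeB = 0, totalB = 0;
    cudaMemGetInfo(&freeB, &totalB);

    // A single cudaMalloc of exactly freeB bytes can fail due to
    // fragmentation; shrink the request until the allocation succeeds.
    void *p = nullptr;
    size_t request = freeB;
    while (request > 0 && cudaMalloc(&p, request) != cudaSuccess) {
        cudaGetLastError();        // clear the allocation error
        request -= request / 20;   // shrink by ~5% and retry
    }
    printf("allocated %zu of %zu free bytes\n", request, freeB);
    cudaFree(p);
    return 0;
}
```

On a GTX 690 this would be run once per device, after cudaSetDevice(0) and cudaSetDevice(1).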

How can I force any display resolution/timing I want?

Submitted by 喜夏-厌秋 on 2019-12-30 13:27:17
Question: I am having trouble finding a way to force any display resolution/timing I want from my C# program. I am running Windows 7 with a GeForce 210 graphics card. My current method for achieving these custom resolutions is to use the driver GUI to add them manually, and then use Windows API calls to switch to them; but I need a way to add new custom resolutions at run time. I have looked into NVAPI but was not able to find a way to do this. I also looked into the command

Does NVidia support OpenCL SPIR?

Submitted by 眉间皱痕 on 2019-12-30 08:27:17
Question: I am wondering whether NVIDIA supports a SPIR backend. If it does, I could not find any documentation or sample about it; and if it does not, is there any way to run a SPIR backend on NVIDIA GPUs? Thanks in advance.

Answer 1: Since SPIR builds on top of OpenCL 1.2, and so far NVIDIA has not made any OpenCL 1.2 drivers available, it is not possible to use SPIR with NVIDIA GPUs. As mentioned in the comments, NVIDIA has made PTX available as an intermediate language (also based on LLVM IR).
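SPIR support is advertised through the cl_khr_spir device extension, so the check can be done at run time rather than from documentation. A minimal sketch (error checking omitted, first GPU of the first platform assumed):

```c
#include <stdio.h>
#include <string.h>
#include <CL/cl.h>

/* Query the device's extension string and look for cl_khr_spir. */
int main(void) {
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    char ext[8192];
    clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, sizeof(ext), ext, NULL);
    printf("cl_khr_spir supported: %s\n",
           strstr(ext, "cl_khr_spir") ? "yes" : "no");
    return 0;
}
```

Consistent with the answer above, NVIDIA devices do not list cl_khr_spir, whereas PTX can be fed to the driver through CUDA's module-loading APIs instead.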

CL_OUT_OF_RESOURCES for 2 million floats with 1GB VRAM?

Submitted by 回眸只為那壹抹淺笑 on 2019-12-29 08:43:12
Question: It seems like 2 million floats should be no big deal: only 8 MB out of 1 GB of GPU RAM. I am able to allocate that much at times, and sometimes more, with no trouble. Yet I get CL_OUT_OF_RESOURCES when I do a clEnqueueReadBuffer, which seems odd. Is there a way to sniff out where the trouble really started? OpenCL shouldn't be failing like this at clEnqueueReadBuffer, right? It should fail when I allocate the data, right? Is there some way to get more details than just the error code? It would be cool
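A likely explanation, offered here as a sketch rather than a diagnosis: kernel-side faults (such as out-of-bounds writes or exceeding per-thread resources) are often reported only at the next blocking call, which is typically the clEnqueueReadBuffer. Checking every call's return code and forcing a clFinish immediately after the kernel launch narrows down where the failure really originates:

```c
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

/* Fail loudly at the first call that returns an error, naming the call. */
#define CHECK(err, where) \
    do { if ((err) != CL_SUCCESS) { \
        fprintf(stderr, "%s failed: %d\n", (where), (int)(err)); \
        exit(1); } } while (0)

void run_and_localize(cl_command_queue q, cl_kernel k, size_t n) {
    cl_int err = clEnqueueNDRangeKernel(q, k, 1, NULL, &n, NULL,
                                        0, NULL, NULL);
    CHECK(err, "clEnqueueNDRangeKernel (launch)");
    err = clFinish(q);   /* deferred kernel-execution errors surface here */
    CHECK(err, "clFinish (kernel execution)");
}
```

If the error moves from the read to the clFinish, the kernel itself is the culprit, not the buffer or the read.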

Matrix-vector multiplication in CUDA: benchmarking & performance

Submitted by 烈酒焚心 on 2019-12-29 04:00:23
Question: I'm updating my question with some new benchmarking results (I also reformulated the question to be more specific, and updated the code). I implemented a kernel for matrix-vector multiplication in CUDA C, following the CUDA C Programming Guide, using shared memory. Let me first present some benchmarking results, obtained on a Jetson TK1 (GPU: Tegra K1, compute capability 3.2), and a comparison with cuBLAS. Here I guess cuBLAS does some magic, since it seems that its execution is not affected
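Since the question's own code is not shown, here is a hypothetical sketch of the shared-memory matrix-vector kernel this class of question usually refers to: the vector x is staged through shared memory in tiles, and each thread accumulates the dot product for one row of a row-major matrix A. It assumes the block size equals TILE.

```cuda
#define TILE 256  // must equal blockDim.x

// y = A * x for row-major A (nRows x nCols), one thread per row.
__global__ void matvec(const float *A, const float *x, float *y,
                       int nRows, int nCols) {
    __shared__ float xs[TILE];
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    float acc = 0.0f;

    for (int tile = 0; tile < (nCols + TILE - 1) / TILE; ++tile) {
        int col = tile * TILE + threadIdx.x;
        xs[threadIdx.x] = (col < nCols) ? x[col] : 0.0f; // stage x tile
        __syncthreads();
        if (row < nRows)
            for (int j = 0; j < TILE && tile * TILE + j < nCols; ++j)
                acc += A[row * nCols + tile * TILE + j] * xs[j];
        __syncthreads();  // all threads reach this, even out-of-range rows
    }
    if (row < nRows) y[row] = acc;
}
```

cuBLAS (cublasSgemv) will generally beat such a hand-written kernel; it uses architecture-tuned code paths, which is consistent with the "magic" observed in the benchmarks.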

Error compiling CUDA from Command Prompt

Submitted by [亡魂溺海] on 2019-12-28 05:35:45
Question: I'm trying to compile a CUDA test program on Windows 7 via the Command Prompt, using this command: nvcc test.cu. But all I get is this error: nvcc fatal : Cannot find compiler 'cl.exe' in PATH. What may be causing this error?

Answer 1: You will need to add the folder containing the cl.exe file to your PATH environment variable. For example: C:\Program Files\Microsoft Visual Studio 10.0\VC\bin. Edit: go to My Computer -> Properties -> Advanced System Settings -> Environment Variables. Here look for
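The steps in the answer can be sketched as Command Prompt commands. The paths below match the Visual Studio 10.0 example in the answer and will differ on other VS versions; both variants are illustrative, not definitive:

```shell
:: Option 1: extend PATH for the current Command Prompt session only
set "PATH=%PATH%;C:\Program Files\Microsoft Visual Studio 10.0\VC\bin"
nvcc test.cu

:: Option 2: let Visual Studio's own script set up the environment
"C:\Program Files\Microsoft Visual Studio 10.0\VC\vcvarsall.bat" x86
nvcc test.cu
```

Option 2 is usually safer, since vcvarsall.bat also sets INCLUDE and LIB, which cl.exe needs once nvcc hands it the host code.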

Unable to create a working Makefile for CUDA C program

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-25 18:46:22
Question: I have a simple program formed by 3 CUDA files and 2 headers: main.cu, kernel.cu, func.cu, kernel.h and func.h. Their goal is to calculate the sum of 2 vectors.

// main.cu
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <cuda_runtime.h>
#include <cuda.h>
#include "kernel.h"

int main(){
    /* Error code to check return values for CUDA calls */
    cudaError_t err = cudaSuccess;
    srand(time(NULL));
    int count = 100;
    int A[count], B[count];
    int *h_A, *h_B;
    h_A = A;
    h_B = B;
    int i;
    for(i=0;i<count;i++){
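For the file layout the question describes, a minimal Makefile can be sketched as below: compile each .cu to an object with nvcc, then link with nvcc as well. The target name vectoradd and the header dependencies are assumptions (adjust them to which headers each .cu actually includes); recipe lines must be indented with a tab.

```makefile
# Minimal sketch: nvcc compiles and links all three CUDA files.
NVCC    := nvcc
NVFLAGS := -O2

OBJS := main.o kernel.o func.o

vectoradd: $(OBJS)
	$(NVCC) $(NVFLAGS) -o $@ $(OBJS)

main.o: main.cu kernel.h
	$(NVCC) $(NVFLAGS) -c $< -o $@

kernel.o: kernel.cu kernel.h
	$(NVCC) $(NVFLAGS) -c $< -o $@

func.o: func.cu func.h
	$(NVC) $(NVFLAGS) -c $< -o $@

clean:
	rm -f vectoradd $(OBJS)
```

Using nvcc for the link step (rather than gcc) avoids having to name the CUDA runtime library by hand.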

Install dlib with CUDA support on Ubuntu 18.04

Submitted by 我是研究僧i on 2019-12-25 18:23:33
Question: I have CUDA 9.0 and cuDNN 7.1 installed on Ubuntu 18.04 (Linux Mint 19). Tensorflow-gpu works fine on the GPU (GTX 1080 Ti). Now I am trying to build dlib with CUDA support: sudo python3 setup.py install --yes USE_AVX_INSTRUCTIONS --yes DLIB_USE_CUDA --clean. I got this error:

user@user-pc:~/Downloads/dlib$ sudo python3 setup.py install --yes USE_AVX_INSTRUCTIONS --yes DLIB_USE_CUDA --clean
running install
running bdist_egg
running egg_info
writing dlib.egg-info/PKG-INFO
writing dependency_links to dlib
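When the setup.py route fails, building dlib's C++ library directly via CMake makes the CUDA/cuDNN detection output visible, which usually reveals what setup.py hid. A sketch using dlib's documented CMake options (DLIB_USE_CUDA, USE_AVX_INSTRUCTIONS); the source directory path is whatever you cloned or unpacked:

```shell
# Build dlib out-of-tree so the CUDA detection log is easy to read.
cd dlib
mkdir -p build && cd build
cmake .. -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1
cmake --build . --config Release
```

Watch the cmake output for the lines reporting whether CUDA and cuDNN were found; a missing or mismatched cuDNN is the most common reason DLIB_USE_CUDA gets silently disabled.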