Quite often, the CUDA library fails completely and returns error 46 ("all CUDA-capable devices are busy or unavailable"), even for simple calls like cudaMalloc.