How can a large number of assignments to the same array cause a pyopencl.LogicError when run on GPU?

Submitted by 。_饼干妹妹 on 2019-12-02 07:13:17

My advice is to avoid such long loops inside a kernel. Each work item performs over 1 billion iterations, and that is a lot. The driver is probably killing your kernel because it takes too long to execute. Reduce the number of iterations to the largest value that does not trigger the error and look at the execution time. If it is on the order of seconds, that is too much.

As you said, reducing the number of iterations solves the problem, and in my opinion that is the evidence. Reducing the number of assignment operations also makes the kernel run faster, as I/O (memory) operations are usually the slowest part.
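A minimal sketch of how one might search for a workable iteration count, assuming a hypothetical callable `measure(count)` that runs the kernel with `count` iterations and returns the elapsed time in seconds. With pyopencl, that time could come from a profiling-enabled queue, e.g. `(evt.profile.end - evt.profile.start) * 1e-9`. The function name, the starting count and the upper limit are illustrative assumptions, not part of the original answer.

```python
def largest_count_within_budget(measure, budget_seconds, start=1_000, limit=2**40):
    """Double the iteration count until measure(count) exceeds the time budget.

    `measure` is a hypothetical callable: it runs the kernel with `count`
    iterations and returns the elapsed time in seconds.
    Returns the largest count tried that stayed within the budget,
    or 0 if even `start` was too slow.
    """
    best = 0
    count = start
    while count <= limit:
        elapsed = measure(count)
        if elapsed > budget_seconds:
            break  # stop before probing anything slower
        best = count
        count *= 2
    return best
```

Choosing a budget well below the watchdog limit keeps the probing itself from triggering a driver reset.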

The CPU does not run into this problem, because code running on the CPU is not subject to the display driver's watchdog.
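One way to keep each kernel short on a GPU is to split the long loop across several launches. The sketch below shows only the host-side bookkeeping. It assumes a hypothetical `launch(start, count)` callable that enqueues the kernel for iterations `start` to `start + count - 1` and waits for it to finish. With pyopencl this could be something like `prg.long_loop(queue, global_size, None, buf, np.int64(start), np.int64(count)).wait()`, where `long_loop` and its arguments are made up for illustration.

```python
def split_iterations(total, chunk):
    """Yield (start, count) pairs covering range(total) in pieces of at most `chunk`."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    start = 0
    while start < total:
        count = min(chunk, total - start)
        yield start, count
        start += count


def run_in_chunks(launch, total, chunk):
    """Call launch(start, count) once per piece; return the number of launches."""
    launches = 0
    for start, count in split_iterations(total, chunk):
        # `launch` is hypothetical: it should run one short kernel and block
        # until it completes, so no single launch runs long enough to time out.
        launch(start, count)
        launches += 1
    return launches
```

For example, `run_in_chunks(launch, 1_000_000_000, 10_000_000)` would issue 100 launches of 10 million iterations each. The kernel has to keep its intermediate state in a buffer so that each launch can continue where the previous one stopped.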

neXus

This timeout problem can be fixed on Windows and Linux, but apparently not on Mac.


Windows

This answer to a similar question (explaining the symptoms in Windows) tells both what is going on and how to fix it:

This is a known "feature" under Windows (not sure about Linux) - if the video driver stops responding, the OS will reset it. Except that, since OpenCL (and CUDA) is implemented by the driver, a kernel that takes too long will look like a frozen driver. There is a watchdog timer that keeps track of this (5 seconds, I believe).

Your options are:

  1. You need to make sure that your kernels are not too time-consuming (best).
  2. You can turn off the watchdog timer: Timeout Detection and Recovery of GPUs.
  3. You can run the kernel on a GPU that is not hooked up to a display.

I suggest you go with 1.

This answer explains how to actually do (2) in Windows 7. However, the MSDN page for these registry keys says that they should not be manipulated by any applications outside targeted testing or debugging. So it might not be the best option, but it is an option.
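Before changing anything, it can help to see whether the timeout has already been overridden on a machine. The sketch below only reads the registry and changes nothing. It assumes the TDR values `TdrDelay` and `TdrLevel` live under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`. The function name is made up for illustration, and on non-Windows systems it simply returns `None`.

```python
import sys

TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_VALUES = ("TdrDelay", "TdrLevel")


def read_tdr_settings():
    """Return a dict of TDR registry values, or None when not running on Windows.

    A value of None for a name means it is not set in the registry,
    so the operating system default applies.
    """
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in TDR_VALUES:
            try:
                value, _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                value = None  # not overridden
            settings[name] = value
    return settings


if __name__ == "__main__":
    print(read_tdr_settings())
```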


Linux

(From the CUDA release notes, but also applicable to OpenCL)

GPUs without a display attached are not subject to the 5 second run time restriction. For this reason it is recommended that CUDA is run on a GPU that is NOT attached to an X display.

While X does not need to be running in order to use CUDA, X must have been initialized at least once after booting in order to properly load the NVIDIA kernel module. The NVIDIA kernel module remains loaded even after X shuts down, allowing CUDA to continue to function.


Mac

Apple apparently does not allow fiddling with this watchdog, so the only option seems to be using a second GPU (without a screen attached to it).
