Can I run CUDA or OpenCL on Intel Iris?


予麋鹿 2021-02-03 11:43

I have a mid-2014 MacBook Pro with Intel Iris graphics, an Intel Core i5 processor, and 16 GB of RAM. I am planning to learn some ray-traced 3D, but I am not sure whether my laptop can render fast.

2 answers

粉色の甜心 (original poster)

2021-02-03 12:01
CUDA works only on NVIDIA hardware, but there may be some libraries that convert it to run on CPU cores (not on the iGPU).

AMD is working on "hipify"ing old CUDA kernels, translating them to HIP (a portable C++ dialect, comparable in purpose to OpenCL) so they can become more general.

OpenCL works everywhere, as long as both the hardware and the OS support it. AMD, NVIDIA, Intel, Xilinx, Altera, Qualcomm, MediaTek, Marvell, Texas Instruments and others support it. Maybe even a Raspberry Pi model can support it in the future.

Documentation for OpenCL on stackoverflow.com is under development, but there are some other sites:
- AMD's tutorial
- AMD's parallel programming guide for OpenCL
- NVIDIA's learning material
- Intel's HD Graphics coding tutorial
- Some overview of hardware, benchmark and parallel programming subjects
- blog
- Scratch-a-pixel ray tracing tutorial (I read it, then wrote its teraflops GPU version)

If it is Iris Graphics 6100:

Your integrated GPU has 48 execution units, each with 8 ALUs that can do add, multiply and many other operations. Its clock frequency can rise to 1 GHz. This means a maximum of 48 * 8 * 2 (1 add + 1 multiply) * 1 GHz = 768 giga floating-point operations per second, but only if each ALU can do 1 addition and 1 multiplication concurrently. 768 GFLOPS is more than a low-end discrete GPU such as AMD's R7 240. (As of 19.10.2017, AMD's low-end card is the RX 550 with 1200 GFLOPS, faster than Intel's Iris Plus 650 at nearly 900 GFLOPS.) Ray tracing re-reads a large amount of geometry data over and over, and an integrated GPU shares system memory with the CPU, so a device with its own memory (such as an NVIDIA or AMD card) is preferable: it leaves the CPU free to do its own work.
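The peak-throughput arithmetic above can be written out as a small calculation. This is only a sketch of the theoretical-maximum formula; the function and parameter names are made up for illustration, and real workloads will land well below this number.

```python
def peak_gflops(execution_units, alus_per_unit, flops_per_alu_per_clock, clock_ghz):
    """Theoretical peak = units * ALUs per unit * flops per ALU per clock * clock (GHz)."""
    return execution_units * alus_per_unit * flops_per_alu_per_clock * clock_ghz


# Figures quoted in the answer for Iris Graphics 6100:
# 48 execution units, 8 ALUs each, 1 add + 1 multiply per clock, 1 GHz.
print(peak_gflops(48, 8, 2, 1.0))  # 768.0
```

If each ALU can issue only one operation per clock, pass 1 instead of 2 and the estimate halves.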

How you install OpenCL on a computer depends on the OS and the hardware, but building software on a computer that already has OpenCL installed follows similar steps:
- Query platforms. The result can include AMD, Intel and NVIDIA platforms, duplicates of these caused by overlapping installations of wrong drivers, and experimental platforms that preview support for a newer OpenCL version.
- Query the devices of a platform (or of all platforms). This gives the individual devices (and their duplicates, if there are driver errors or other things to fix).
- Create a context (or several) using a platform.
- Using a context (so everything in it is synchronized implicitly):
  - Build programs from kernel strings. A CPU usually takes less time than a GPU to build a program. (There is a binary-load option to shortcut this.)
  - Build kernels (now as objects) from programs.
  - Create buffers from host-side buffers or as OpenCL-managed buffers.
  - Create a command queue (or several).
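The first two steps (querying platforms and devices) can be sketched as follows. This assumes the third-party `pyopencl` binding is installed, which the answer itself does not mention; without it, or without any OpenCL driver, the helper simply returns an empty list. The function name `list_opencl_devices` is made up for illustration.

```python
try:
    import pyopencl as cl  # third-party binding; assumed to be installed
except ImportError:
    cl = None


def list_opencl_devices():
    """Return (platform name, device name) pairs, or [] if OpenCL is unavailable."""
    if cl is None:
        return []
    try:
        platforms = cl.get_platforms()
    except cl.Error:
        # Raised when no OpenCL platform (driver) is installed.
        return []
    pairs = []
    for platform in platforms:
        try:
            devices = platform.get_devices()
        except cl.Error:
            devices = []  # a platform may expose no usable device
        for device in devices:
            pairs.append((platform.name.strip(), device.name.strip()))
    return pairs


for platform_name, device_name in list_opencl_devices():
    print(f"{platform_name}: {device_name}")
```

If the same device shows up more than once, that is usually the duplicate-driver situation described in the list above.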

Just before computing (or before a series of computations):
- Set the buffers as the arguments of a kernel.
- Enqueue buffer write (or map/unmap) operations on the "input" buffers.

Compute:
- Enqueue an ND-range kernel (specifying which kernel runs and with how many threads).
- Enqueue buffer read (or map/unmap) operations on the "output" buffers.
- Don't forget to synchronize with the host using clFinish() if you haven't used a blocking buffer read (clEnqueueReadBuffer with blocking_read set to CL_TRUE).
- Use your accelerated data.
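The setup and compute steps above can be put together in one small vector-add sketch. It assumes `numpy` and the third-party `pyopencl` binding are installed and picks the first device of the first platform; the kernel and the helper name `add_on_device` are made up for illustration. When no OpenCL is usable, the helper returns None instead of failing.

```python
try:
    import numpy as np
    import pyopencl as cl  # third-party binding; assumed to be installed
except ImportError:
    np = cl = None

KERNEL_SOURCE = """
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *out)
{
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""


def add_on_device(a_list, b_list):
    """Add two equal-length, non-empty float lists on an OpenCL device."""
    if cl is None:
        return None
    try:
        platform = cl.get_platforms()[0]      # query platforms
        device = platform.get_devices()[0]    # query devices
        context = cl.Context([device])        # create a context
    except (cl.Error, IndexError):
        return None

    a = np.asarray(a_list, dtype=np.float32)
    b = np.asarray(b_list, dtype=np.float32)
    out = np.empty_like(a)

    program = cl.Program(context, KERNEL_SOURCE).build()  # build the program
    kernel = cl.Kernel(program, "add")                    # kernel object
    flags = cl.mem_flags
    a_buf = cl.Buffer(context, flags.READ_ONLY, size=a.nbytes)
    b_buf = cl.Buffer(context, flags.READ_ONLY, size=b.nbytes)
    out_buf = cl.Buffer(context, flags.WRITE_ONLY, size=out.nbytes)
    queue = cl.CommandQueue(context)                      # command queue

    kernel.set_args(a_buf, b_buf, out_buf)   # select buffers as arguments
    cl.enqueue_copy(queue, a_buf, a)         # write the "input" buffers
    cl.enqueue_copy(queue, b_buf, b)
    cl.enqueue_nd_range_kernel(queue, kernel, (a.size,), None)  # one work-item per element
    cl.enqueue_copy(queue, out, out_buf)     # read the "output" buffer
    queue.finish()                           # synchronize with the host
    return out.tolist()


print(add_on_device([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

The `pyopencl` objects release their OpenCL handles when they are garbage-collected, so the explicit release step of the C API does not appear here.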

After OpenCL is no longer needed:
- Be sure all command queues are empty / have finished their kernel work.
- Release everything in the opposite order of creation.

If you need to accelerate open-source software, you can replace a parallelizable hotspot loop with a simple OpenCL kernel, provided the software doesn't already have some other acceleration support. For example, you can accelerate the air-pressure and heat-advection parts of The Powder Toy sandbox simulator.
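As an illustration of what "replacing a hotspot loop with a kernel" means, here is a hypothetical 1D heat-diffusion step (not taken from The Powder Toy's source) written first as a plain loop and then as an OpenCL C kernel string. The loop index becomes the work-item id; this works because each iteration reads only the old array and writes only its own cell.

```python
def diffuse_step(temps, rate):
    """One explicit diffusion step on a row of cells; the two end cells stay fixed."""
    result = list(temps)
    for i in range(1, len(temps) - 1):
        # Iterations are independent: they read temps and write only result[i].
        result[i] = temps[i] + rate * (temps[i - 1] - 2.0 * temps[i] + temps[i + 1])
    return result


# The same loop body as an OpenCL C kernel: one work-item per cell.
DIFFUSE_KERNEL = """
__kernel void diffuse_step(__global const float *temps,
                           __global float *result,
                           const float rate,
                           const int n)
{
    int i = get_global_id(0);
    if (i >= n) return;                 // padding work-items do nothing
    if (i == 0 || i == n - 1) {
        result[i] = temps[i];           // end cells stay fixed
        return;
    }
    result[i] = temps[i] + rate * (temps[i - 1] - 2.0f * temps[i] + temps[i + 1]);
}
"""

print(diffuse_step([0.0, 1.0, 0.0], 0.25))  # [0.0, 0.5, 0.0]
```

A loop whose iterations depend on results from earlier iterations cannot be converted this directly.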