I've been playing with the ATI OpenCL implementation in their Stream 2.0 beta. The OpenCL in the current beta only uses the CPU for now; the next version is supposed to support
I know this is an old question with old answers above, so I thought I would add an up-to-date answer.
Yes, a single implementation of your OpenCL kernels and host code will run on a wide variety of devices today, provided the platform and device enumeration code is written correctly. Writing that enumeration code is fairly easy; the tricky part is deciding which platform and device to use. You should probably either expose a configuration option in your app so the user can choose, or run a microbenchmark against each device, select the fastest one dynamically, and cache the benchmark result.
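The "benchmark each device, pick one, cache the result" idea can be sketched roughly as follows. This is only an illustration in Python, not real OpenCL host code: `pick_fastest`, the `run_benchmark` callback (assumed to return elapsed seconds for a named device), and the JSON cache layout are all hypothetical names chosen for the example.

```python
import json
import os

def pick_fastest(candidates, run_benchmark, cache_path):
    """Return the candidate with the lowest benchmark time, or None if there are none.

    candidates    -- list of device names (hypothetical labels, e.g. "Intel/CPU")
    run_benchmark -- hypothetical callback: name -> elapsed seconds (float)
    cache_path    -- JSON file where the previous choice is stored
    """
    if not candidates:
        return None

    # Reuse the cached choice only if that device is still present.
    if os.path.exists(cache_path):
        try:
            with open(cache_path) as f:
                cached = json.load(f).get("choice")
        except (OSError, ValueError, AttributeError):
            cached = None  # unreadable or malformed cache: fall through and re-benchmark
        if cached in candidates:
            return cached

    # Time every candidate once and keep the fastest.
    timings = {name: run_benchmark(name) for name in candidates}
    choice = min(timings, key=timings.get)

    with open(cache_path, "w") as f:
        json.dump({"choice": choice, "timings": timings}, f)
    return choice
```

The cache is ignored when the cached device is no longer in the list, which covers the case where a GPU or a driver has been removed since the last run. A user-facing configuration option could simply bypass this function.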
People can and will have more than one platform. For example, my system has two GTX 580s in SLI, so it has two devices in the NVIDIA platform. It also has the Intel OpenCL SDK installed, so my Core i7 990X Extreme CPU shows up as a device in the Intel platform.
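Yes, a binary developed and built using, for example, the NVIDIA OpenCL SDK will work with the ATI or Intel OpenCL implementations, and vice versa, because the application talks to the common OpenCL loader library rather than directly to one vendor's driver. There is no need to worry about that anymore.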
Obviously, an end user might not have OpenCL installed at all, so you may need to delay-load opencl.dll, or load it at run time with LoadLibrary and resolve the entry points dynamically, instead of linking against it directly.
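As a rough illustration of loading OpenCL at run time and coping with its absence, here is a minimal sketch using Python's `ctypes`. A C or C++ application would do the equivalent with LoadLibrary/GetProcAddress or dlopen/dlsym. The function name `count_opencl_platforms` and the list of library file names tried are assumptions for the example; only `clGetPlatformIDs` is an actual OpenCL entry point.

```python
import ctypes
import ctypes.util
import os

def count_opencl_platforms():
    """Return the number of OpenCL platforms, or None if no OpenCL library can be loaded."""
    # OpenCL entry points use stdcall on 32-bit Windows, hence WinDLL there.
    loader = ctypes.WinDLL if os.name == "nt" else ctypes.CDLL

    # Assumed library names; find_library covers other setups (it may return None).
    names = ["OpenCL.dll", "libOpenCL.so.1", ctypes.util.find_library("OpenCL")]
    lib = None
    for name in names:
        if not name:
            continue
        try:
            lib = loader(name)
            break
        except OSError:
            continue  # not present under this name, try the next one
    if lib is None:
        return None  # no OpenCL on this machine: fall back to a non-OpenCL code path

    try:
        get_ids = lib.clGetPlatformIDs
    except AttributeError:
        return None  # library loaded but does not export the expected symbol

    # cl_int clGetPlatformIDs(cl_uint num_entries, cl_platform_id *platforms, cl_uint *num_platforms)
    get_ids.argtypes = [ctypes.c_uint32,
                        ctypes.POINTER(ctypes.c_void_p),
                        ctypes.POINTER(ctypes.c_uint32)]
    get_ids.restype = ctypes.c_int32

    count = ctypes.c_uint32(0)
    err = get_ids(0, None, ctypes.byref(count))  # query the count only
    if err != 0:  # CL_SUCCESS is 0; treat any error as "no usable platforms"
        return 0
    return count.value
```

Returning `None` versus `0` lets the caller distinguish "OpenCL is not installed" from "OpenCL is installed but reports no platforms".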
I strongly suggest testing your code with the Intel OpenCL SDK, on NVIDIA GPUs, and on AMD GPUs. You will probably find bugs that cause problems on one platform but go unnoticed on the others. You will also probably find that perfectly good code mysteriously gives incorrect results on one of those platforms because of driver bugs.
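One way to catch such discrepancies is to run the same kernel on every platform and compare each output against a trusted reference, for example a plain CPU implementation. The sketch below assumes the results have already been read back into Python lists; `find_mismatches` and the tolerance defaults are hypothetical and would need tuning for a real kernel.

```python
import math

def find_mismatches(reference, results_by_platform, rel_tol=1e-5, abs_tol=1e-6):
    """Map each platform name to the indices where its output differs from the reference.

    Platforms whose output matches everywhere are left out of the returned dict.
    Tolerances are assumed defaults; floating-point results legitimately vary
    slightly between devices.
    """
    mismatches = {}
    for platform, values in results_by_platform.items():
        if len(values) != len(reference):
            raise ValueError(f"{platform}: expected {len(reference)} values, got {len(values)}")
        bad = [i for i, (ref, val) in enumerate(zip(reference, values))
               if not math.isclose(ref, val, rel_tol=rel_tol, abs_tol=abs_tol)]
        if bad:
            mismatches[platform] = bad
    return mismatches
```

An empty result means every platform agreed with the reference within tolerance; a platform that shows up here is worth rechecking against a different driver version before blaming your own code.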