A bit of background:
I am getting started with GPGPU (OpenCL) and am using a Java wrapper (jogamp.jocl), hoping that it will provide me with a way to abstract the low lev
I would say that, for higher-level abstractions, the same patterns apply as in distributed computing. Not necessarily the concurrency patterns, but all of those that help split work into tasks that can execute in parallel and independently of each other, for example map/reduce. A CLCommandQueue would be used like a worker thread. It's basically just an interface to an abstract device (a piece of hardware).
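To make the map/reduce idea concrete, here is a minimal sketch in plain Java. It deliberately does not use JOCL: the threads of an `ExecutorService` only stand in for command queues, and the class and method names (`MapReduceSketch`, `sumOfSquares`, `partialSumOfSquares`) are made up for illustration. The point is only the shape of the pattern: split the input into independent slices, process each slice on its own worker, then combine the partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: worker threads play the role a command queue
// would play in OpenCL. No OpenCL or JOCL calls are made here.
public class MapReduceSketch {

    // Map step: one worker squares and sums its own slice [from, to),
    // without touching any other slice.
    static long partialSumOfSquares(int[] data, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += (long) data[i] * data[i];
        }
        return sum;
    }

    static long sumOfSquares(int[] data, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Ceiling division so that every element lands in some slice.
            int chunk = (data.length + workers - 1) / workers;
            List<Future<Long>> partials = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> partialSumOfSquares(data, from, to);
                partials.add(pool.submit(task));
            }
            // Reduce step: combine the independent partial results.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(sumOfSquares(data, 3)); // prints 204
    }
}
```

With a real OpenCL wrapper the map step would presumably be a kernel enqueued on a command queue instead of a `Callable`, but the splitting and combining logic on the host side would look much the same.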
Architectural Patterns for Parallel Programming
Ralph Johnson on Parallel Programming Patterns