In OpenCL, is there a maximum value that X, Y, and Z can take when using __attribute__((reqd_work_group_size(X, Y, Z))) in kernel code?