I have a question about the new compute shaders. I am currently working on a particle system. I store all my particles in a shader storage buffer so I can access them in the compute shader. Then I dispatch a one-dimensional set of work groups.
#define WORK_GROUP_SIZE 128
_shaderManager->useProgram("computeProg");
glDispatchCompute((_numParticles/WORK_GROUP_SIZE), 1, 1);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
My compute shader:
#version 430
struct particle{
vec4 currentPos;
vec4 oldPos;
};
layout(std430, binding=0) buffer particles{
particle p[];
};
layout (local_size_x = 128, local_size_y = 1, local_size_z = 1) in;
void main(){
uint gid = gl_GlobalInvocationID.x;
p[gid].currentPos.x += 100;
}
But somehow not all particles are affected. I am doing it the same way as in this example, but it doesn't work. http://education.siggraph.org/media/conference/S2012_Materials/ComputeShader_6pp.pdf
Edit:
After calling glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT), I continue like this:
_shaderManager->useProgram("shaderProg");
glBindBuffer(GL_ARRAY_BUFFER, shaderStorageBufferID);
glVertexPointer(4,GL_FLOAT,sizeof(glm::vec4), (void*)0);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_POINTS, 0, _numParticles);
glDisableClientState(GL_VERTEX_ARRAY);
So which bit would be appropriate to use in this case?
You have your barriers on backwards. It's a common problem.
The bits you give to the barrier describe how you intend to use the data written, not how the data was written. GL_SHADER_STORAGE_BARRIER_BIT would only be appropriate if you had some process that wrote to a buffer object via image load/store (or a storage buffer/atomic counters), then used a storage buffer to read that buffer object data.
Since you're reading the buffer as a vertex attribute array buffer, you should use the cleverly titled GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT.
I resolved the problem. It was just the number of work groups I dispatched. numParticles/WORK_GROUP_SIZE is rounded down because both operands are integers, so too few work groups were dispatched whenever the particle count was not a multiple of the work group size.
With 1000 particles, only 1000/128 = 7 work groups are dispatched. Every work group has a size of 128, so I get 7*128 = 896 threads, and the remaining 104 particles won't move at all. Since numParticles%128 can range from 0 to 127, I just dispatched one more work group:
glDispatchCompute((_numParticles/WORK_GROUP_SIZE)+1, 1, 1);
And every particle moves from now on. :)
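As a side note, adding one unconditionally also dispatches a spare work group when the particle count is an exact multiple of 128, and the surplus invocations then index past the last particle. A minimal sketch of an alternative, assuming a ceiling division on the host plus a bounds guard in the shader: workGroupCount and simulateDispatch below are hypothetical helper names, and simulateDispatch is only a CPU stand-in for the compute shader, not OpenGL code.

```cpp
#include <vector>

constexpr unsigned WORK_GROUP_SIZE = 128;

// Ceiling division: the smallest number of work groups whose
// invocations cover every particle (hypothetical helper).
constexpr unsigned workGroupCount(unsigned numParticles,
                                  unsigned groupSize = WORK_GROUP_SIZE) {
    return (numParticles + groupSize - 1) / groupSize;
}

// CPU stand-in for the compute shader, for illustration only.
// Each loop iteration plays the role of one invocation, with gid
// corresponding to gl_GlobalInvocationID.x. Returns the number of
// particles that were actually moved.
unsigned simulateDispatch(std::vector<float>& posX, unsigned groups) {
    const unsigned numParticles = static_cast<unsigned>(posX.size());
    unsigned moved = 0;
    for (unsigned gid = 0; gid < groups * WORK_GROUP_SIZE; ++gid) {
        if (gid >= numParticles) {
            continue;  // guard: surplus invocations touch nothing
        }
        posX[gid] += 100.0f;
        ++moved;
    }
    return moved;
}
```

With this helper the dispatch would read glDispatchCompute(workGroupCount(_numParticles), 1, 1). For the guard to work in the real shader, the particle count has to be made available to it, for example through a uniform (an assumption; the original shader has no such uniform).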
Source: https://stackoverflow.com/questions/12742142/opengl-compute-shader-invocations