Consider the following problem. You have a bit-string that represents the currently scheduled slave in one-hot encoding. For example, "00000100" (with the leftmost bit being #7
This should do what you want:
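number_of_tasks= <number of tasks, in the example this is 8>
// current= one-hot current task; next is taken to be the request mask
next_mask= current | (current - 1);            // current bit and every bit below it
next_barrel= next | (next << number_of_tasks); // duplicate the request bits
next_barrel&= ~next_mask;                      // drop the current task and everything below it
next_barrel&= -next_barrel;                    // keep only the lowest remaining set bit
next_barrel|= next_barrel >> number_of_tasks;  // fold a hit in the high copy back down
next_task_mask= next_barrel & -next_barrel;    // lowest set bit is the next task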
Basically, duplicate the bits of the next task mask, mask off the bits we don't want to consider, find the lowest set bit, fold the high bits back in, then take the lowest bit set. This runs in constant time.
Edit: Updated to take into account cases such as current == 00010000 with a request mask (next in the code above) of 00111000.
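For anyone who wants to try the steps above in software, here is a minimal sketch in Python. The function name `next_task` and the parameter names are mine, not part of the answer, and it assumes `current` is one-hot and that `requests` fits in `n` bits.

```python
def next_task(current: int, requests: int, n: int) -> int:
    """Return the one-hot mask of the next requesting task after `current`.

    Assumes `current` is one-hot and `requests` fits in `n` bits.
    """
    below = current | (current - 1)       # current bit and every bit below it
    barrel = requests | (requests << n)   # two copies of the request bits
    barrel &= ~below                      # discard current and lower positions
    barrel &= -barrel                     # isolate the lowest remaining bit
    barrel |= barrel >> n                 # fold a hit in the high copy back down
    return barrel & -barrel               # lowest set bit is the next task


print(f"{next_task(0b00010000, 0b00111000, 8):08b}")  # 00100000
print(f"{next_task(0b00100000, 0b00111000, 8):08b}")  # 00001000 (wrapped around)
```

The second call shows the wrap-around: nothing above bit 5 is requesting, so the hit comes from the duplicated high copy and is folded back down to bit 3. With no requests at all the function returns 0.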
Complete parametrizable arbiter implementation that can be configured for round-robin or priority arbitration:
https://github.com/alexforencich/verilog-axis/blob/master/rtl/arbiter.v
This design uses a pair of priority encoders to select the next output in the sequence. The priority encoders used are implemented efficiently as trees.
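As a rough behavioral illustration of the two-priority-encoder idea (this is not taken from the linked arbiter.v, and the function names are hypothetical), one encoder can look only at requests above the current grant while the other looks at all requests and is used as the wrap-around fallback. The sketch assumes `current` is one-hot and models only the selection logic, not the tree structure.

```python
def lowest_set_bit(x: int) -> int:
    """Model of a one-hot priority encoder: keep only the lowest set bit."""
    return x & -x


def round_robin_grant(current: int, requests: int) -> int:
    """Grant the next requester above `current`, wrapping to the lowest one.

    Assumes `current` is one-hot.
    """
    above = requests & ~(current | (current - 1))  # requesters above current
    masked = lowest_set_bit(above)        # encoder 1: masked requests
    unmasked = lowest_set_bit(requests)   # encoder 2: all requests
    return masked if masked else unmasked  # fall back when nothing is above
```

When the masked encoder finds nothing, the unmasked result is the wrap-around choice, so no barrel duplication is needed.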
Untested, but off the top of my head, I'd be surprised if this didn't produce a reasonable synthesis... Has the advantage of being relatively readable (to me anyway) unlike typical bit-twiddling hacks.
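-- current: one-hot grant, mask: request bits
for i in current'range loop
  current := rotate_left(current, 1);
  if or_reduce(mask and current) = '1' then
    -- first requester after the previous grant: keep it and stop rotating
    current := mask and current;
    exit;
  end if;
end loop;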
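A quick software model of the rotate-and-test idea, stopping at the first requester found, is easy to check by hand. This is only a sketch: the helper names are mine, `n` is the assumed vector width, and `current` is assumed to be one-hot.

```python
def rotate_left(x: int, n: int) -> int:
    """Rotate an n-bit value left by one position."""
    return ((x << 1) | (x >> (n - 1))) & ((1 << n) - 1)


def next_by_rotation(current: int, mask: int, n: int) -> int:
    """Step the one-hot `current` left until it lands on a requesting bit."""
    for _ in range(n):
        current = rotate_left(current, n)
        if current & mask:
            break  # first requester after the old grant
    return current
```

If the loop kept rotating after a hit, it would always finish a full turn and end up back on the original grant, which is why the early stop matters. With no requests at all, `current` comes back unchanged after `n` rotations.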