I want to synchronize data access between CUDA threads using global device memory. My current implementation does not make changes to a variable visible to other threads in the