I'm trying to implement the Floyd-Warshall algorithm using CUDA, but I'm having a synchronization problem. This is my code:
__global__ void run_on_gpu(const int g