I am writing a fluid simulator as a personal project. I wrote a first version using NumPy, and now I want to port it to CuPy to make it faster. The main algorithm of my fluid simulator is