The below code works on a single GPU but throws an error while using multiple gpus RuntimeError: grad can be implicitly created only for scalar outputs
code