Broadly there are two ways to accumulate gradients over several batches:

1. Sum the per-batch losses into a single tensor and call `loss.backward()` once on the total. This is simple, but it keeps every batch's autograd graph alive until the final backward pass, so memory grows with the number of accumulated batches.
2. Call `loss.backward()` on every batch, but only call `optimizer.step()` and `optimizer.zero_grad()` every `N` batches. Because `backward()` adds into `param.grad` rather than overwriting it, the gradients from successive batches sum automatically.
The second approach looks like this (here `accum_steps` is the number of batches to accumulate; divide the loss by it so the accumulated gradient is an average rather than a sum):

```python
for i, batch in enumerate(loader):
    loss = compute_loss(model, batch) / accum_steps
    loss.backward()                      # gradients accumulate in .grad
    if (i + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```
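As a sanity check, the accumulated gradient from several small backward calls should match the gradient of one backward pass over the full batch. Below is a minimal, self-contained sketch of that equivalence; the model, data shapes, and the `accum_steps` name are illustrative, not from any particular codebase. Note the loss is divided by `accum_steps` because `MSELoss` averages within each micro-batch, so the scaled sums reproduce the full-batch mean.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(3, 1)
x = torch.randn(8, 3)
y = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss()
accum_steps = 4  # 8 samples -> 4 micro-batches of 2

# Accumulate gradients over micro-batches without stepping.
model.zero_grad()
for xb, yb in zip(x.chunk(accum_steps), y.chunk(accum_steps)):
    # Scale so the summed micro-batch gradients equal the full-batch average.
    (loss_fn(model(xb), yb) / accum_steps).backward()
accum_grads = [p.grad.clone() for p in model.parameters()]

# Reference: one backward pass over the whole batch.
model.zero_grad()
loss_fn(model(x), y).backward()
full_grads = [p.grad for p in model.parameters()]

match = all(
    torch.allclose(a, f, atol=1e-6) for a, f in zip(accum_grads, full_grads)
)
print(match)
```

If the per-micro-batch loss is not divided by `accum_steps`, the accumulated gradient is `accum_steps` times larger than the full-batch gradient, which effectively multiplies the learning rate.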