I have seen some discussion lately about whether there is a difference between a counter implemented using atomic increment/load, and one using a mutex to synchronise increment/load.
There is no difference in behavior. There is a difference in performance.
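To make the comparison concrete, here is a minimal sketch of the two counters. The type names `AtomicCounter` and `MutexCounter` and the helper `hammer` are made up for illustration and do not come from any particular codebase. Under these assumptions, both counters end up with the same total when incremented concurrently.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// AtomicCounter is a hypothetical counter built on sync/atomic.
type AtomicCounter struct {
	n int64
}

// Inc adds one to the counter with a single atomic add.
func (c *AtomicCounter) Inc() { atomic.AddInt64(&c.n, 1) }

// Load returns the current value with an atomic load.
func (c *AtomicCounter) Load() int64 { return atomic.LoadInt64(&c.n) }

// MutexCounter is a hypothetical counter guarded by a sync.Mutex.
type MutexCounter struct {
	mu sync.Mutex
	n  int64
}

// Inc adds one to the counter while holding the lock.
func (c *MutexCounter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Load returns the current value while holding the lock.
func (c *MutexCounter) Load() int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer calls inc perGoroutine times from each of the given number
// of goroutines and waits for all of them to finish.
func hammer(inc func(), goroutines, perGoroutine int) {
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				inc()
			}
		}()
	}
	wg.Wait()
}

func main() {
	var a AtomicCounter
	var m MutexCounter
	hammer(a.Inc, 8, 1000)
	hammer(m.Inc, 8, 1000)
	fmt.Println(a.Load(), m.Load()) // prints "8000 8000"
}
```

Callers see the same `Inc`/`Load` interface either way; only the synchronisation mechanism underneath differs.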
Mutexes are slow, both because of the setup and teardown around each operation and because they block other goroutines for as long as the lock is held.
Atomic operations are fast because they use an atomic CPU instruction, rather than relying on external locks.
Therefore, whenever it is feasible, atomic operations should be preferred.