Short version: can't we store the variables on one of the workers instead of using parameter servers?
Long version: I want to implement synchronous training across multiple workers, but without dedicating separate machines as parameter servers.
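Concretely, I was picturing something like this (a minimal TF 1.x graph-mode sketch; the hostnames and variable shapes are just placeholders):

```python
import tensorflow as tf

# Hypothetical two-worker cluster with no "ps" job at all.
cluster = tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"]
})
server = tf.train.Server(cluster, job_name="worker", task_index=0)

# Pin the model variables to worker 0; compute ops stay on each local worker.
with tf.device("/job:worker/task:0"):
    w = tf.Variable(tf.zeros([784, 10]), name="weights")
    b = tf.Variable(tf.zeros([10]), name="bias")
```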
Another possibility is to use a distributed version of TensorFlow that handles data distribution and execution across multiple nodes automatically, using MPI as the backend.
We have recently developed one such version at MaTEx: https://github.com/matex-org/matex, along with a paper describing it: https://arxiv.org/abs/1704.04560
It does synchronous training and provides parallel dataset readers for several formats.
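For intuition, the core of MPI-backed synchronous training looks roughly like the sketch below. This is a generic mpi4py illustration, not MaTEx's actual API: every rank computes gradients on its shard of the data, and an allreduce averages them so all workers apply the same update (`average_gradients` is a hypothetical helper).

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMm_WORLD if False else MPI.COMM_WORLD  # standard world communicator
size = comm.Get_size()

def average_gradients(local_grads):
    """Allreduce-sum each gradient (a NumPy array) across ranks, then average."""
    averaged = []
    for g in local_grads:
        buf = np.zeros_like(g)
        comm.Allreduce(g, buf, op=MPI.SUM)  # sum the gradient across all ranks
        averaged.append(buf / size)         # divide by world size to average
    return averaged
```

Because every rank participates in the same allreduce each step, no parameter-server process is needed: the averaged gradients (and hence the variables) stay consistent on all workers.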
We will be happy to assist if you need anything else!