I'm doubting whether TensorFlow is correctly configured on my GPU box, since it's about 100x slower per iteration to train a simple linear regression model (batchsize = 32, 15
Extending Yaroslav's answer: here is how to run the entire test (assuming CUDA and cuDNN are already installed).
Clone the TensorFlow models repository
git clone https://github.com/tensorflow/models.git
Create a virtual environment for TensorFlow and install TensorFlow into it
virtualenv --system-site-packages -p python3 tf-venv3
source tf-venv3/bin/activate
pip install --upgrade pip
pip install --upgrade tensorflow-gpu
Run the model inside the virtual environment
python models/tutorials/image/mnist/convolutional.py
My GTX 1070 needs ~5 ms per step.
Note: on a GeForce GTX 1050 Ti it takes ~10 ms per step.
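To compare your own box against these numbers, it may help to average the per-step times from the training log rather than eyeballing them. The sketch below is only an illustration: the helper name avg_step_ms and the file name mnist.log are hypothetical, and it assumes the script prints lines shaped like "Step 100 (epoch 0.12), 5.3 ms", which may differ between versions of the models repository.

```shell
# avg_step_ms (hypothetical helper): read a training log on stdin and
# print the mean milliseconds per step.
# Assumption: timing lines look like "Step 100 (epoch 0.12), 5.3 ms".
avg_step_ms() {
  LC_ALL=C awk '
    # Keep only the timing lines; the number sits just before the "ms" unit.
    /^Step [0-9]+ .* ms$/ { sum += $(NF-1); n++ }
    END {
      if (n) printf "%.1f\n", sum / n
      else   print "no step lines found"
    }'
}

# Usage, inside the virtual environment:
#   python models/tutorials/image/mnist/convolutional.py 2>&1 | tee mnist.log
#   avg_step_ms < mnist.log
```

If the average comes out far above the figures quoted here for a comparable card, that suggests TensorFlow may be falling back to the CPU.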