TensorFlow: same code but different results on CPU device vs. GPU device

Submitted by 妖精的绣舞 on 2019-12-06 11:21:31

Question


I am trying to write a program to test TensorFlow performance on a GPU device. The test data is MNIST, trained with a supervised multilayer perceptron (neural network). I followed this simple example, but I changed the number of batch gradient descent iterations to 10000:

for i in range(10000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    if i % 500 == 0:
        print(i)

Eventually, when I check the prediction accuracy with this code:

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
print(tf.convert_to_tensor(mnist.test.images).get_shape())

it turns out that the accuracy differs between the CPU and the GPU: the GPU returns an accuracy of approximately 0.9xx, while the CPU returns only 0.3xx. Does anyone know the reason, or why this issue can happen?


Answer 1:


There are two primary reasons for this kind of behavior (besides bugs).

Numerical stability

It turns out that adding numbers is not as easy as it might seem. Let's say I want to add a trillion 2's together. The correct answer is two trillion. But if you add these together in floating point on a machine with a word size of only, say, 32 bits, after a while your running sum will get stuck at a smaller value. The reason is that, eventually, the 2's you are adding fall below the smallest bit of the mantissa of the floating-point sum.
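As a quick illustration of this (a sketch I'm adding, not part of the original answer): a 32-bit float has a 24-bit significand, so once a sum reaches 2^24, adding 1.0 no longer changes it, while a 64-bit float still absorbs the addition.

```python
import numpy as np

big = np.float32(16_777_216.0)        # 2**24, the float32 significand limit
print(big + np.float32(1.0) == big)   # True: the 1.0 falls below the mantissa

# The same addition in float64 (53-bit significand) is still exact:
print(np.float64(16_777_216.0) + np.float64(1.0) == np.float64(16_777_216.0))  # False
```

This is why the *order* and *precision* of reductions matter: CPU and GPU kernels typically sum in different orders and with different intermediate precision, so long accumulations can diverge.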

These kinds of issues abound in numerical computing, and this particular discrepancy is a known issue in TensorFlow (1, 2, to name a few). It's possible that you're seeing an effect of this.

Initial conditions

Training a neural net is a stochastic process, and as such it depends on your initial conditions. Sometimes, especially if your hyperparameters are not tuned very well, your net will get stuck near a poor local minimum and you'll end up with mediocre behavior. Adjusting your optimizer parameters (or, better, using an adaptive method like Adam) might help here.
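To see how initial conditions alone can change the outcome, here is a minimal sketch (my own toy example, not from the original answer): plain gradient descent on the non-convex function f(x) = x^4 - 3x^2 + x lands in a different local minimum depending only on where it starts.

```python
def grad(x):
    # Gradient of f(x) = x**4 - 3*x**2 + x, a simple loss with two minima
    return 4 * x**3 - 6 * x + 1

def descend(x, lr=0.01, steps=2000):
    # Plain gradient descent from starting point x
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Identical code, identical hyperparameters; only the start differs:
print(descend(-2.0))  # converges to the deeper minimum, near x = -1.30
print(descend(+2.0))  # converges to the shallower minimum, near x = +1.13
```

A neural net's loss surface is vastly more complicated than this, but the principle is the same: two runs that differ only in their random weight initialization (or in the tiny numerical perturbations between devices) can settle into regions of very different quality.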

Of course, with all that said, this is a fairly large difference, so I'd double-check your results before blaming it on the underlying math package or bad luck.



Source: https://stackoverflow.com/questions/43221730/tensorflow-same-code-but-get-different-result-from-cpu-device-to-gpu-device
