Question:
I am writing a tensorflow.Keras wrapper to perform ML experiments.
I need my framework to be able to perform an experiment as specified in a configuration yaml file and run in parallel on a GPU.
Then I need a guarantee that if I ran the experiment again I would get, if not exactly the same results, something reasonably close.
To try to ensure this, my training script contains these lines at the beginning, following the guidelines in the official documentation:
# Set up random seeds
import random
import numpy as np
import tensorflow as tf

random.seed(seed)         # Python's built-in RNG
np.random.seed(seed)      # NumPy's global RNG
tf.set_random_seed(seed)  # TensorFlow's graph-level seed (TF 1.x API)
This has proven not to be enough.
I ran the same configuration 4 times and plotted the results. [Plot omitted: metric curves for the 4 runs.] As you can see, results vary a lot between runs.
How can I set up a training session in Keras to ensure that I get reasonably similar results when training on a GPU? Is this even possible?
The full training script can be found here.
Some of my colleagues are using just pure TF, and their results seem far more consistent. What is more, they do not seem to be seeding any randomness except to ensure that the train and validation split is always the same.
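For reference, a minimal sketch of what such split-only seeding might look like (a hypothetical helper; train_val_split, data, and labels are illustrative names, not from the original script):

import numpy as np

def train_val_split(data, labels, val_fraction=0.2, split_seed=42):
    # A dedicated RNG keeps the split deterministic without touching global state.
    rng = np.random.RandomState(split_seed)
    idx = rng.permutation(len(data))
    n_val = int(len(data) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (data[train_idx], labels[train_idx]), (data[val_idx], labels[val_idx])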
Answer 1:
Keras + TensorFlow.
Step 1: disable the GPU.
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # order devices by PCI bus id
os.environ["CUDA_VISIBLE_DEVICES"] = ""         # hide all GPUs from TensorFlow
Step 2: seed the libraries your code uses, e.g. tensorflow, numpy, and random.
import tensorflow as tf
import numpy as np
import random as rn

sd = 1  # Here sd means seed.
np.random.seed(sd)
rn.seed(sd)
os.environ['PYTHONHASHSEED'] = str(sd)  # note: hash randomization is fixed at interpreter startup, so ideally export this before launching Python

from keras import backend as K

# Single-threaded execution removes nondeterminism from op scheduling.
config = tf.ConfigProto(intra_op_parallelism_threads=1,
                        inter_op_parallelism_threads=1)
tf.set_random_seed(sd)  # TF 1.x graph-level seed
sess = tf.Session(graph=tf.get_default_graph(), config=config)
K.set_session(sess)
Make sure both snippets are placed at the very start of your script; the results should then be reproducible.
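A quick way to sanity-check the setup is to train the same toy model twice in one process and compare the final losses. A minimal sketch, assuming the old keras package on a TF 1.x backend (the toy model and data below are made up for illustration):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""   # CPU only, as in Step 1
os.environ['PYTHONHASHSEED'] = '1'
import random as rn
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense

def run_once(sd=1):
    # Re-seed everything and rebuild the model from a clean graph.
    np.random.seed(sd)
    rn.seed(sd)
    K.clear_session()
    tf.set_random_seed(sd)
    config = tf.ConfigProto(intra_op_parallelism_threads=1,
                            inter_op_parallelism_threads=1)
    K.set_session(tf.Session(graph=tf.get_default_graph(), config=config))
    model = Sequential([Dense(8, activation='relu', input_shape=(4,)),
                        Dense(1)])
    model.compile(optimizer='sgd', loss='mse')
    x, y = np.random.rand(64, 4), np.random.rand(64, 1)
    return model.fit(x, y, epochs=3, verbose=0).history['loss'][-1]

print(run_once() == run_once())  # expected: True when running on CPU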
Answer 2:
Try adding a seed parameter to your weight/bias initializers. Just to add more specifics to Alexander Ejbekov's comment:
TensorFlow has two kinds of random seed: graph-level and op-level. If you are using more than one graph, you need to set the seed in every one. You can override the graph-level seed with an op-level one by passing a seed parameter to the function, and you can even make two functions from different graphs output the same value if the same seed is set. Consider this example:
import tensorflow as tf

g1 = tf.Graph()
with g1.as_default():
    tf.set_random_seed(1)  # graph-level seed
    # 'a' uses the graph-level seed; 'b' overrides it with an op-level seed.
    a = tf.get_variable('a', shape=(1,),
                        initializer=tf.keras.initializers.glorot_normal())
    b = tf.get_variable('b', shape=(1,),
                        initializer=tf.keras.initializers.glorot_normal(seed=2))
with tf.Session(graph=g1) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))
    print(sess.run(b))

g2 = tf.Graph()
with g2.as_default():
    # Op-level seed only; no graph-level seed is set in g2.
    a1 = tf.get_variable('a1', shape=(1,),
                         initializer=tf.keras.initializers.glorot_normal(seed=1))
with tf.Session(graph=g2) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a1))
In this example, the output of a is the same as the output of a1, but b is different.
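Applied to a Keras model, the same op-level idea means giving every initializer an explicit seed, as the first sentence of this answer suggests. A hypothetical sketch (the layer sizes are arbitrary):

from keras.models import Sequential
from keras.layers import Dense
from keras.initializers import glorot_normal

# With per-initializer seeds, the initial weights no longer depend on
# the graph-level seed at all.
model = Sequential([
    Dense(32, activation='relu', input_shape=(10,),
          kernel_initializer=glorot_normal(seed=1)),
    Dense(1, kernel_initializer=glorot_normal(seed=2)),
])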
Source: https://stackoverflow.com/questions/55200768/structuring-a-keras-project-to-achieve-reproducible-results-in-gpu