Question
I'd like to adapt the recurrent autoencoder from this blog post to work in a federated environment.
I've modified the model slightly to conform with the example shown in the TFF image classification tutorial.
import tensorflow as tf
import tensorflow_federated as tff

def create_compiled_keras_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(2, input_shape=(10, 2), name='Encoder'),
        tf.keras.layers.RepeatVector(10, name='Latent'),
        tf.keras.layers.LSTM(2, return_sequences=True, name='Decoder')]
    )
    model.compile(loss='mse', optimizer='adam')
    return model

model = create_compiled_keras_model()
sample_batch = gen(1)  # gen() is defined below
timesteps, input_dim = 10, 2

def model_fn():
    keras_model = create_compiled_keras_model()
    return tff.learning.from_compiled_keras_model(keras_model, sample_batch)
The gen function is defined as follows:
import random
import numpy as np

def gen(batch_size):
    seq_length = 10
    batch_x = []
    batch_y = []
    for _ in range(batch_size):
        # Random phase shift so every sequence starts at a different point.
        rand = random.random() * 2 * np.pi
        sig1 = np.sin(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))
        sig2 = np.cos(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))
        # First half of each signal is the input, second half the continuation.
        x1 = sig1[:seq_length]
        y1 = sig1[seq_length:]
        x2 = sig2[:seq_length]
        y2 = sig2[seq_length:]
        x_ = np.array([x1, x2])
        y_ = np.array([y1, y2])
        x_, y_ = x_.T, y_.T
        batch_x.append(x_)
        batch_y.append(y_)
    batch_x = np.array(batch_x)
    batch_y = np.array(batch_y)
    # Autoencoder: the input is also the target, so batch_y is not returned.
    return batch_x, batch_x  # batch_y
So far I've been unable to find any documentation that does not rely on the sample data from the TFF repository.
How can I modify this to create a federated data set and begin training?
Answer 1:
At a very high level, to use an arbitrary dataset with TFF, the following steps are needed:
- Partition the dataset into per-client subsets (how to do so is a much larger question)
- Create a tf.data.Dataset per client subset
- Pass a list of all (or a subset) of the Dataset objects to the federated optimization
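The three steps above can be sketched roughly as follows. This is only a minimal illustration, not the tutorial's code: make_client_arrays is a hypothetical stand-in for the gen function from the question, make_client_dataset, NUM_CLIENTS and the batch size are made-up names and values, and each simulated client simply receives its own freshly generated examples rather than a slice of a real dataset.

```python
import numpy as np
import tensorflow as tf

SEQ_LENGTH, INPUT_DIM = 10, 2  # matches input_shape=(10, 2) in the question


def make_client_arrays(num_examples, rng):
    """Hypothetical stand-in for gen(): phase-shifted sin/cos windows."""
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(num_examples, 1))
    t = np.linspace(0.0, 1.5 * np.pi, SEQ_LENGTH)[None, :] + phases
    x = np.stack([np.sin(t), np.cos(t)], axis=-1).astype(np.float32)
    return x, x  # autoencoder: the target is the input itself


def make_client_dataset(x, y, batch_size=5):
    """Step 2: wrap one client's arrays in a tf.data.Dataset."""
    return tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)


rng = np.random.default_rng(0)
NUM_CLIENTS = 3  # assumed number of simulated clients

# Step 1 (partitioning) is trivial here: every client gets 20 examples of its own.
# Step 3: the resulting Python list is what gets handed to the federated optimization.
federated_train_data = [
    make_client_dataset(*make_client_arrays(20, rng)) for _ in range(NUM_CLIENTS)
]
```

With a real dataset, step 1 would replace the generator call with whatever rule assigns examples to clients (by user, by device, by region, and so on).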
What is happening in the tutorial
The Federated Learning for Image Classification tutorial uses tff.learning.build_federated_averaging_process to build up a federated optimization using the FedAvg algorithm.
In that notebook, the following code executes one round of federated optimization, where the client datasets are passed to the process's .next method:
state, metrics = iterative_process.next(state, federated_train_data)
Here federated_train_data is a Python list of tf.data.Dataset objects, one per client participating in the round.
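Running more than one round is a matter of calling .next repeatedly and feeding the returned state back in. A minimal sketch, assuming the process exposes initialize() alongside next() and that next() returns a (state, metrics) pair as in the line above; run_rounds, clients_per_round and seed are hypothetical names introduced only for this illustration:

```python
import random


def run_rounds(iterative_process, client_datasets, num_rounds,
               clients_per_round, seed=0):
    """Run several rounds, sampling a subset of clients for each one."""
    rng = random.Random(seed)
    state = iterative_process.initialize()
    history = []
    for _ in range(num_rounds):
        # Each round receives a plain Python list of client datasets.
        sampled = rng.sample(client_datasets, clients_per_round)
        state, metrics = iterative_process.next(state, sampled)
        history.append(metrics)
    return state, history
```

Sampling a fresh subset each round mirrors the "all (or a subset)" wording above; passing the full list every round is the special case where clients_per_round equals the number of clients.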
The ClientData object
The canned datasets provided by TFF (under tff.simulation.datasets) are implemented using the tff.simulation.ClientData interface, which manages the client → dataset mapping and tf.data.Dataset creation.
If you're planning to re-use a dataset, implementing it as a tff.simulation.ClientData may make future use easier.
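To give a feel for what that interface is responsible for, here is a rough sketch of the client → dataset mapping. InMemoryClientData is a hypothetical class that does not subclass tff.simulation.ClientData; it only mirrors two members of that interface (client_ids and create_tf_dataset_for_client), and the random arrays stand in for real per-client data.

```python
import numpy as np
import tensorflow as tf


class InMemoryClientData:
    """Hypothetical stand-in mirroring two members of the ClientData interface."""

    def __init__(self, arrays_by_client):
        # Maps a client id (str) to that client's array of sequences.
        self._arrays = dict(arrays_by_client)

    @property
    def client_ids(self):
        return sorted(self._arrays)

    def create_tf_dataset_for_client(self, client_id):
        x = self._arrays[client_id]
        # Autoencoder: inputs double as targets.
        return tf.data.Dataset.from_tensor_slices((x, x))


rng = np.random.default_rng(0)
client_data = InMemoryClientData({
    f"client_{i}": rng.normal(size=(8, 10, 2)).astype(np.float32)
    for i in range(3)
})

# Build the per-round list of datasets from the mapping.
federated_train_data = [
    client_data.create_tf_dataset_for_client(cid).batch(4)
    for cid in client_data.client_ids
]
```

A real implementation would subclass the TFF interface and follow whatever additional members the installed TFF version requires.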
Source: https://stackoverflow.com/questions/55434004/create-a-custom-federated-data-set-in-tensorflow-federated