Create a custom federated data set in TensorFlow Federated

Question

I'd like to adapt the recurrent autoencoder from this blog post to work in a federated environment.

I've modified the model slightly to conform with the example shown in the TFF image classification tutorial.

import tensorflow as tf
import tensorflow_federated as tff

def create_compiled_keras_model():
  model = tf.keras.models.Sequential([
      tf.keras.layers.LSTM(2, input_shape=(10, 2), name='Encoder'),
      tf.keras.layers.RepeatVector(10, name='Latent'),
      tf.keras.layers.LSTM(2, return_sequences=True, name='Decoder')]
  )

  model.compile(loss='mse', optimizer='adam')
  return model

model = create_compiled_keras_model()

sample_batch = gen(1)
timesteps, input_dim = 10, 2

def model_fn():
  keras_model = create_compiled_keras_model()
  return tff.learning.from_compiled_keras_model(keras_model, sample_batch)

The gen function is defined as follows:

import random

import numpy as np

def gen(batch_size):
    seq_length = 10

    batch_x = []
    batch_y = []

    for _ in range(batch_size):
        rand = random.random() * 2 * np.pi

        sig1 = np.sin(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))
        sig2 = np.cos(np.linspace(0.0 * np.pi + rand, 3.0 * np.pi + rand, seq_length * 2))

        x1 = sig1[:seq_length]
        y1 = sig1[seq_length:]
        x2 = sig2[:seq_length]
        y2 = sig2[seq_length:]

        x_ = np.array([x1, x2])
        y_ = np.array([y1, y2])
        x_, y_ = x_.T, y_.T

        batch_x.append(x_)
        batch_y.append(y_)

    batch_x = np.array(batch_x)
    batch_y = np.array(batch_y)

    return batch_x, batch_x  # autoencoder: the input doubles as the target, so batch_y is unused

So far I've been unable to find any documentation which does not use sample data from the TFF repository.

How can I modify this to create a federated data set and begin training?

Answer 1:

At a very high level, using an arbitrary dataset with TFF requires the following steps:

1. Partition the dataset into per-client subsets (how to do so is a much larger question).
2. Create a tf.data.Dataset for each client subset.
3. Pass a list of all (or a subset) of the Dataset objects to the federated optimization.
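
As a rough illustration of the first two steps, here is a minimal sketch. The helper names (make_sequences, partition_by_client, make_client_dataset), the client ids, and the naive equal split are all assumptions made for this example; they are not part of TFF. The data mimics the sine/cosine sequences from the question, and each element is an OrderedDict with x and y keys, which is the format the image classification tutorial uses.

```python
import collections

import numpy as np


def make_sequences(num_examples, seq_length=10, seed=0):
    """Sine/cosine sequences of shape (num_examples, seq_length, 2)."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(num_examples, 1))
    t = np.linspace(0.0, 1.5 * np.pi, seq_length)[None, :] + phase
    return np.stack([np.sin(t), np.cos(t)], axis=-1).astype(np.float32)


def partition_by_client(x, num_clients):
    """Step 1: split the examples into disjoint per-client subsets.

    This is a naive equal split with made-up client ids; real data usually
    comes with a natural key (user, device, site) to partition on.
    """
    client_ids = [f"client_{i}" for i in range(num_clients)]
    return dict(zip(client_ids, np.array_split(x, num_clients)))


def make_client_dataset(x, batch_size=5):
    """Step 2: wrap one client's examples in a tf.data.Dataset."""
    # Imported here so the partitioning step can be tried without TensorFlow.
    import tensorflow as tf

    # Autoencoder: the target is the input itself, as in the question's gen().
    features = collections.OrderedDict(x=x, y=x)
    return tf.data.Dataset.from_tensor_slices(features).batch(batch_size)


sequences = make_sequences(num_examples=100)
client_arrays = partition_by_client(sequences, num_clients=4)
```

For step 3, the list handed to the federated optimization could then be built with something like [make_client_dataset(x) for x in client_arrays.values()].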

What is happening in the tutorial

The Federated Learning for Image Classification tutorial uses tff.learning.build_federated_averaging_process to build up a federated optimization using the FedAvg algorithm.

In that notebook, the following code executes one round of federated optimization, where the client datasets are passed to the process's .next method:

   state, metrics = iterative_process.next(state, federated_train_data)

Here federated_train_data is a Python list of tf.data.Dataset, one per client participating in the round.
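
To run more than one round, a common pattern is to sample a subset of clients each time and rebuild the list. The sketch below assumes client_datasets is a plain dict mapping a client id to its tf.data.Dataset, and that .next returns a (state, metrics) pair as in the line above; run_federated_rounds is a hypothetical helper, not a TFF function, and newer TFF releases may return a different structure from .next.

```python
import random


def run_federated_rounds(iterative_process, client_datasets,
                         num_rounds, clients_per_round, seed=0):
    """Run several rounds, sampling a fresh subset of clients for each one.

    client_datasets: dict mapping client id -> that client's dataset.
    """
    rng = random.Random(seed)
    client_ids = sorted(client_datasets)
    state = iterative_process.initialize()
    history = []
    for round_num in range(num_rounds):
        # Pick the clients that participate in this round.
        sampled_ids = rng.sample(client_ids, clients_per_round)
        federated_train_data = [client_datasets[cid] for cid in sampled_ids]
        # Assumes .next returns (state, metrics), as in the tutorial snippet.
        state, metrics = iterative_process.next(state, federated_train_data)
        history.append((round_num, sampled_ids, metrics))
    return state, history
```

The helper only relies on the .initialize and .next methods of the process, so it can be tried out with a stand-in object before wiring in a real federated averaging process.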

The ClientData object

The canned datasets provided by TFF (under tff.simulation.datasets) are implemented using the tff.simulation.ClientData interface, which manages the client → dataset mapping and tf.data.Dataset creation.

If you're planning to re-use a dataset, implementing it as a tff.simulation.ClientData may make future use easier.
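
To give an idea of what that mapping looks like, here is a minimal stand-in. InMemoryClientData is a hypothetical class written for this example and is not a subclass of tff.simulation.ClientData; it only mirrors the client_ids property and the create_tf_dataset_for_client method that the canned datasets expose. A real implementation would need to follow the abstract interface of whichever TFF version is installed.

```python
class InMemoryClientData:
    """Hypothetical client -> dataset mapping (NOT a tff.simulation.ClientData)."""

    def __init__(self, client_arrays, make_dataset_fn):
        # client_arrays: dict mapping client id -> that client's raw examples.
        # make_dataset_fn: callable turning raw examples into a dataset,
        # for instance one that builds a tf.data.Dataset.
        self._client_arrays = dict(client_arrays)
        self._make_dataset_fn = make_dataset_fn

    @property
    def client_ids(self):
        """Sorted list of all known client ids."""
        return sorted(self._client_arrays)

    def create_tf_dataset_for_client(self, client_id):
        """Build the dataset for a single client on demand."""
        if client_id not in self._client_arrays:
            raise KeyError(f"Unknown client id: {client_id!r}")
        return self._make_dataset_fn(self._client_arrays[client_id])
```

With an object like this, the per-round list can be produced by calling create_tf_dataset_for_client for each sampled id, in the same way the tutorial does with the canned EMNIST data.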

Source: https://stackoverflow.com/questions/55434004/create-a-custom-federated-data-set-in-tensorflow-federated

Tags

python-3.x

tensorflow

tensorflow-federated