TFF: How to define the function for tff.simulation.ClientData.from_clients_and_fn?

Posted by 十年热恋 on 2021-02-05 08:29:05

Question


In the federated learning context, one classmethod that should work is tff.simulation.ClientData.from_clients_and_fn. If I pass it a list of client_ids and a function that returns the appropriate dataset when given a client id, I get a fully functional ClientData.

I think one approach to defining that function is to construct a Python dict that maps client IDs to tf.data.Dataset objects, then define a function that takes a client id, looks up the dataset in the dict, and returns it. I defined the function as below, but I think it is wrong. What do you think?

list = ["0","1","2"]
tab = {"0":ds, "1":ds, "2":ds}
def create_tf_dataset_for_client_fn(id):
    return ds

source = tff.simulation.ClientData.from_clients_and_fn(list, create_tf_dataset_for_client_fn) 

I assume here that the 3 clients all have the same dataset, ds.


Answer 1:


Creating a dict of (client_id, dataset) key-value pairs is a reasonable way to set up a tff.simulation.ClientData. Indeed, the code in the question will result in all clients having the same dataset, since ds is returned for every value of the parameter id. One thing to watch out for when pre-constructing a dict of datasets is that it may require loading the entire contents of the data into memory, which may fail for large datasets.
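To make the dict-based approach concrete, here is a minimal sketch in which the function actually looks up the client id instead of returning the same ds every time. The dict name client_datasets and the tiny in-memory datasets are made up for illustration; real client data would replace them. The resulting function and id list can then be passed to from_clients_and_fn as in the question.

```python
import tensorflow as tf

# Hypothetical per-client data: each client id maps to its own small dataset.
client_datasets = {
    "0": tf.data.Dataset.from_tensor_slices([1, 2, 3]),
    "1": tf.data.Dataset.from_tensor_slices([4, 5]),
    "2": tf.data.Dataset.from_tensor_slices([6]),
}


def create_tf_dataset_for_client_fn(client_id):
    """Return the dataset registered for client_id, or raise if it is unknown."""
    try:
        return client_datasets[client_id]
    except KeyError:
        raise ValueError(f"No dataset for client {client_id}") from None


# A plain list of ids, suitable as the first argument alongside the function.
client_ids = list(client_datasets.keys())
```

Because the lookup is keyed on client_id, each client now receives different data, and an unknown id fails loudly rather than silently returning some default dataset.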

Alternatively, constructing the dataset on-demand could reduce memory usage. One example might be to have a dict of (client_id, file path) key-value pairs. Something like:

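import tensorflow as tf
import tensorflow_federated as tff

# Map each client id to the file holding that client's data.
dataset_paths = {
  'client_0': '/tmp/A.txt',
  'client_1': '/tmp/B.txt',
  'client_2': '/tmp/C.txt',
}

def create_tf_dataset_for_client_fn(client_id):
  # Build the dataset lazily, only when this client is requested.
  path = dataset_paths.get(client_id)
  if path is None:
    raise ValueError(f'No dataset for client {client_id}')
  return tf.data.TextLineDataset(path)

source = tff.simulation.ClientData.from_clients_and_fn(
  list(dataset_paths.keys()), create_tf_dataset_for_client_fn)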

This is similar to the approach used in tff.simulation.FilePerUserClientData. It may be useful to look at the code of that class as an example.



Source: https://stackoverflow.com/questions/60265798/tff-how-define-tff-simulation-clientdata-from-clients-and-fn-function
