I have a huge CSV dataset and want to run federated learning on it. I have two questions. First: do I need to do the preprocessing before the federated learning phase? And