Is there a way (or any kind of hack) to read input data from compressed files? My input consists of a few hundreds of files, which are produced as compressed with gzip and u
The slower performance with my work around was most likely because Dataflow was putting most of the files in the same split so they weren't being processed in parallel. You can try the following to speed things up.