I have around 1.5 TB of data split across roughly 5,500 JSON files that I need to process (NN search) using map_partition, then save the results (GCS). Each .json file has size b