How to efficiently write multiple pyarrow tables (>1,000 tables) to a partitioned parquet dataset?

Asked by 不知归路 on 2021-02-05 10:59

I have some big files (around 7,000 in total, 4 GB each) in other formats that I want to store into a partitioned (hive) directory using the pyarrow.parquet.write_to_dataset() function.
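A minimal sketch of the approach the question is asking about: converting the source files one at a time into pyarrow tables and appending each table to a single hive-partitioned dataset with repeated write_to_dataset() calls. The output path, the partition column "date", and the tables() generator are placeholders, not details from the original post.

```python
import pyarrow as pa
import pyarrow.parquet as pq

root_path = "dataset_root"  # hypothetical output directory for the partitioned dataset


def tables():
    """Yield pyarrow.Table objects one at a time (stand-in for converting
    each of the ~7,000 source files into a table)."""
    for i in range(3):
        yield pa.table({
            "date": ["2021-02-05"] * 4,          # hypothetical partition column
            "value": list(range(i * 4, i * 4 + 4)),
        })


for table in tables():
    # Each call appends new parquet files under root_path/date=.../
    # without rewriting data that is already there, so tables can be
    # processed and freed one at a time instead of held in memory together.
    pq.write_to_dataset(table, root_path=root_path, partition_cols=["date"])
```

With files this large, the main design choice is to stream: convert one file, write it, release it, rather than concatenating all tables before a single write.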
