Unexpected datasize issue with snappy compressed parquet files

名媛妹妹 2021-01-16 10:46
df1 - large dataset
df2 = df1.sample(tiny_fraction)

df1 is written to disk as a parquet with snappy compression (~75GB)
df2 is written to disk as a parquet with sna