Unexpected behaviour of pyspark `samplingRatio` while reading csv

前端未结


I want to read a CSV dataset with billions of rows while also inferring the schema:

df = spark.read.csv('s3://bucket/data/*', inferSchema=True, samplingRatio=0.0001)


                      