I have a PySpark job whose last step writes data to S3 in Parquet format. It looks something like this:
df = generated_by_some_logic
df.cache()
df.count()
df