I have a PySpark project on AWS EMR that reads data from and writes data to AWS S3.
The pipeline runs monthly, so I usually overwrite the output directories like so: