aws-glue-spark

How to configure Spark / Glue to avoid creating empty `_$folder$` markers after a successful Glue job run

ⅰ亾dé卋堺 submitted on 2021-01-24 13:47:41
Question: I have a simple Glue ETL job that is triggered by a Glue workflow. It drops duplicate rows from a crawler table and writes the result back to an S3 bucket. The job completes successfully. However, the empty `_$folder$` marker objects that Spark generates remain in S3. They clutter the bucket hierarchy and cause confusion. Is there any way to configure Spark or the Glue context to hide or remove these markers after the job completes successfully?

[S3 console screenshot]
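The `_$folder$` objects are zero-byte markers that Hadoop's S3 connector writes to emulate directories; Glue does not expose a documented switch to suppress them. One common workaround is a post-run cleanup step that lists the output prefix and deletes any marker keys. A minimal sketch with boto3 follows; the function names are my own, and the bucket/prefix arguments are placeholders you would fill in from your job:

```python
def folder_marker_keys(keys):
    """Filter a list of S3 object keys down to Hadoop's "_$folder$" markers."""
    return [k for k in keys if k.endswith("_$folder$")]


def delete_folder_markers(bucket, prefix=""):
    """Delete "_$folder$" marker objects under a prefix; returns the count deleted.

    Requires boto3 and AWS credentials; bucket/prefix are placeholders.
    """
    import boto3  # imported lazily so the pure helper above has no dependency

    s3 = boto3.client("s3")
    deleted = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        markers = folder_marker_keys(
            obj["Key"] for obj in page.get("Contents", [])
        )
        if markers:
            s3.delete_objects(
                Bucket=bucket,
                Delete={"Objects": [{"Key": k} for k in markers]},
            )
            deleted += len(markers)
    return deleted
```

You could call `delete_folder_markers("my-output-bucket", "etl/output/")` as the last step of the job script, after the write succeeds, so the markers never linger between runs.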