I have a dataset of around 190 GB that was split into 1000 partitions.
My EMR cluster allows a maximum of 10 r5a.2xlarge TASK nodes and 2 r5a.2xlarge CORE nodes.
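
For a rough sense of scale, the numbers above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the per-node figures (8 vCPUs and 64 GiB of RAM for an r5a.2xlarge) are taken from the published AWS instance specs, not measured on this cluster, and it assumes the cluster is scaled out to all 10 TASK nodes. The function and constant names are made up for illustration, and the totals are raw hardware capacity, not what is actually left for jobs after OS and framework overhead.

```python
# Back-of-the-envelope sizing for the setup described above.
DATASET_GB = 190        # approximate dataset size
NUM_PARTITIONS = 1000   # number of partitions in the dataset
MAX_TASK_NODES = 10     # assumes the cluster scales to its maximum
CORE_NODES = 2

# Assumption: r5a.2xlarge offers 8 vCPUs and 64 GiB RAM per node.
VCPUS_PER_NODE = 8
MEM_GIB_PER_NODE = 64


def cluster_summary():
    """Return rough per-partition and whole-cluster figures."""
    nodes = MAX_TASK_NODES + CORE_NODES
    total_vcpus = nodes * VCPUS_PER_NODE
    return {
        # Average partition size, using 1 GB = 1000 MB.
        "mb_per_partition": DATASET_GB * 1000 / NUM_PARTITIONS,
        "total_nodes": nodes,
        "total_vcpus": total_vcpus,
        "total_mem_gib": nodes * MEM_GIB_PER_NODE,
        # How many partitions each vCPU would handle on average.
        "partitions_per_vcpu": NUM_PARTITIONS / total_vcpus,
    }


if __name__ == "__main__":
    for name, value in cluster_summary().items():
        print(f"{name}: {value:.1f}")
```

So each partition averages roughly 190 MB, and a fully scaled cluster has about 96 vCPUs and 768 GiB of raw memory across 12 nodes.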