I have a PySpark DataFrame that looks like this:
+-------------+----------+
|          sku|      date|
+-------------+----------+
|MLA-603526656|02/09/2016|
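You cannot use a dict for this: dict keys are unique, so a dict can map each column to only one aggregate function, and you need both min and max on the same column. Pass column expressions to agg instead: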
>>> from pyspark.sql import functions as F
>>>
>>> df_testing.groupBy('sku').agg(F.min('date'), F.max('date'))
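To see why the dict form cannot express two aggregates on one column, here is a minimal sketch in plain Python (no Spark required). The variable name `exprs` is only illustrative; the point is that a dict literal with a repeated key keeps just the last value.

```python
# A dict literal with a repeated key silently keeps only the last value,
# so a dict of {column: aggregate} can hold at most one aggregate per column.
exprs = {'date': 'min', 'date': 'max'}

print(exprs)       # {'date': 'max'}
print(len(exprs))  # 1
```

Passing such a dict to agg would therefore request only the max, which is why the answer lists F.min('date') and F.max('date') as separate expressions.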