Could someone help me understand why a `MapType` column in PySpark can contain duplicate keys?
A minimal example:
# assume the input data frame is