I have a PySpark DataFrame similar to the one below:
order_id  item  qty
123       abc   1
123       abc1  4
234       abc2  5
234       abc3  2
2