I am trying to aggregate a dataframe by 1-gram frequency (this can be extended to n-grams by changing n in the code below) and associate the other columns with each 1-gram. The way I did i