I have a pandas DataFrame with 1.3 million rows and columns such as Phone1 (phone numbers), Sale_date (2015 to 2020), Product_description (185 unique product descript