I wrote this code to remove the duplicates from a large CSV file of 800,000 tweets, but when I run it, the file I get is larger than the original one: the original is 1,580,307