Normally I anonymize my data using hashlib and the .apply(hash) function.
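A minimal sketch of what such a hashlib-based approach might look like. The sample names, the `sha256_hex` helper, and the choice of SHA-256 are assumptions for illustration; the `hash` passed to `.apply` above may be a different function.

```python
import hashlib

import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"name": ["alice", "bob", "alice"]})


def sha256_hex(value):
    """Return the SHA-256 hex digest of a value's string form (hypothetical helper)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


# Equal names map to equal digests, so repeated contributors stay linkable
df["name_hashed"] = df["name"].apply(sha256_hex)
```

Note that an unsalted hash of a name can be reversed by hashing a list of candidate names, so this is pseudonymization rather than strong anonymization.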
Now I'm trying a new approach. Imagine I have the following df called 'data':
    labels, uniques = pd.factorize(df['name'])
    labels = ['person_' + str(l) for l in labels]
    df['contributor_anonymized'] = labels
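A self-contained, runnable sketch of the factorize approach. The sample names are made up for illustration and are not the actual data:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"name": ["alice", "bob", "alice", "carol"]})

# factorize assigns each distinct name an integer code in order of first appearance
labels, uniques = pd.factorize(df["name"])

# Turn the integer codes into readable pseudonyms
df["contributor_anonymized"] = ["person_" + str(l) for l in labels]

print(df)
```

Here `uniques` holds the original names in code order, so it acts as the lookup table back to the real values and would need to be kept private or discarded.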