There's no easy solution to this kind of problem, especially if your list is really large (millions of items). Maybe these two papers can point you in the right direction:
http://www.cs.utexas.edu/users/ml/papers/normalization-icdm-05.pdf
http://www.ismll.uni-hildesheim.de/pub/pdfs/Rendle_SchmidtThieme2006-Object_Identification_with_Constraints.pdf