I am working with an enormous dataset (hundreds of GBs) that has ~40 million identifiers stored as 32-character strings, and for each identifier hundreds or thousands of row