I\'m working on an application that lets registered users create or upload content, and allows anonymous users to view that content and browse registered users\' pages to find t
I think you should distinguish between two types of users:
1) users that have logged in via Google Accounts or that have already registered on your site with a non-google e-mail address
2) users that opened your site for the first time and are not logged in in any way
For the second case, I can see no other way than to generate some random string (e.g. via uuid.uuid4()
or from this user's session cookie key), as an anonymous user does not carry any unique information with himself.
For users that are logged in, however, you already have a unique identifier -- their e-mail address. I agree with your privacy concerns -- you shouldn't use it as an identifier. Instead, how about generating a string that seems random, but is in fact generated from the e-mail address? Hashing functions are perfect for this purpose. Example:
>>> import hashlib
>>> email = 'user@host.com'
>>> salt = 'SomeLongStringThatWillBeAppendedToEachEmail'
>>> key = hashlib.sha1('%s$%s' % (email, salt)).hexdigest()
>>> print key
f6cd3459f9a39c97635c652884b3e328f05be0f7
As hashlib.sha1
is not a random function, but for given data returns always the same result, but it is proven to be practically irreversible, you can safely present the hashed key on the website without compromising user's e-mail address. Also, you can safely assume that no two hashes of distinct e-mails will be the same (they can be, but probability of it happening is very, very small). For more information on hashing functions, consult the Wikipedia entry.