The Python standard library comes with an e-mail parsing function: email.utils.parseaddr().
It returns a two-tuple containing the real name and the actual address parts of the e-mail:
>>> from email.utils import parseaddr
>>> parseaddr('foo@example.com')
('', 'foo@example.com')
>>> parseaddr('Full Name <full@example.com>')
('Full Name', 'full@example.com')
>>> parseaddr('"Full Name with quotes and <weird@chars.com>" <weird@example.com>')
('Full Name with quotes and <weird@chars.com>', 'weird@example.com')
And if the parsing is unsuccessful, it returns a two-tuple of empty strings:
>>> parseaddr('[invalid!email]')
('', '')
An issue with this parser is that it's accepting of anything that is considered as a valid e-mail address for RFC-822 and friends, including many things that are clearly not addressable on the wide Internet:
>>> parseaddr('invalid@example,com') # notice the comma
('', 'invalid@example')
>>> parseaddr('invalid-email')
('', 'invalid-email')
So, as @TokenMacGuy put it, the only definitive way of checking an e-mail address is to send an e-mail to the expected address and wait for the user to act on the information inside the message.
However, you might want to check for, at least, the presence of an @-sign on the second tuple element, as @bvukelic suggests:
>>> '@' in parseaddr("invalid-email")[1]
False
If you want to go a step further, you can install the dnspython project and resolve the mail servers for the e-mail domain (the part after the '@'), only trying to send an e-mail if there are actual MX
servers:
>>> from dns.resolver import query
>>> domain = 'foo@bar@google.com'.rsplit('@', 1)[-1]
>>> bool(query(domain, 'MX'))
True
>>> query('example.com', 'MX')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
[...]
dns.resolver.NoAnswer
>>> query('not-a-domain', 'MX')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
[...]
dns.resolver.NXDOMAIN
You can catch both NoAnswer
and NXDOMAIN
by catching dns.exception.DNSException
.
And Yes, foo@bar@google.com
is a syntactically valid address. Only the last @
should be considered for detecting where the domain part starts.