I'm trying to extract email addresses from plain text transcripts of emails. I've cobbled together a bit of code to find the addresses themselves, but I don't know how to mak
Use the email and mailbox packages to parse the plain text version of the email. This converts it into a message object from which you can extract all the addresses in the 'From' field.
You can also do a lot of other analysis on the message, if you need to process other header fields, or the message body.
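As a minimal sketch of that kind of processing, assuming the transcript is already held in a string (the sample message below is made up for illustration), email.message_from_string gives access to any header and to the body:

```python
import email

# Hypothetical sample message, for illustration only.
raw = (
    "From: Alice Example <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: Lunch\n"
    "\n"
    "See you at noon.\n"
)

msg = email.message_from_string(raw)

subject = msg["Subject"]    # header lookup is case-insensitive
recipient = msg["To"]
body = msg.get_payload()    # a plain string for a non-multipart message

print(subject)
print(recipient)
print(body)
```

For multipart messages get_payload() returns a list of sub-messages instead of a string, so real transcripts may need an extra loop.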
As a quick example, the following (untested) code should read all the messages in a Unix-style mailbox and print each 'From' header.
import mailbox

# 'filename' is the path to your mbox file
mbox = mailbox.mbox(filename)
for msg in mbox:
    # The 'From' header holds the sender's name and address
    sender = msg['From']
    print(sender)
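The 'From' header usually carries a display name as well as the address itself. A rough sketch of pulling out just the addresses with email.utils, assuming header values like the made-up ones below:

```python
from email.utils import getaddresses, parseaddr

# Hypothetical header values, for illustration only.
from_header = "Alice Example <alice@example.com>"
to_header = "Bob <bob@example.com>, carol@example.com"

# parseaddr splits one header value into (display name, address).
name, address = parseaddr(from_header)

# getaddresses handles headers that list several recipients.
recipients = [addr for _, addr in getaddresses([to_header])]

print(name, address)
print(recipients)
```

Inside the mailbox loop you could pass msg['From'] to parseaddr in the same way, after checking that the header is not None.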