I'm trying to extract email addresses from plain text transcripts of emails. I've cobbled together a bit of code to find the addresses themselves, but I don't know how to mak
Roughly speaking, you can:
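from email.utils import parseaddr

foundemail = []
with open("text.txt") as f:
    for line in f:
        # Only "From:" header lines carry the sender address
        if not line.startswith("From:"):
            continue
        # parseaddr returns (display name, address); keep the address
        name, addr = parseaddr(line)
        foundemail.append(addr)
print(foundemail)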
This uses the parseaddr function from Python's standard email.utils module to pull the address out of the From line (as demonstrated by other answers), without necessarily paying the cost of parsing the entire message (e.g. with the more full-featured email and mailbox packages). The script simply skips any line that does not begin with "From:". Whether that overhead matters to you depends on how big your input is and how often you will run this operation.
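For comparison, here is a minimal sketch of the heavier route mentioned above, where the whole message is parsed with the standard email package before the header is read. The helper name sender_address and the sample message are made up for illustration, and it assumes each transcript is a single, reasonably well-formed message with headers followed by a blank line.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_address(raw_message):
    """Return the address from the From: header, or '' if there is none."""
    # Parse the full message so that only real headers are considered.
    msg = message_from_string(raw_message)
    # msg.get falls back to "" when the message has no From: header.
    _name, addr = parseaddr(msg.get("From", ""))
    return addr


# Hypothetical sample transcript (not from the original question).
sample = (
    "From: Jane Doe <jane@example.com>\n"
    "To: someone@example.org\n"
    "Subject: hello\n"
    "\n"
    "From: not-a-header@example.net is only quoted in the body\n"
)

print(sender_address(sample))
```

One practical difference: a body line that happens to begin with "From:" is ignored here because it is not a header, whereas the line-by-line script would collect it. The price is that every message gets parsed in full.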