Parsing “From” addresses from email text

春和景丽 2021-02-19 03:56

I'm trying to extract email addresses from plain text transcripts of emails. I've cobbled together a bit of code to find the addresses themselves, but I don't know how to mak

8 answers
  •  时光说笑
    2021-02-19 04:29

    Roughly speaking, you can:

    from email.utils import parseaddr
    
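    foundemail = []
    # Scan line by line; only "From:" header lines are parsed.
    with open("text.txt") as f:
        for line in f:
            if not line.startswith("From:"):
                continue
            # Pass only the header value to parseaddr, not the "From:" label.
            name, addr = parseaddr(line[len("From:"):])
            foundemail.append(addr)
    print(foundemail)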
    

    This uses the parseaddr function from Python's standard-library email.utils module to pull the address out of each From line (as other answers demonstrate), without the overhead of parsing the entire message, which is what the more full-featured email and mailbox packages do. The script simply skips any line that does not begin with "From:". Whether that overhead matters depends on how large your input is and how often you run the operation.
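
    For comparison, here is a minimal sketch of the heavier alternative: parsing the whole message with the standard email package and reading its From header. The SAMPLE text and the from_address helper are made up for illustration and are not part of the question's data.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message, invented for this illustration.
SAMPLE = (
    "From: Jane Doe <jane@example.com>\n"
    "To: someone@example.org\n"
    "Subject: Hello\n"
    "\n"
    "From: this line is in the body, not a header\n"
)


def from_address(raw_message):
    """Return the address in the From header, or '' if there is none."""
    msg = message_from_string(raw_message)
    # msg["From"] is None when the header is missing; parseaddr needs a string.
    name, addr = parseaddr(msg["From"] or "")
    return addr


print(from_address(SAMPLE))
```

    Because the parser separates headers from the body, a body line that happens to start with "From:" is ignored here, whereas the line-by-line scan would pick it up. That extra safety is what the parsing overhead buys you.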
