Parsing “From” addresses from email text

春和景丽 2021-02-19 03:56

I'm trying to extract email addresses from plain text transcripts of emails. I've cobbled together a bit of code to find the addresses themselves, but I don't know how to mak

8 answers
  •  栀梦 (original poster)
    2021-02-19 04:35

    Use the email and mailbox packages to parse the plain-text version of the email. This converts it into a message object from which you can extract all the addresses in the 'From' field.

    The same object also lets you analyze the rest of the message, in case you need to process other header fields or the message body.

    As a quick example, the following (untested) code should read all the messages in a Unix-style mailbox and print each message's 'From' header.

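    import mailbox

    # Path to a Unix-style mbox file (placeholder; point this at your own file).
    filename = 'mail.mbox'

    # On Python 3, mailbox.mbox replaces the removed PortableUnixMailbox class
    # and parses each message with the email package automatically.
    mbox = mailbox.mbox(filename)

    for msg in mbox:
        # 'from' is a reserved word in Python, so store the header as 'sender'.
        sender = msg['From']
        print(sender)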
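
    The 'From' header is a raw string that may hold a display name and more than one address. As a minimal sketch of pulling out just the addresses from a single plain-text message, the standard library's email.utils.getaddresses can split the header into (name, address) pairs. The helper name from_addresses and the sample message below are made up for illustration, and the sketch assumes the transcript still begins with RFC 822 style headers.

```python
from email import message_from_string
from email.utils import getaddresses


def from_addresses(raw_text):
    """Return the bare addresses in the From header(s) of one message.

    Hypothetical helper for illustration; assumes raw_text starts with
    RFC 822 style headers followed by a blank line and the body.
    """
    msg = message_from_string(raw_text)
    # get_all returns every 'From' header, or the default ([]) if none exist.
    headers = msg.get_all('From', [])
    # getaddresses splits each header into (display name, address) pairs.
    return [addr for _name, addr in getaddresses(headers) if addr]


# Made-up sample transcript; the address in the body is deliberately ignored.
sample = (
    "From: Alice Example <alice@example.com>, bob@example.org\n"
    "To: carol@example.net\n"
    "Subject: Hello\n"
    "\n"
    "Reply to dave@example.com if needed.\n"
)

print(from_addresses(sample))  # ['alice@example.com', 'bob@example.org']
```

    Because only the parsed header is inspected, addresses that merely appear in the body or in other headers are not picked up, which is the main advantage over running a regular expression across the whole text.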
    
