Method for parsing text Cc field of email header?

烂漫一生 提交于 2019-12-04 08:25:14
Martin Tournoij

There are a bunch of function available as a standard python module, but I think you're looking for email.utils.parseaddr() or email.utils.getaddresses()

>>> addresses = 'friend@email.com, John Smith <john.smith@email.com>,"Smith, Jane" <jane.smith@uconn.edu>'
>>> email.utils.getaddresses([addresses])
[('', 'friend@email.com'), ('John Smith', 'john.smith@email.com'), ('Smith, Jane', 'jane.smith@uconn.edu')]

I haven't used it myself, but it looks to me like you could use the csv package quite easily to parse the data.

The bellow is completely unnecessary. I wrote it before realising that you could pass getaddresses() a list containing a single string containing multiple addresses.

I haven't had a chance to look at the specifications for addresses in email headers, but based on the string you provided, this code should do the job splitting it into a list, making sure to ignore commas if they are within quotes (and therefore part of a name).

from email.utils import getaddresses

addrstring = ',friend@email.com, John Smith <john.smith@email.com>,"Smith, Jane" <jane.smith@uconn.edu>,'

def addrparser(addrstring):
    addrlist = ['']
    quoted = False

    # ignore comma at beginning or end
    addrstring = addrstring.strip(',')

    for char in addrstring:
        if char == '"':
            # toggle quoted mode
            quoted = not quoted
            addrlist[-1] += char
        # a comma outside of quotes means a new address
        elif char == ',' and not quoted:
            addrlist.append('')
        # anything else is the next letter of the current address
        else:
            addrlist[-1] += char

    return getaddresses(addrlist)

print addrparser(addrstring)

Gives:

[('', 'friend@email.com'), ('John Smith', 'john.smith@email.com'),
 ('Smith, Jane', 'jane.smith@uconn.edu')]

I'd be interested to see how other people would go about this problem!

Gaurav Panchal

Convert multiple E-mail string in to dictionary (Multiple E-Mail with name in to one string).

emailstring = 'Friends <friend@email.com>, John Smith <john.smith@email.com>,"Smith" <jane.smith@uconn.edu>'

Split string by Comma

email_list = emailstring.split(',')

name is key and email is value and make dictionary.

email_dict = dict(map(lambda x: email.utils.parseaddr(x), email_list))

Result like this:

{'John Smith': 'john.smith@email.com', 'Friends': 'friend@email.com', 'Smith': 'jane.smith@uconn.edu'}

Note:

If there is same name with different email id then one record is skip.

'Friends <friend@email.com>, John Smith <john.smith@email.com>,"Smith" <jane.smith@uconn.edu>, Friends <friend_co@email.com>'

"Friends" is duplicate 2 time.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!