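There are two ways to retrieve mail: POP3 and IMAP. Both are protocols supported by the receiving mail server. When we send and receive mail with Foxmail we do not notice the underlying flow, but sending and receiving actually act on different servers: there is a dedicated server for sending and a dedicated server for receiving. The sending side only sends and does not handle retrieval, and the receiving side does not care how mail is sent, so sending and receiving are tested separately. Most of the time both services are installed on the same machine, but what is under test is the protocol, such as SMTP, POP3, or IMAP.
Retrieving mail with Python's poplib is very simple. The important part is parsing what comes back: a message is encoded before it is sent over SMTP, so a retrieved message has to be processed before we can read it. That is where the email module comes in again: on the SMTP side email is used to build the content, and on the POP3 side email is used to parse it.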
poplib
# Return the number (and size) of every message
list(self, which=None)
without which: (response, ['mesg_num octets', ...], octets); with which: the scan listing for that message
-----------------------------
('+OK 7 messages:', ['1 1080', '2 1080', '3 1079', '4 675265', '5 675506', '6 675534', '7 597'], 61)
# Retrieve a whole message; message numbers start at 1
retr(self, which)
return the whole message number which
# Authentication
user(self, user)
pass_(self, pswd)
# Show debugging output
set_debuglevel(self, level)
# Return the message count and the mailbox size
stat(self)
get mailbox status
return (message_count, mailbox_size)
-------------------------------------------
(7, 2030141)
# Show a message's headers plus a chosen number of body lines
top(self, which, howmuch)
return the headers of message which, plus howmuch lines of its body
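The stat() and list() calls above can be combined into a small helper. This is a minimal sketch for Python 3, where poplib returns bytes instead of the str values shown in this post's Python 2 output. The names mailbox_summary and FakePOP3 are hypothetical; the stub only replays the sample replies from this post so the helper can be tried without a server.

```python
def mailbox_summary(conn):
    """Return (message_count, total_octets, {message_number: size})."""
    count, total = conn.stat()
    _response, listing, _octets = conn.list()
    sizes = {}
    for entry in listing:
        # Python 3 yields bytes such as b'1 1080'; Python 2 yields str.
        if isinstance(entry, bytes):
            entry = entry.decode("ascii")
        number, size = entry.split()
        sizes[int(number)] = int(size)
    return count, total, sizes


class FakePOP3(object):
    """Hypothetical stand-in that replays the sample replies from the post."""

    def stat(self):
        return (7, 2030141)

    def list(self, which=None):
        return (b'+OK 7 messages:',
                [b'1 1080', b'2 1080', b'3 1079', b'4 675265',
                 b'5 675506', b'6 675534', b'7 597'], 61)


count, total, sizes = mailbox_summary(FakePOP3())
print(count, total, sizes[7])
```

Against a real server, the same helper would take a poplib.POP3(host) or poplib.POP3_SSL(host) object after user() and pass_() have succeeded.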
The original message, as stored in the mailbox file:
From hding@hding.com Tue Aug 16 20:06:02 2016
Return-Path: <hding@hding.com>
Received: from hding.com ([192.168.10.3])
	by ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800
Received: from 10.8.116.6 ([10.8.116.6])
	(authenticated bits=0)
	by hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721
	for qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800
Date: Tue, 16 Aug 2016 20:00:57 +0800
From: hding@hding.com
Message-Id: <201608161205.u7GC0v9x027721@hding.com>
X-UID: 71
Status: O

"hello world"
I am terry
please welcome me
top(7, 1) returns the headers of message 7 plus 1 line of its body, as a tuple:
('+OK', ['Return-Path: <hding@hding.com>', 'Received: from hding.com ([192.168.10.3])', '\tby ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429', '\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)', '\tfor <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800', 'Received: from 10.8.116.6 ([10.8.116.6])', '\t(authenticated bits=0)', '\tby hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721', '\tfor qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800', 'Date: Tue, 16 Aug 2016 20:00:57 +0800', 'From: hding@hding.com', 'Message-Id: <201608161205.u7GC0v9x027721@hding.com>', '', '"hello world"'], 566)
retr(7) returns the whole message as a tuple; the message lines are in retr(7)[1]:
('+OK 597 octets', ['Return-Path: <hding@hding.com>', 'Received: from hding.com ([192.168.10.3])', '\tby ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429', '\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)', '\tfor <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800', 'Received: from 10.8.116.6 ([10.8.116.6])', '\t(authenticated bits=0)', '\tby hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721', '\tfor qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800', 'Date: Tue, 16 Aug 2016 20:00:57 +0800', 'From: hding@hding.com', 'Message-Id: <201608161205.u7GC0v9x027721@hding.com>', '', '"hello world"', 'I am terry', 'please welcome me'], 597)
To get only the body, remove the header lines from the full list returned by retr:
head = pop.top(7,0)
message = pop.retr(7)
body = [line for line in message[1] if line not in head[1]]
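The list comprehension above also drops any body line that happens to be identical to a header line, and, if the top() reply ends with the empty separator line as in the output shown earlier, every blank line in the body as well. A minimal sketch of an alternative is to cut the list at the first empty line, which is where the header section ends. The name split_message and the sample list are hypothetical; the sample is constructed so that the body repeats a header line.

```python
def split_message(lines):
    """Split the line list from retr() into (header_lines, body_lines).

    The first empty line marks the end of the header section.
    Works for str lines (Python 2) and bytes lines (Python 3) alike.
    """
    for index, line in enumerate(lines):
        if len(line) == 0:
            return lines[:index], lines[index + 1:]
    # No blank line found: the message consists of headers only.
    return lines, []


# Constructed sample: the body contains a blank line and repeats a header.
sample = [
    'Date: Tue, 16 Aug 2016 20:00:57 +0800',
    'From: hding@hding.com',
    '',
    '"hello world"',
    '',
    'From: hding@hding.com',
]
headers, body = split_message(sample)
print(body)
```

With this approach a single retr() call is enough; the extra top() round trip is not needed.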
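How to handle a message with attachments
What retr() returns is only a list of lines, not yet a message object. It has to be parsed with email.parser; the parsed result then exposes the usual message structure, such as the From and To headers.
Retrieve the messages from number 7 onward: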
messages = [pop_conn.retr(i) for i in range(7, pop_conn.stat()[0]+1)]
--------------------------------------------------------------------------
[('+OK 597 octets', ['Return-Path: <hding@hding.com>', 'Received: from hding.com ([192.168.10.3])', '\tby ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429', '\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)', '\tfor <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800', 'Received: from 10.8.116.6 ([10.8.116.6])', '\t(authenticated bits=0)', '\tby hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721', '\tfor qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800', 'Date: Tue, 16 Aug 2016 20:00:57 +0800', 'From: hding@hding.com', 'Message-Id: <201608161205.u7GC0v9x027721@hding.com>', '', '"hello world"', 'I am terry', 'please welcome me'], 597)]
Join the lines of each message into a single string, using '\n' as the separator:
messages = ["\n".join(msg[1]) for msg in messages]
---------------------------------------------------------------------------
['Return-Path: <hding@hding.com>\nReceived: from hding.com ([192.168.10.3])\n\tby ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429\n\t(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)\n\tfor <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800\nReceived: from 10.8.116.6 ([10.8.116.6])\n\t(authenticated bits=0)\n\tby hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721\n\tfor qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800\nDate: Tue, 16 Aug 2016 20:00:57 +0800\nFrom: hding@hding.com\nMessage-Id: <201608161205.u7GC0v9x027721@hding.com>\n\n"hello world"\nI am terry\nplease welcome me']
Parse the message text
from email import parser
messages = [parser.Parser().parsestr(msg) for msg in messages]
------------------------------------------------------------------------------
[<email.message.Message instance at 0x02C1C9B8>] (a list containing one Message instance)
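The join and parse steps look slightly different on Python 3, because retr() returns a list of bytes there. The following is a minimal sketch under that assumption: the lines are joined as bytes and handed to email.message_from_bytes. The name parse_retr_reply is hypothetical, and the reply tuple is a trimmed copy of the sample in this post, not a live server response.

```python
import email


def parse_retr_reply(reply):
    """Turn a (response, lines, octets) tuple from retr() into a Message."""
    _response, lines, _octets = reply
    raw = b"\r\n".join(lines)
    return email.message_from_bytes(raw)


# Trimmed copy of the retr(7) sample (Received headers omitted), as bytes.
reply = (b'+OK 597 octets',
         [b'Date: Tue, 16 Aug 2016 20:00:57 +0800',
          b'From: hding@hding.com',
          b'Message-Id: <201608161205.u7GC0v9x027721@hding.com>',
          b'',
          b'"hello world"',
          b'I am terry',
          b'please welcome me'],
         597)
message = parse_retr_reply(reply)
print(message["From"])
print(message.get_payload())
```

Parsing from bytes leaves charset handling to the email package, which matters once the message carries non-ASCII content.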
Print the content of each parsed message
for message in messages:
    print(message)
--------------------------------------------------------------------------------
From nobody Wed Aug 17 19:04:31 2016
Return-Path: <hding@hding.com>
Received: from hding.com ([192.168.10.3])
	by ding.com (8.13.8/8.13.8) with ESMTP id u7GC623I002429
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <qa@ding.com>; Tue, 16 Aug 2016 20:06:02 +0800
Received: from 10.8.116.6 ([10.8.116.6])
	(authenticated bits=0)
	by hding.com (8.13.8/8.13.8) with ESMTP id u7GC0v9x027721
	for qa@ding.com; Tue, 16 Aug 2016 20:05:13 +0800
Date: Tue, 16 Aug 2016 20:00:57 +0800
From: hding@hding.com
Message-Id: <201608161205.u7GC0v9x027721@hding.com>

"hello world"
I am terry
please welcome me
----------------------------------------------------------------------------------
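As the output above shows, the message parses cleanly. How do we tell an attachment apart from the body?
for part in message.walk():  # walk over every part of the message
    fileName = part.get_filename()  # attachment file name, or None
    contentType = part.get_content_type()  # MIME type of this part
    # Save the attachment
    if fileName:  # an attachment always has a file name; write it to a new file
        data = part.get_payload(decode=True)
        f_attach = open(fileName, 'wb')
        f_attach.write(data)
        f_attach.close()
    elif contentType == 'text/plain' or contentType == 'text/html':
        # Body text: print it unchanged
        data = part.get_payload(decode=True)
        print(data)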
----------------------------------------------------------------------------------
"hello world"
I am terry
please welcome me
A message with an attachment
From hding@hding.com Wed Aug 17 19:21:08 2016    (first part)
Return-Path: <hding@hding.com>
Received: from hding.com ([192.168.10.3])
	by ding.com (8.13.8/8.13.8) with ESMTP id u7HBL8rW015601
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <qa@ding.com>; Wed, 17 Aug 2016 19:21:08 +0800
Received: from ding.com ([10.10.10.3])
	(authenticated bits=0)
	by hding.com (8.13.8/8.13.8) with ESMTP id u7HBL7sl008349
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
	for <qa@ding.com>; Wed, 17 Aug 2016 19:21:07 +0800
Date: Wed, 17 Aug 2016 19:21:07 +0800
Message-Id: <201608171121.u7HBL7sl008349@hding.com>
Content-Type: multipart/mixed; boundary="===============4712348666551551578=="
MIME-Version: 1.0
From: hding@hding.com
To: qa@ding.com
Subject: test_mail
X-UID: 73
Status: RO

--===============4712348666551551578==    (second part)
Content-Type: doc/test_file
MIME-Version: 1.0
Content-Disposition: attachment; filename="test_file"
Content-Transfer-Encoding: base64

aGVsbG8gd29ybGQKCkkgYW0gYSB0ZXN0IGZpbGUsIGNhbiB5b3UgcmVhZCBpdCAKCmFuZCBnaXZl
IG1lIGEgcmVzcG9uc2UK

--===============4712348666551551578==    (third part)
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64

aGVsbG8gd29ybGQKCkkgYW0gYSB0ZXN0IGZpbGUsIGNhbiB5b3UgcmVhZCBpdCAKCmFuZCBnaXZl
IG1lIGEgcmVzcG9uc2UKPGltYWdlIHNyYz0nY2lkOjEnPg==

--===============4712348666551551578==--
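A multipart message of this shape can be separated into attachments and body text without touching the file system. The following is a minimal sketch for Python 3. The name split_parts is hypothetical, and the test message is built locally with email.mime rather than copied from the sample above, so its attachment type is application/octet-stream instead of doc/test_file and its contents are made up.

```python
import email
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def split_parts(message):
    """Return (attachments, texts) found by walking a parsed message.

    attachments maps file name -> decoded bytes; texts is a list of str.
    """
    attachments = {}
    texts = []
    for part in message.walk():
        if part.is_multipart():
            continue  # a container has no payload of its own
        filename = part.get_filename()
        data = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if filename:
            attachments[filename] = data
        elif part.get_content_type() in ("text/plain", "text/html"):
            charset = part.get_content_charset() or "utf-8"
            texts.append(data.decode(charset, errors="replace"))
    return attachments, texts


# Build a hypothetical message with one attachment and one text part.
outer = MIMEMultipart()
outer["From"] = "hding@hding.com"
outer["To"] = "qa@ding.com"
outer["Subject"] = "test_mail"
attachment = MIMEApplication(b"hello world\n")
attachment.add_header("Content-Disposition", "attachment", filename="test_file")
outer.attach(attachment)
outer.attach(MIMEText("I am terry\n", "plain", "utf-8"))

# Round-trip through bytes, as if the message had arrived via retr().
parsed = email.message_from_bytes(outer.as_bytes())
attachments, texts = split_parts(parsed)
print(sorted(attachments), texts)
```

Returning the data instead of writing files keeps the decision about where, or whether, to store an attachment with the caller.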
At work I only need to receive the file over POP3 and do not have to save it to disk, so the retr function alone does the job.
#!/usr/bin/env python
# coding: utf-8

from poplib import POP3_SSL


class DPI_SSL_POP3(object):

    def __init__(self, username='username', password='password', host='192.168.10.3'):
        self.pop = POP3_SSL(host)
        self.pop.user(username)
        self.pop.pass_(password)

    def get_message_from_pop3s(self):
        # Retrieve the newest message (the highest message number)
        return self.pop.retr(self.pop.stat()[0])


if __name__ == '__main__':
    pops = DPI_SSL_POP3()
    pops.get_message_from_pop3s()
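The POP3 server is on the WAN side and I run the script on the LAN side. The script retrieves a message whose attachment contains a virus, so the virus passes through the firewall, which inspects it.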
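One edge case in the script above is an empty mailbox: stat()[0] is then 0, and retr(0) asks for a message number that does not exist, which a server would normally answer with an error that poplib raises as error_proto. The following is a minimal sketch of a guard for Python 3. The names fetch_latest and StubPOP3 are hypothetical, and the stub only imitates the two calls used here so the helper can be tried without a server.

```python
def fetch_latest(conn):
    """Retrieve the newest message, or None when the mailbox is empty."""
    count, _size = conn.stat()
    if count == 0:
        return None  # message numbers start at 1, so there is nothing to fetch
    return conn.retr(count)


class StubPOP3(object):
    """Hypothetical stand-in for poplib.POP3_SSL, used only for this demo."""

    def __init__(self, messages):
        self.messages = messages

    def stat(self):
        return (len(self.messages), sum(len(m) for m in self.messages))

    def retr(self, which):
        body = self.messages[which - 1]
        return (b'+OK', body.split(b'\r\n'), len(body))


print(fetch_latest(StubPOP3([])))
latest = fetch_latest(StubPOP3([b'first', b'Subject: x\r\n\r\nsecond']))
print(latest[1])
```

In the firewall test, a None result would separate "the mailbox was empty" from "the transfer was blocked", which otherwise both look like a failed retr.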
Source: oschina
Link: https://my.oschina.net/u/2303535/blog/734266