Parsing emails as soon as they are received

Submitted by 北城余情 on 2019-12-08 13:04:27

Question


I have users sending emails containing text that I need to extract. Each user's email address is mapped to a single mailbox. I'm currently using a cron job that polls the mailbox (postfix) every 5 minutes, checks for new messages, and sends them to a queue where workers parse them. I have two main questions:

  1. Is there a way to parse each email as soon as it's received instead of polling the server? Also, how could I implement this so that it scales, for example to 50 incoming messages per second?
  2. I'm programmatically writing each user's email address into the postfix configuration file so that it points to the mailbox. Would it be better to create a catch-all account so I don't have to write each address? I know, however, that catch-all accounts are more susceptible to spam.

Answer 1:


Use a pipe alias to catch the email as it arrives, then use Celery to push it onto a message queue for processing.
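As a rough illustration of that idea, the script on the receiving end of the pipe alias could look something like the sketch below. It reads one message from stdin, pulls out a few headers, and hands the job to an `enqueue` callable. The names `handle_message`, `main`, and `enqueue` are hypothetical; with Celery, `enqueue` would presumably be a task's `.delay` method, but a plain callable is used here so the sketch has no dependencies.

```python
#!/usr/bin/env python3
"""Hypothetical pipe-alias target: read one message and enqueue it."""
import base64
import sys
from email import message_from_bytes, policy

EX_TEMPFAIL = 75  # sysexits.h code that asks the MTA to retry delivery later


def handle_message(raw: bytes, enqueue) -> dict:
    """Build a small job from the raw message and pass it to `enqueue`.

    `enqueue` is a stand-in (an assumption, not from the answer) for
    whatever puts work on the queue, e.g. a Celery task's `.delay`.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    job = {
        "to": str(msg.get("To", "")),
        "from": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        # Base64 keeps the original bytes intact through a JSON serializer.
        "raw_b64": base64.b64encode(raw).decode("ascii"),
    }
    enqueue(job)
    return job


def main(enqueue, stdin=None) -> int:
    """Read the whole message from stdin and return a process exit code."""
    stream = stdin if stdin is not None else sys.stdin.buffer
    raw = stream.read()
    try:
        handle_message(raw, enqueue)
    except Exception:
        # If the queue is unreachable, ask the mail server to try again.
        return EX_TEMPFAIL
    return 0

# As the alias target, the script would end with something like:
#     sys.exit(main(my_task.delay))   # my_task is a hypothetical Celery task
```

Keeping the script this thin means delivery finishes quickly and the heavier parsing happens in the workers, which is what lets the setup absorb bursts of incoming mail.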




Answer 2:


Yes, this can be done quite easily. All you need to do is configure postfix to forward email to a script instead of to a mailbox. It does not have to be a catch-all; you can configure postfix to forward only specific addresses to a script. The script can be written in any language. I have written such a script in PHP a couple of times. Another possibility for a very busy server, such as one receiving 50 emails per second, is to write your own filter server and configure postfix to pass each message to it.

To forward email to a script, add a line like this to the aliases file (the path must point to your script), then run newaliases to rebuild the alias database:

someaccount: "|/usr/local/bin/emailParser.php"
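The answer's script was written in PHP and is not shown. As a hedged Python equivalent, the text-extraction step of such a script might be sketched with the standard library's `email` package as follows. The function name `extract_text` is hypothetical, and the choice to take only the plain-text body is an assumption about what "text I need to extract" means.

```python
from email import message_from_bytes, policy


def extract_text(raw: bytes) -> str:
    """Return the message's plain-text body, or "" if it has none."""
    # policy.default gives an EmailMessage, which provides get_body().
    msg = message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        # For example, an HTML-only message.
        return ""
    return part.get_content().strip()
```

When invoked through the alias, the script would receive the full message on standard input, so it could call `extract_text(sys.stdin.buffer.read())` after importing `sys`.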

To pass messages to a filter instead, you have to configure it in master.cf, which is a little more involved.




Answer 3:


  1. I would recommend using Procmail for this. It is designed specifically to process incoming mail, and it can pass every message that matches a given property to your app.

    http://www.procmail.org/

  2. The spam problem with catch-all addresses can usually be solved quite easily by monitoring all mail on the machine. If multiple addresses receive the same message, then there's a high probability that it's spam.
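The duplicate-message heuristic in point 2 could be sketched roughly as below: hash each message body and count how many distinct recipients have seen the same hash. The class name `DuplicateTracker`, the whitespace and case normalization, and the threshold of 3 recipients are all assumptions made for illustration, not values from the answer.

```python
import hashlib
from collections import defaultdict


class DuplicateTracker:
    """Track which recipients have received each distinct message body."""

    def __init__(self, threshold: int = 3):
        # Assumed cut-off: flag a body once this many addresses received it.
        self.threshold = threshold
        self._recipients = defaultdict(set)

    def looks_like_spam(self, recipient: str, body: str) -> bool:
        # Collapse whitespace and case so trivial variations hash alike.
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        self._recipients[digest].add(recipient.lower())
        return len(self._recipients[digest]) >= self.threshold
```

This keeps everything in memory and never forgets a message, so a longer-running setup would likely need shared storage and some expiry of old entries.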



Source: https://stackoverflow.com/questions/4529075/parsing-emails-as-soon-as-they-are-received
