Question
I have users sending emails containing text that I need to extract. Each user's email address is mapped to a single mailbox. I currently use a cron job that polls the mailbox (Postfix) every 5 minutes, checks for new messages, and sends them to a queue where workers parse them. I have two main questions:
- Is there a way to parse each email as soon as it's received instead of polling the server? Also, how could I make this scalable, for example to handle 50 incoming messages per second?
- I'm programmatically writing each user's email address into the Postfix configuration file so that it points to a mailbox. Would it be better to create a catch-all account, so I don't have to write each email address? However, I know catch-all accounts are more susceptible to spam.
Answer 1:
Use a pipe alias to catch the email as it is delivered, then use Celery to push it onto a message queue for processing.
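The hand-off could look something like the following minimal sketch. It assumes the alias pipes the raw message to the script on standard input; `handle_delivery` and `enqueue` are hypothetical names, with `enqueue` standing in for whatever producer you use (for example the `.delay` method of a Celery task). The script does no parsing itself, so it returns quickly and leaves the heavy work to the workers.

```python
import base64
import json
import sys

EX_TEMPFAIL = 75  # sysexits.h code for a temporary failure, so the MTA retries


def handle_delivery(stream, enqueue):
    """Read one raw message from `stream` and hand it to `enqueue`.

    `enqueue` is a hypothetical callable standing in for the queue client.
    """
    raw = stream.read()
    # Base64 keeps the payload JSON-safe whatever encoding the message uses.
    payload = json.dumps({"raw_b64": base64.b64encode(raw).decode("ascii")})
    try:
        enqueue(payload)
    except Exception:
        return EX_TEMPFAIL  # queue unreachable: ask Postfix to defer delivery
    return 0


def main():
    # Placeholder sink so the sketch runs on its own; swap in a real producer.
    return handle_delivery(sys.stdin.buffer, enqueue=lambda payload: None)

# As a pipe target, the file would end with: sys.exit(main())
```

Returning a temporary-failure status when the queue is down means the message stays in the Postfix queue and is retried later instead of being lost.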
Answer 2:
Yes, this can be done quite easily. All you need to do is configure Postfix to forward email to a script instead of to a mailbox. It does not have to be a catch-all; you can configure Postfix to forward only specific addresses to a script. The script can be written in any language. I have written such a script in PHP a couple of times. Another possibility for a very busy server, such as one receiving 50 emails per second, is to write your own filter server and then configure Postfix to pass each message to your filter.
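To forward email to a script, put a line like this in the aliases file (the path must point to your script, which must be executable):
someaccount: "|/usr/local/bin/emailParser.php"
After editing the aliases file, run newaliases to rebuild the alias database.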
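Since the script can be written in any language, here is a rough Python equivalent of such a parser, using only the standard library `email` package. It assumes the whole message (headers and body) arrives on standard input; `extract_text` and the choice of fields are illustrative, not taken from the original PHP script.

```python
import sys
from email import message_from_bytes, policy


def extract_text(raw):
    """Return sender, recipient, subject and plain-text body of a raw message."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Pick the text/plain part, whether or not the message is multipart.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "subject": str(msg.get("Subject", "")),
        "text": body.get_content() if body is not None else "",
    }


def main():
    # Postfix writes the complete message to the script's stdin.
    fields = extract_text(sys.stdin.buffer.read())
    print(fields["subject"])  # placeholder for the real extraction logic
    return 0

# As a pipe target, the file would end with: sys.exit(main())
```

Messages that carry only an HTML part yield an empty `text` field in this sketch; handling those would need an extra step.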
To forward emails to a filter instead, the filter has to be configured in master.cf, which is a little more difficult.
Answer 3:
I would recommend using Procmail for this. It is specifically designed to process incoming mail, and you can pass all mail with a certain property to your app.
http://www.procmail.org/
The spam problem with catch-all addresses can usually be solved quite easily by monitoring all mail on the machine. If multiple addresses receive the same mail, then there's a high probability that it's spam.
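One way to approximate that check is to hash each message body and count how many distinct recipients have received it. The sketch below is only an illustration of the idea: `DuplicateMailDetector`, the whitespace and case normalisation, and the threshold of 3 are all assumptions, and a real deployment would also need to expire old entries.

```python
import hashlib
from collections import defaultdict


class DuplicateMailDetector:
    """Flag a message body once it has reached several distinct recipients."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed cut-off, tune for real traffic
        self._recipients = defaultdict(set)  # body digest -> recipients seen

    def is_probable_spam(self, recipient, body):
        # Normalise whitespace and case so trivial variations hash the same.
        normalised = " ".join(body.lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        self._recipients[digest].add(recipient.lower())
        return len(self._recipients[digest]) >= self.threshold
```

A worker would call `is_probable_spam(recipient, body)` for every incoming message and divert flagged ones before parsing. Legitimate mail sent to many users at once (announcements, for instance) would also trip this check, so it is better treated as one signal than as a verdict.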
Source: https://stackoverflow.com/questions/4529075/parsing-emails-as-soon-as-they-are-received