Question
I am developing an Android messaging application. Is there a good spam filtering algorithm that works well for SMS? Please suggest some pointers to get me started.
Rahim.
Answer 1:
I don't think there is a fixed algorithm that can tell you definitively whether a user considers an SMS to be spam: an ad in an SMS can be important to some users and spam to others. What you can do, however, is what Google does to identify spam mail.
You could let the user mark an SMS as spam or not spam, and then use the content they have marked to decide whether a new message is likely to be spam for that user.
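One common way to learn from such user marks is a naive Bayes text classifier. The sketch below is only an illustration of that idea, not something taken from the question or from Google's actual filter; the class name `NaiveBayesSpamFilter` and its methods are hypothetical. It is written in Python for brevity, but the same logic ports directly to Java or Kotlin on Android.

```python
import math
import re
from collections import Counter


class NaiveBayesSpamFilter:
    """Hypothetical per-user filter trained on the user's spam / not-spam marks."""

    def __init__(self):
        # Word frequencies and message counts, kept separately per label.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple alphanumeric tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def mark(self, text, is_spam):
        """Record one message the user has marked as spam or not spam."""
        label = "spam" if is_spam else "ham"
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text); returns 0.5 before any message is marked."""
        total = sum(self.message_counts.values())
        if total == 0:
            return 0.5
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        tokens = self.tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            log_p = math.log((self.message_counts[label] + 1) / (total + 2))
            n_words = sum(self.word_counts[label].values())
            for word in tokens:
                # Laplace smoothing so unseen words never give probability 0.
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (n_words + len(vocab) + 1))
            log_scores[label] = log_p
        # Turn the two log scores into a normalized probability.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        return exp["spam"] / (exp["spam"] + exp["ham"])
```

In an app, you would call `mark(...)` whenever the user tags a message and compare `spam_probability(...)` against a threshold of your choosing for incoming messages. The model only becomes useful once the user has marked a reasonable number of messages in both categories.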
Edit: The closest thing to what you are looking for that I found is this PDF on Content Based SMS Spam Filtering.
It's not an algorithm, but rather a list of things you should keep in mind.
Quoting from the PDF:
The most popular techniques used to reduce spam nowadays include the following ones.
White and black listing. The senders occurring in a black list (e.g. RBL) are considered spammers, and their messages blocked. The messages from senders in a white list (e.g. the address book, or the provider itself – Hotmail) are considered legitimate, and thus delivered.
Collaborative filtering. When a user tags a message as spam, this is considered spam for users similar to him/her. Alternatively, the service provider considers that massive messages are spam.
Digital signatures. Messages without a digital signature are considered spam. Digital signatures can be provided by the sender or the service provider.
Content-based filtering. The most used method. Each message is searched for spam features, like indicative words (e.g. “free”, “viagra”, etc.), unusual distribution of punctuation marks and capital letters (like e.g. in “BUY!!!!!!”), etc.
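To make two of the quoted techniques concrete, here is a rough sketch that combines sender white/black listing with a few content-based features. Everything in it is an assumption made for illustration: the function names, the tiny `SPAM_WORDS` set (taken from the examples in the quote), and the thresholds are not from the PDF and would need tuning on real messages.

```python
import re

# Illustrative word list only; a real filter would learn or curate this.
SPAM_WORDS = {"free", "viagra"}


def content_features(text):
    """Extract simple spam indicators from the message body."""
    words = re.findall(r"[A-Za-z']+", text)
    letters = [c for c in text if c.isalpha()]
    punct_runs = [len(m.group()) for m in re.finditer(r"[!?]{2,}", text)]
    return {
        # How many indicative words appear in the message.
        "spam_words": sum(w.lower() in SPAM_WORDS for w in words),
        # Share of letters that are capitals (0.0 if there are no letters).
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        # Length of the longest run of '!' or '?' characters.
        "longest_punct_run": max(punct_runs, default=0),
    }


def classify(sender, text, whitelist, blacklist, threshold=2):
    """Return "spam" or "ham"; sender lists take priority over content."""
    if sender in whitelist:  # e.g. numbers from the address book
        return "ham"
    if sender in blacklist:  # e.g. numbers the user has blocked
        return "spam"
    f = content_features(text)
    # Assumed scoring: one point per spam word, plus one point each for
    # mostly-uppercase text and for a long punctuation run.
    score = f["spam_words"] + (f["upper_ratio"] > 0.6) + (f["longest_punct_run"] >= 3)
    return "spam" if score >= threshold else "ham"
```

For example, `classify("+15550100", "BUY!!!!!! FREE", set(), set())` scores on all three features and returns `"spam"`, while the same text from a whitelisted sender is delivered as `"ham"`.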
There is a lot of good info in there. Check it out.
Source: https://stackoverflow.com/questions/8020144/spam-protection-algorithm-for-sms