Spam is a constant problem. I run SpamAssassin on my personal mail server and on the mail servers at work. It does a pretty good job of filtering out most of the junk, but on occasion a few spam messages get through. When I say it does a pretty good job, I mean it: at work we receive 5,000 to 10,000 messages a day (depending on the day of the week), and about 80% of them are junk. SpamAssassin handles this with hardly any false positives and only a handful of actual spam messages getting through.
To help SpamAssassin recognize the messages that slip through as spam, you have to train its Bayesian filter, which is very simple to do with sa-learn.
Just save the spam messages that make it through to a separate mail file in mbox format, then run sa-learn on that file (where filename is the mail file):
sa-learn --mbox --spam filename
To train for ham (good mail) use:
sa-learn --mbox --ham filename
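If you train regularly, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name train_mbox and the SA_LEARN variable are my own inventions, not part of SpamAssassin. It assumes sa-learn is on your PATH and that the files are in mbox format.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): feed an mbox file to sa-learn.
# SA_LEARN is an assumed override; set it to "echo" to preview the command
# instead of running it.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_mbox() {
    kind="$1"   # "spam" or "ham"
    file="$2"   # path to an mbox file

    # Only the two training types used above are accepted.
    case "$kind" in
        spam|ham) ;;
        *) echo "unknown type: $kind" >&2; return 2 ;;
    esac

    # Nothing to learn from a missing or empty mail file.
    if [ ! -s "$file" ]; then
        echo "skipping $kind: $file is missing or empty" >&2
        return 0
    fi

    "$SA_LEARN" --mbox "--$kind" "$file"
}
```

With that defined, training becomes train_mbox spam filename for missed spam and train_mbox ham filename for good mail, and a missing or empty mail file is skipped with a message rather than passed to sa-learn.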
More detailed instructions and options can be found here: http://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html





