Training SpamAssassin

Spam is a constant problem. I run SpamAssassin on my personal mail server and on the mail servers at work. It does a pretty good job of filtering out most of the junk, but on occasion a few spam messages get through. And when I say it does a pretty good job, I mean it: at work we receive 5,000 to 10,000 messages a day (depending on the day of the week), and about 80% of them are junk. SpamAssassin handles that load with hardly any false positives and only a handful of actual spam messages getting through.

To help SpamAssassin recognize the messages that slip through as spam, you have to train its Bayesian filter, and that is very simple to do with sa-learn.

Just save the spam messages that make it through to a separate mbox-format mail file, then run sa-learn on that file (where filename is the mail file):

sa-learn --mbox --spam filename

To train for ham (good mail) use:

sa-learn --mbox --ham filename
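If you train often, the two commands can be wrapped in a small shell function. This is only a rough sketch: the function name train_mbox and the DRY_RUN variable are my own inventions for illustration, not part of sa-learn. It checks that you asked for spam or ham and that the file is readable, then calls sa-learn with the matching option.

```shell
#!/bin/sh
# Hypothetical helper: train_mbox KIND FILE, where KIND is "spam" or "ham".
# DRY_RUN=1 is an assumption of this sketch (not an sa-learn feature):
# it prints the command that would run instead of running it.
train_mbox() {
    kind=$1
    file=$2

    # Only the two training modes used in this post are accepted.
    case "$kind" in
        spam|ham) ;;
        *)
            echo "usage: train_mbox spam|ham FILE" >&2
            return 2
            ;;
    esac

    # A dry run only prints the command, so it does not touch the file.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sa-learn --mbox --$kind $file"
        return 0
    fi

    # Refuse to train from a file we cannot read.
    if [ ! -r "$file" ]; then
        echo "train_mbox: cannot read $file" >&2
        return 1
    fi

    sa-learn --mbox "--$kind" "$file"
}
```

For example, train_mbox spam missed-spam.mbox would learn from a file of spam that got through (the file name is made up), and setting DRY_RUN=1 first lets you see the command before committing to it.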

More detailed instructions and options can be found here: http://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html

