spamassassin, procmail, grep yada yada yada

Following on from yesterday’s housekeeping I updated my spam filters today.

I use spamassassin in combination with procmail, which generally rocks.

Spamassassin features Bayesian filtering, which is described neatly in A Plan For Spam.

The only problem is that the Bayesian filter requires occasional manual training.

This involves going through my spam mail folder and checking that everything that was automatically tagged as spam by spamassassin was actually spam, and then “teaching” spamassassin’s Bayesian filter to regard it all as spam.

The more this is done the better spamassassin becomes at recognising spam.

Conversely, I have to run it over my inbox from time to time to teach it that the mail there isn’t spam.
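For anyone wanting to try the same training routine, here is a rough sketch of both halves. It assumes the training is done with sa-learn, the trainer that ships with SpamAssassin; the post doesn't say which command was actually used. The spam path is the one from the grep further down, the inbox path is a guess at a typical Maildir layout, and the function name train_bayes is made up for illustration.

```shell
# Sketch only: train SpamAssassin's Bayesian filter from two Maildir folders.
# SPAM_DIR matches the folder used in the post; HAM_DIR is an assumed inbox location.
SPAM_DIR="${SPAM_DIR:-$HOME/Maildir/misc/spam/cur}"
HAM_DIR="${HAM_DIR:-$HOME/Maildir/cur}"

# Hypothetical helper, not part of SpamAssassin itself.
train_bayes() {
    # Teach the filter that everything in the (hand-checked) spam folder is spam.
    sa-learn --spam "$SPAM_DIR" || return 1
    # Teach it that everything in the inbox is legitimate mail ("ham").
    sa-learn --ham "$HAM_DIR"
}
```

Run train_bayes only after checking the spam folder by hand, since anything left in there gets learned as spam.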

As I checked each spam I noticed some of the scores that spamassassin had allocated them.

I flag any email with a score over 5 as spam, but I saw some as high as 20.

I became curious about the highest score, so I grepped my spam folder and found that it was 29.20 points!

I looked at the mail that triggered such a high score and it read like a spammer’s guide to annoying people.

It was selling software to enable me to spam people more effectively. How ironic.

For the geek curious, the command line I used to find out the score was:

grep "^Content analysis details" ~/Maildir/misc/spam/cur/* | grep -o "[[:digit:]]*\.[[:digit:]]* points" | sort -u -n

grep -o is the dog’s nads by the way. 🙂
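The one-liner above lists every distinct score but not which mail earned it. A possible variation, sketched here, keeps the file name next to each score and shows only the top few. The function name top_spam_scores and the default of five results are my own inventions, and it assumes every message carries a "Content analysis details: (NN.NN points, ...)" line like the one the grep above relies on.

```shell
# Sketch: list the N highest-scoring messages in a spam folder, with file names.
# Hypothetical helper; assumes the "Content analysis details" line holds "(NN.NN points".
top_spam_scores() {
    dir="$1"        # folder to scan, e.g. ~/Maildir/misc/spam/cur
    n="${2:-5}"     # how many results to show (defaults to 5)
    # -H forces the file name prefix even if the folder holds a single message.
    grep -H "^Content analysis details" "$dir"/* \
        | sed -n 's/^\(.*\):Content analysis details.*(\([0-9][0-9]*\.[0-9][0-9]*\) points.*/\2 \1/p' \
        | sort -rn \
        | head -n "$n"
}
```

Calling top_spam_scores ~/Maildir/misc/spam/cur 3 would print the three highest scores, each followed by the path of the message it came from.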

One thought on “spamassassin, procmail, grep yada yada yada”

  1. I’m curious – how’s your spamassassin holding up now that the spammers are onto Bayesian filtering and have started adding "salt" to their spam to fool this sort of filtering?

    I just want to add that TMDA is still holding out at 100% effective :)
