Page last updated:
2003 Sep 23 10:49

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] bogofilter newbie

--On Tuesday, September 23, 2003 07:26:08 -0700 p@dirac.org wrote:

1. update bogofilter's wordlists with every incoming message, using the
   -u option.  if i understand it, -u will first classify the spam, then
   update bogofilter's wordlist.  that seems like asking for trouble.
   if you filter to /dev/null based on bogofilter's output, how do you
   correct mistakes?  and it seems like mistakes here will cause more
   mistakes in the future.

   i assume you do this with:

   | bogofilter -f -p -u -l -e -v

   also, shouldn't there be a "c" in the procmail colon line?  how does
   mail get past this recipe?  isn't it considered "delivered" when an
   email matches a recipe unless you use ":0c"?
A procmail recipe tagged with "f" is a filtering recipe. Procmail pipes the message through the specified program, then continues on using the filtered version of the message. It's not a delivering recipe, so "c" isn't needed.
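
To make the "f" flag concrete, here is a minimal Python sketch of what a filtering step amounts to: the message goes to a program's standard input, and whatever the program writes to standard output replaces the message for the rules that follow. The name run_filter and the stand-in TAGGER command are assumptions made for illustration; they are not part of procmail or bogofilter.

```python
import subprocess
import sys


def run_filter(message: str, command: list[str]) -> str:
    """Pipe a message through a command and return the command's output.

    This mimics a procmail ":0 fw" recipe: the output replaces the message
    for later rules. One difference: when a waited-on filter fails, procmail
    keeps the unfiltered message, whereas this sketch raises an exception.
    """
    result = subprocess.run(
        command, input=message, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in for "bogofilter -p": a tiny program that prepends one header line
# and passes the rest of the message through unchanged.
TAGGER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('X-Bogosity: No\\n' + sys.stdin.read())",
]
```

Calling run_filter("Subject: hello\n\nbody\n", TAGGER) hands back the same message with the extra header in front, which is the version later recipes would see.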

I seeded bogofilter just like you did. I use maildirs for my email, so every message is in a separate file. I built a list of every message less than a year old, divided the messages into spam and non-spam, and piped each set into bogofilter.
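
The seeding step could be sketched in Python roughly as follows. This is not the script I actually used. It assumes a file's modification time is a good enough stand-in for the message's age, and that bogofilter is on the PATH and accepts -s (register as spam) and -n (register as non-spam) with the message on standard input. The function names are made up for the example.

```python
import subprocess
import time
from pathlib import Path

ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def recent_messages(maildir, now=None, max_age=ONE_YEAR):
    """Return message files in a maildir's cur/ and new/ that were
    modified within the last max_age seconds."""
    now = time.time() if now is None else now
    found = []
    for sub in ("cur", "new"):
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue
        for path in sorted(folder.iterdir()):
            if path.is_file() and now - path.stat().st_mtime <= max_age:
                found.append(path)
    return found


def register(paths, spam):
    """Feed each message to bogofilter on stdin, one process per message."""
    flag = "-s" if spam else "-n"
    for path in paths:
        with open(path, "rb") as handle:
            subprocess.run(["bogofilter", flag], stdin=handle, check=True)
```

With spam and non-spam kept in separate maildirs, seeding is then two calls such as register(recent_messages(spam_dir), spam=True) and register(recent_messages(ham_dir), spam=False), where spam_dir and ham_dir are whatever folders you sorted the mail into.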

Incoming mail is piped through this set of rules:

:0 fw
| /usr/bin/bogofilter -u -2 -p -e

# Spam? Save it in the spam folder
# ("spam/" is a placeholder maildir name; substitute your own folder)
:0
* ^X-Bogosity: (yes|spam)
spam/

It's a good idea to collect your spam rather than deleting it. You might want to delete your wordlist one day and build a new one; you'll need a collection of current spam to do that. More important, any time bogofilter makes a mistake you need to correct it, whether it was a false positive or false negative. I can't remember the last time I found non-spam in my spam folder, but it does happen from time to time.

You'll need to find a method of feeding mail back into bogofilter that works for you. I copy the mail into a special mailbox that's swept by a cron job several times per day. These messages are fed back into procmail using a special set of rules:

# Messages labelled spam. Tell bogofilter it's not, and save to INBOX
:0 c
* ^X-Bogosity: (Spam|Yes)
| /usr/bin/bogofilter -Sn

# Messages not labelled spam. Tell bogofilter they are spam.
:0 E c
* ^X-Bogosity: (ham|no)
| /usr/bin/bogofilter -Ns


Note I'm not using bogofilter as a filter this time. Without -p (passthrough mode) it won't output a new copy of the message with the corrected spam header.
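
The decision those correction rules make can be sketched in Python as follows: read the X-Bogosity header that bogofilter added on the first pass and pick the flag that reverses it. The function name correction_flag is hypothetical. The sketch assumes the header value starts with Yes/Spam or No/Ham, as in the rules above. It returns None for anything else rather than guessing.

```python
import re
from email import message_from_string

# Same alternatives as the procmail conditions, matched case-insensitively.
SPAM = re.compile(r"(spam|yes)", re.IGNORECASE)
HAM = re.compile(r"(ham|no)", re.IGNORECASE)


def correction_flag(message_text):
    """Return the bogofilter flag that reverses the message's current
    X-Bogosity label, or None if there is no recognizable label."""
    header = message_from_string(message_text).get("X-Bogosity")
    if header is None:
        return None
    value = header.strip()
    if SPAM.match(value):
        return "-Sn"  # was labelled spam: unregister as spam, register as non-spam
    if HAM.match(value):
        return "-Ns"  # was labelled non-spam: unregister as non-spam, register as spam
    return None
```

A sweep job would call this on each message in the correction mailbox and, when it gets a flag back, run bogofilter with that flag and the message on standard input.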
"We actually do 100,000 pages or more a day in Bork"
-- Marissa Mayer, Google
Kenneth Herron Kherron@newsguy.com 916-366-7338