Page last updated:
2003 Nov 06 10:36

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] Training spamassassin's bayenessian filter


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 06 November 2003 08:58 am, p@dirac.org wrote:
> On Thu 06 Nov 03,  8:29 AM, R. Douglas Barbieri <doug@dooglio.net> said:
> > On Wed, Nov 05, 2003 at 09:59:12PM -0800, Ryan Castellucci wrote:
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > On Wednesday 05 November 2003 09:24 pm, Ken Bloom wrote:
> > > > Will SpamAssassin's bayenessian be more effective if I train it on
> > > > every message that comes through (even ones that it's built in tests
> > > > have already rejected as spam) or only on false negatives?
> > >
> > > Yes, it's much more effective if you train it on all messages.
> >
> > Woah. Dumb question, but when did SpamAssassin go Bayesian? It's one of
> > the reasons I switched away from it to Bogofilter.
>
> i was wondering the same thing.  it's actually a little difficult
> finding references to bayesian filtering on sa's website.  if you do a
> google search, most of the results are on LUG mailing lists.
>
> according the sa site, version 2.5 had it.
>
>
> the version i'm using on one of the accounts i own on someone else's
> machine, 2.43, didn't have it.
>
> that's pretty cool.  maybe someday /. will have a "bayesian filter
> shootout" to see who's most effective.   ;-)   but to be honest,
> bayesian filtering along with lexical parsing seems to be the most
> effective (incoming mail to dirac has both).  sa's lexical filtering,
> for me at least, only catches the most obvious spams.  i've had to bump
> up some of the score results to get anything resembling effective.  i'm
> glad they introduced this new functionality.
>
> pete

The other neat thing SpamAssassin can do with Bayesian filtering is
autolearning. If a message's score is above one configurable threshold
or below another, it automatically trains on that message as spam or
ham, respectively.

For example, here are the headers it added to a message that was
autolearned as ham:

X-Spam-Status: No, hits=-10.9 required=6.0
        tests=EMAIL_ATTRIBUTION,HABEAS_SWE,IN_REP_TO,KNOWN_MAILING_LIST,
              PGP_SIGNATURE,QUOTED_EMAIL_TEXT,REFERENCES,
              REPLY_WITH_QUOTES
        autolearn=ham version=2.55
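
To make the header above a bit more concrete, here is a rough Python
sketch that pulls the fields out of an X-Spam-Status value and then
applies a simplified two-threshold autolearn rule. The function names
and the threshold numbers are made up for illustration, and the real
autolearner applies extra conditions beyond the raw score, so treat
this as a sketch of the idea rather than SpamAssassin's actual logic.

```python
import re

# The header value from the example above, including folded lines.
EXAMPLE = """No, hits=-10.9 required=6.0
        tests=EMAIL_ATTRIBUTION,HABEAS_SWE,IN_REP_TO,KNOWN_MAILING_LIST,
              PGP_SIGNATURE,QUOTED_EMAIL_TEXT,REFERENCES,
              REPLY_WITH_QUOTES
        autolearn=ham version=2.55"""


def parse_spam_status(value):
    """Parse an X-Spam-Status header value into a dict (hypothetical helper)."""
    # Unfold continuation lines and collapse runs of whitespace.
    flat = " ".join(value.split())
    verdict, _, rest = flat.partition(",")
    fields = {"is_spam": verdict.strip().lower() == "yes"}
    # Each field is key=value; a value runs until the next " key=" or the end.
    for key, val in re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", rest):
        fields[key] = val.strip()
    # Convert the numeric fields and split the list of rule names.
    fields["hits"] = float(fields["hits"])
    fields["required"] = float(fields["required"])
    fields["tests"] = [t.strip() for t in fields["tests"].split(",") if t.strip()]
    return fields


def autolearn_decision(score, ham_threshold, spam_threshold):
    """Simplified two-threshold rule; thresholds are assumed, not SA defaults."""
    if score >= spam_threshold:
        return "spam"
    if score <= ham_threshold:
        return "ham"
    return None  # score falls between the thresholds: do not train


status = parse_spam_status(EXAMPLE)
print(status["hits"], status["autolearn"], len(status["tests"]))
# Illustrative thresholds only; check your own configuration for real values.
print(autolearn_decision(status["hits"], ham_threshold=-2.0, spam_threshold=15.0))
```

With the illustrative thresholds, the score of -10.9 falls below the ham
threshold, which agrees with the autolearn=ham field in the header.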

Unfortunately, there is no way for me to train the instance of SpamAssassin
running at my ISP.

- -- 
PGP/GPG Fingerprint: 3B30 C6BE B1C6 9526 7A90  34E7 11DF 44F3 7217 7BC7
On pgp.mit.edu, import with `gpg --keyserver pgp.mit.edu --recv-key 72177BC7`
Also available at http://www.cal.net/~ryan/ryan_at_mother_dot_com.asc
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/qpAcEd9E83IXe8cRAjcrAJ9DJhwHrHHEQROX2cEu0Cr8L1Tx4QCeJjF4
9suAKYZ1USRUSWdfK/x79XA=
=r3R6
-----END PGP SIGNATURE-----
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


