Page last updated:
2007 Jan 04 09:21

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

[vox-tech] SpamAssassin training

SpamAssassin includes a naive Bayesian classifier that can be used to
recognize spam based on keywords (in a probabilistically trained way).
The results of classification using the Bayesian classifier are boiled
down into one of several rules: BAYES_00, BAYES_05, BAYES_20, ...,
BAYES_95, BAYES_99. These rules have statically assigned scores, as do
a whole plethora of other, more complex rules (for things like header
bugs, DNSBLs, body formatting, etc.). The scores for all the rules a
message triggers are added up, and the total is used to determine
whether the message is actually spam.
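
The summing step described above can be sketched in a few lines of
Python. This is only an illustration, not SpamAssassin's own code: the
score values are made up, RCVD_IN_EXAMPLE_DNSBL is a hypothetical rule
name, and the 5.0 threshold assumes the default required_score.

```python
# Illustrative scores only; real values live in SpamAssassin's rule files.
RULE_SCORES = {
    "BAYES_99": 3.5,
    "BAYES_00": -1.9,
    "HTML_MESSAGE": 0.1,
    "RCVD_IN_EXAMPLE_DNSBL": 2.0,  # hypothetical rule name
}


def total_score(triggered_rules, scores=RULE_SCORES):
    """Add up the static scores of every rule the message triggered."""
    return sum(scores.get(rule, 0.0) for rule in triggered_rules)


def is_spam(triggered_rules, threshold=5.0, scores=RULE_SCORES):
    """Call the message spam when its total reaches the threshold."""
    return total_score(triggered_rules, scores) >= threshold
```

Note that the Bayesian classifier contributes just one term to the sum,
through whichever BAYES_nn rule fired.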

The scores for these rules can be customized manually in
~/.spamassassin/user_prefs or systemwide in files in /etc/spamassassin.

Is there any utility for SpamAssassin that could be used to train the
scores for all of its rules automatically, in a Bayesian or
support-vector-machine kind of way? Note that I'm not talking about
training the Bayesian filter, which I just described. I'm curious about
automatically training the step that comes after the Bayesian filter.
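
To make the question concrete, here is a minimal sketch of what such
score training could look like, using a simple perceptron rather than a
Bayesian model or an SVM. It assumes a hand-labeled corpus in which each
message is reduced to the set of rules it triggered. The function
train_scores, its parameters, and the tiny corpus are all hypothetical,
not an existing SpamAssassin utility.

```python
def train_scores(messages, rule_names, epochs=50, lr=1.0, threshold=5.0):
    """Learn one score per rule with a simple perceptron update.

    messages: list of (set_of_triggered_rules, is_spam) pairs.
    """
    scores = {name: 0.0 for name in rule_names}
    for _ in range(epochs):
        for rules, spam in messages:
            total = sum(scores[r] for r in rules if r in scores)
            predicted = total >= threshold
            if predicted != spam:
                # Raise scores on missed spam, lower them on false positives.
                step = lr if spam else -lr
                for r in rules:
                    if r in scores:
                        scores[r] += step
    return scores


# Hypothetical labeled corpus: (rules triggered, is it spam?)
corpus = [
    ({"BAYES_99", "HTML_MESSAGE"}, True),
    ({"BAYES_99"}, True),
    ({"BAYES_00", "HTML_MESSAGE"}, False),
    ({"BAYES_00"}, False),
]
learned = train_scores(corpus, ["BAYES_00", "BAYES_99", "HTML_MESSAGE"])
```

The learned dictionary would then take the place of the statically
assigned scores, while the Bayesian classifier itself is trained
separately, as before.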

--Ken Bloom

-- 
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech

