vetaltm Author
Joined: 05 Feb 2006 Posts: 748
|
Posted: Tue Dec 26, 2006 1:27 pm Post subject: Re: high score on non-spam-mails |
|
|
roland wrote: |
with english-mails or better with non german-mails the plugin works perfectly. but the plugin gives sometimes german-mails a very high score
Code: |
Klassifiziert 71.04%;<forum-mail@opera-info.de>;<*****@macnews.de>;Aktivierung der Registrierung bei Opera-Info.de Forum vornehmen.;20 Dez 2006 22:18:01;
|
and i don't know why
|
The plug-in ships with a pre-trained classification database that gives average filtering quality for English spam. The database doesn't contain enough German samples, so misclassifications like the one you describe are possible.
To improve classification of both English and non-English email, train the plug-in on your existing messages. Use the menu items "Specials | Mark as Junk" and "Specials | Mark as NOT Junk" (or the corresponding AntispamSniper toolbar buttons) to teach the plug-in, and the classification quality will soon improve for messages in all languages.
roland wrote: |
.. also no hits with the black_words list
|
The list of black words is empty by default, and filtering messages by subject is disabled. To make it work:
- Turn on the checkbox "Enable checking subjects of the incoming messages for black keywords" on the "Black keywords in subjects" page.
- Fill the list of black keywords manually or automatically. If filtering by black words is turned on, the plug-in updates the list automatically whenever it learns from an email message: it adds the subject keywords to the list and updates the spam ratio of each word, incrementing the value if the keyword appears in a junk message and resetting it to zero if the keyword appears in a good message.
When classifying a message header, the plug-in selects all subject keywords whose spam ratio is greater than or equal to "Minimum spam ratio for the keyword to use it for blocking". If the number of such keywords is greater than or equal to "Minimum number of black keywords in subject to block the message", the message is classified as spam.
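To make the rule above easier to follow, here is a rough Python sketch of the logic as described in this post. It is not the plug-in's actual code. The class name, the word splitting, counting each keyword once per subject, and the value used for the minimum spam ratio are all assumptions made for illustration; only the increment/zero update and the two thresholds come from the description above.

```python
import re


class BlackKeywordFilter:
    """Illustrative model of the black-keyword rule (hypothetical, not the plug-in's code)."""

    def __init__(self, min_ratio=2, min_keywords=1):
        # min_ratio stands in for "Minimum spam ratio for the keyword to use it
        # for blocking"; the value 2 is an assumption, not the plug-in default.
        self.min_ratio = min_ratio
        # min_keywords stands in for "Minimum number of black keywords in
        # subject to block the message"; 1 is the default mentioned in the post.
        self.min_keywords = min_keywords
        self.ratio = {}  # keyword -> spam ratio

    @staticmethod
    def keywords(subject):
        # Assumption: keywords are lower-cased words, counted once per subject.
        return set(re.findall(r"\w+", subject.lower()))

    def learn(self, subject, is_junk):
        for word in self.keywords(subject):
            if is_junk:
                # Keyword met in a junk message: increment its spam ratio.
                self.ratio[word] = self.ratio.get(word, 0) + 1
            else:
                # Keyword met in a good message: reset its spam ratio to zero.
                self.ratio[word] = 0

    def is_spam(self, subject):
        # Count subject keywords whose spam ratio reaches the minimum ratio.
        hits = [w for w in self.keywords(subject)
                if self.ratio.get(w, 0) >= self.min_ratio]
        return len(hits) >= self.min_keywords


f = BlackKeywordFilter(min_ratio=2, min_keywords=1)
f.learn("Cheap pills online", is_junk=True)
f.learn("Cheap watches", is_junk=True)
blocked_before = f.is_spam("Cheap offer")   # "cheap" has ratio 2, so this is blocked
f.learn("Cheap flights confirmation", is_junk=False)
blocked_after = f.is_spam("Cheap offer")    # "cheap" was reset to 0, so it passes
```

In this sketch a single good message containing a keyword is enough to clear it, which is why marking legitimate mail as NOT Junk quickly removes common words from play.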
You can also download and import the predefined list of black keywords:
http://antispamsniper.com/misc/black_words.txt
This list is not perfect, but it contains the most frequent spam keywords found in subjects. Please test the classification of your existing messages before using the list. "Minimum number of black keywords in subject to block the message" is 1 by default, so even a single word in the subject can produce a false positive. |
|