antispamsniper.com Forum Index antispamsniper.com
The reliable anti-spam protection
 

To train ASS

 
Sacles



Joined: 09 Nov 2007
Posts: 51
Location: Belgium (near Liège)

Posted: Sun Nov 11, 2007 8:36 am    Post subject: To train ASS

Hello,

I can download more than 2000 spam messages (*.eml).

To train ASS, is it useful to import them into The Bat! and then classify them as spam?

Address: http: //foxmail.free.fr/dl/spamagogo/

I found this address on the French-language Foxmail site.
Sacles



Joined: 09 Nov 2007
Posts: 51
Location: Belgium (near Liège)

Posted: Tue Nov 13, 2007 5:22 pm

No reply?
vetaltm
Author


Joined: 05 Feb 2006
Posts: 751

Posted: Wed Nov 14, 2007 4:16 am    Post subject: Re: To train ASS

Sacles wrote:
I can download more than 2000 spam messages (*.eml).
To train ASS, is it useful to import them into The Bat! and then classify them as spam?
Address: http: //foxmail.free.fr/dl/spamagogo/

Yes, it makes sense to train the plug-in on additional spam messages. But please consider the following:
- Make sure that a message is truly spam before training the plug-in on it. Some phishing and "social engineering" spam messages can contain a lot of non-spam text, which may impair the overall classification quality.
- The plug-in must be trained on both ham and spam messages. The algorithm does its best to avoid "overtraining", but it is not good when the database contains too many spam messages and too few ham messages.
- The best classification quality is reached by training the plug-in on its mistakes, i.e. on your own messages that it classified with the wrong spam ratio.
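
To illustrate the last point, here is a minimal sketch of "training on mistakes" in Python. It is only an illustration under stated assumptions: `ToyFilter`, `train_on_mistakes`, the token-count scoring and the 0.5 threshold are all hypothetical and are not the plug-in's actual algorithm.

```python
import math
from collections import Counter


class ToyFilter:
    """Hypothetical token-count filter; NOT AntispamSniper's algorithm."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.trained = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Count each distinct token once per message.
        self.counts[label].update(set(text.lower().split()))
        self.trained[label] += 1

    def spam_ratio(self, text):
        """Return a score in [0, 1]; higher means more spam-like."""
        log_odds = 0.0
        for token in set(text.lower().split()):
            # Laplace-smoothed share of spam/ham messages containing the token.
            p_spam = (self.counts["spam"][token] + 1) / (self.trained["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.trained["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so that math.exp cannot overflow on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))


def train_on_mistakes(flt, labelled, threshold=0.5):
    """Train only on messages the filter currently classifies wrongly."""
    corrected = 0
    for text, label in labelled:
        predicted = "spam" if flt.spam_ratio(text) >= threshold else "ham"
        if predicted != label:
            flt.train(text, label)
            corrected += 1
    return corrected
```

In this sketch, messages that are already classified correctly are skipped, so only the misclassified ones change the counts.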
Sacles



Joined: 09 Nov 2007
Posts: 51
Location: Belgium (near Liège)

Posted: Wed Nov 14, 2007 4:34 am

Hello,

Thank you for this advice.

------------

Isn't there a risk that the learning file becomes too large?

Can I maintain this file (without erasing it completely)?

For example, Spamihilator can compact the learning filter.
vetaltm
Author


Joined: 05 Feb 2006
Posts: 751

Posted: Wed Nov 14, 2007 5:32 am

Sacles wrote:

Isn't there a risk that the learning file becomes too large?
Can't we maintain it (without erasing it completely)?
For example, Spamihilator can compact the learning filter.

Here are some additional points about training to make things clearer:

- The plug-in classifies messages before adding them to the classification database. Ham messages with a low spam ratio and spam messages with a high spam ratio are not added to the main classification database. The plug-in stores these correctly classified messages as "hints" in a separate database.

- Messages from the "hints" database can be added to the main classification database when the plug-in needs to improve the filtering quality after learning some new messages.

- The "hints" database is deleted periodically, whereas the main classification database contains only the minimum subset of learned messages required to provide the best filtering quality. As a result, the database files will not become "heavy" unless it is absolutely necessary.

- When I wrote "overtraining" above, I meant the balance between ham and spam messages in the plug-in databases, not the overall quantity of messages from both classes. The plug-in's decisions are based on the spam and ham samples in its database. If the database contains mostly ham or mostly spam messages, the plug-in cannot distinguish messages of the two classes with high confidence. So it is important to train the plug-in on messages from both classes, so that it "knows" more about the differences between spam and ham.
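
The points above can be pictured with a small Python sketch. This is an assumption-laden illustration, not the plug-in's real storage format: the class `TwoTierStore`, its method names and the 0.2 / 0.8 thresholds are all hypothetical.

```python
class TwoTierStore:
    """Hypothetical model of a main database plus a disposable "hints" store."""

    def __init__(self, low=0.2, high=0.8):
        # Assumed thresholds for "low" and "high" spam ratio.
        self.low, self.high = low, high
        self.main = []   # samples the filter needed to learn
        self.hints = []  # samples it already classified correctly

    def learn(self, text, label, spam_ratio):
        """File a message by how well the current filter already handles it."""
        confident_ham = label == "ham" and spam_ratio <= self.low
        confident_spam = label == "spam" and spam_ratio >= self.high
        if confident_ham or confident_spam:
            self.hints.append((text, label))
            return "hints"
        self.main.append((text, label))
        return "main"

    def promote_hints(self):
        """Move all hints into the main database (e.g. to improve quality)."""
        moved = len(self.hints)
        self.main.extend(self.hints)
        self.hints.clear()
        return moved

    def purge_hints(self):
        """Periodic cleanup: drop the hints, keep the main database."""
        dropped = len(self.hints)
        self.hints.clear()
        return dropped

    def balance(self):
        """Return (spam_count, ham_count) for the main database."""
        spam = sum(1 for _, label in self.main if label == "spam")
        return spam, len(self.main) - spam
```

In this model, importing thousands of spam messages that the filter already recognises would mostly fill the disposable hints store, while `balance()` shows whether the main database has drifted toward one class.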
All times are GMT