Gentoo Forums
Bogotrainer Thread
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Sat May 07, 2005 1:13 pm    Post subject: Bogotrainer Thread

Bogotrainer

I opened a new thread for the Bogotrainer script from the Email System For The Home Network - Howto.

The original script was written by Chris Smith.

I) What is it?
--------------------------------
Bogotrainer is a little helper that automates the training of your bogofilter
spam filter. It takes your mail folder and registers all spam and ham found in
the specified directories in the bogofilter database. On later runs it only
corrects misdetected spam or ham that you have moved to the corresponding
folder.
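
As a rough illustration of the registration step described above, here is a minimal Python sketch. It relies only on bogofilter's documented -s (register as spam), -n (register as ham) and -B (bulk mode, objects given on the command line) options. The function names and the default binary path are hypothetical and are not taken from the bogotrainer source.

```python
import subprocess

# bogofilter registration flags: -s registers spam, -n registers ham.
REGISTER_FLAG = {"spam": "-s", "ham": "-n"}


def build_register_command(kind, folders, bogofilter="/usr/bin/bogofilter"):
    """Return the argv list that registers every folder in `folders` as `kind`.

    -B (bulk mode) lets bogofilter take mail files or maildirs as arguments.
    The function name and default path are illustrative assumptions.
    """
    if kind not in REGISTER_FLAG:
        raise ValueError("kind must be 'spam' or 'ham'")
    return [bogofilter, REGISTER_FLAG[kind], "-B", *folders]


def register(kind, folders):
    """Run bogofilter and return its exit status without interpreting it."""
    return subprocess.run(build_register_command(kind, folders)).returncode
```

Calling register("spam", ["/home/user/.maildir/.Spam"]) would then hand the whole folder to bogofilter in one process, which is presumably what the fast registering mode in the changelog is about.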


You can find all the information on the project homepage: Bogotrainer-Homepage

If you have any suggestions, questions or problems, don't hesitate to post here. I will do my best!

News:
07.05.05 Bogotrainer 3.0b released

Changelog:
2005-05-07
* Version 3.0b
* added support for multiple spamfolders
* independent spam correction folder
* fixed problem running as cronjob
* moved configuration to config.py
* completely rewritten in OOP
* added support for fast mail registering (bogofilter bulkmode)
* added possibility of logging to logfile
* added support for forcing database overwrite with backup
* debug mode possible
* silent mode possible
* added support for commandline interface
* added version information in commandline
* added checkmode

2005-03-22
* Version 2.1.1
* fix for special characters in imap folder names

2005-03-06
* Version 2.1
* moved md_bgt.py to md_output.py
* added more tests for checking if all needed directories exist
* moved directory tests to md_dirtest.py, for more clarity
* added support for imap directories containing whitespaces
* fixed broken specific Hamfolder (.Ham) support
* fixed missing import sys in md_output.py

2005-01-06
* Initial public release 2.0
dgrant
Apprentice


Joined: 28 May 2003
Posts: 158
Location: Vancouver, BC, Canada

Posted: Tue May 10, 2005 7:06 pm    Post subject:

The original bogotrainer was so simple. I could read the code and figure out exactly what was going on. Now bogotrainer is so long that I started to look at it but just don't have the patience. What does it do, compared to the original, that I really need? And does it filter out the courierimaphieracl and courierimapkeywords directories yet? I had to modify the script to do this.
dgrant
Apprentice


Joined: 28 May 2003
Posts: 158
Location: Vancouver, BC, Canada

Posted: Tue May 10, 2005 7:10 pm    Post subject:

That being said, bogotrainer is wonderful. But I wonder if the new one is too complex for new users to get up and running. The simple one I copied and pasted from the HOWTO was so easy to get going and so easy to debug when I noticed there was a small problem. I will probably try the new version eventually.
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Tue May 10, 2005 11:20 pm    Post subject:

OK, you are right, it's much more complex!

When I started with the Email Howto and bogotrainer, I ran into the same problem you mentioned. And you know, I hadn't looked into Python before. But hey, the code was cool; I understood it while knowing almost nothing about the language. So I corrected the problem with the Courier folders, and I liked Python so much that I decided to learn it. But what could I do to practice? I started writing nice console output for bogotrainer and decided to post it for other people who perhaps wanted a version that runs without modification. But then there were some problems, and some people asked me for support for special characters and other features. So I started writing more and more functions, and that's why it is now that complex. But hey, it should work! The main part of the script is output, logging, debugging, checks for existing folders, etc. The important functions are still the same. It is now completely object-oriented, so it should be readable.

Here is a small overview:

bogotrainer.py - creates the instances of the objects (OOP Python)
md_io.py - just input/output for the console
md_tests.py - checks for existing directories (mailfolder, bogofilter, spam correction folders, etc.)
md_trainer.py - the really important functions (spam registering, ham registering, correcting messages)

I know what you mean: it is difficult to read code written by someone else once it is larger than a few lines! So that's the disadvantage of the features I have added. Most people won't need them, but for people who haven't got programming experience it is easier to just add folder names to the config file than to hack them directly into the script ...

additional features (compared to original script):
* fully configurable through the config.py file
(e.g. the courierimaphieracl and courierimapkeywords directories, or others in future: just put them there)
* multiple spamfolder support
* support for list of folders to ignore
* fast mail registering for huge amounts of mail (-s on commandline)
* nice output
* many checks if all necessary directories exist
* logging possibility
* debug possibility
* support for special characters in foldernames
* backup database
* possibility to create new database even if old exists (-f forcemode on commandline)
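
To make the ignore-list idea from the feature list concrete, here is a minimal Python sketch of how folders could be enumerated while skipping the Courier metadata directories named earlier in the thread. It assumes a Maildir layout in which every real mail folder contains a cur subdirectory. The names IGNORED and mail_folders are hypothetical and do not come from bogotrainer's config.py.

```python
import os

# Directory names mentioned in the thread; Courier keeps metadata there.
# The constant name is an illustrative assumption.
IGNORED = frozenset({"courierimaphieracl", "courierimapkeywords"})


def mail_folders(maildir, ignored=IGNORED):
    """Return the sorted names of subdirectories that look like mail folders.

    A subdirectory counts as a mail folder when it contains a `cur`
    directory and its name is not listed in `ignored`.
    """
    folders = []
    for name in sorted(os.listdir(maildir)):
        path = os.path.join(maildir, name)
        if name in ignored or not os.path.isdir(path):
            continue
        if os.path.isdir(os.path.join(path, "cur")):
            folders.append(name)
    return folders
```

Passing a larger set, for example IGNORED | {".Trash"}, would keep additional folders out of the training run without touching the code.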

I'm really happy Chris wrote the original script; it is much more readable than my code is now. Adding the features was a learning process for me, and I just wanted to post it here in case it is useful for someone. I don't want to convince anybody to use this script instead of the original! All thanks to Chris!!

But thanks for your post! I really understand what you mean!!

cu

vale
asimon
l33t


Joined: 27 Jun 2002
Posts: 979
Location: Germany, Old Europe

Posted: Wed Jun 08, 2005 10:22 pm    Post subject:

Cool! It would be nice to have this included in the bogofilter ebuild.
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Wed Jun 08, 2005 10:39 pm    Post subject:

:D

I don't know about ebuilds :oops:
slomo
n00b


Joined: 10 Jan 2004
Posts: 27

Posted: Thu Jun 09, 2005 3:53 pm    Post subject:

I had used bogofilter for some time, running all the scripts to teach it what was spam/ham and whatnot. It finally got to the point where there were more false positives and more spam slipping through, so I ditched it and went back to straight procmail filtering rules, which are much simpler in my judgement.
I would recommend sticking with something that is easy to understand and needs no training. Procmail is pretty smart with some recipe setup; just DAGS on procmailrc.
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Thu Jun 09, 2005 6:36 pm    Post subject:

No false positives here, and about 1 undetected spam per day out of ca. 100 daily mails ...

Not a bad result for me!

cu

vale
rpmohn
Tux's lil' helper


Joined: 26 Aug 2003
Posts: 116
Location: Vermont

Posted: Thu Feb 09, 2006 4:46 pm    Post subject:

Nice! I just upgraded from some unknown bogotrainer version from Dec. 2003 to v3.0b. I like the logging feature very much.

I use this in my procmail right now:
Code:
:0fw
| /usr/bin/bogofilter -l -u -e -p -o 0.6
The "-o 0.6" adjusts the cutoff point between what's spam and what's not, and the "-l" writes logging to the syslog. I use the syslog logging to see how many HAMs and SPAMs I get each day, and now I'm hoping to use your bogotrainer log to track how many FNs and FPs are identified each day.
-RPM
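
Because the recipe above runs bogofilter with -p (passthrough), each delivered message carries an X-Bogosity header, so a daily tally could also be taken from the mail itself rather than from syslog. The following Python sketch shows one way to do that. It assumes the header starts with the verdict word followed by a comma, as in the "X-Bogosity: Spam, tests=bogofilter" form; the exact wording may differ between bogofilter versions and configurations, and the function names are hypothetical.

```python
from collections import Counter
from email import message_from_string


def bogosity(raw_message):
    """Return the verdict word from the X-Bogosity header, or None if absent."""
    header = message_from_string(raw_message).get("X-Bogosity")
    if header is None:
        return None
    # The verdict precedes the first comma: "Spam, tests=bogofilter, ..."
    return header.split(",", 1)[0].strip()


def tally(raw_messages):
    """Count verdicts over an iterable of raw message strings."""
    return Counter(bogosity(message) for message in raw_messages)
```

Messages without the header are counted under None, which makes it easy to spot mail that never passed through the filter.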
dgrant
Apprentice


Joined: 28 May 2003
Posts: 158
Location: Vancouver, BC, Canada

Posted: Fri Feb 10, 2006 3:54 am    Post subject: Just upgraded as well

I just upgraded to bogotrainer 3.0b from some old unknown version. It works great! Upgrading was easy because the paths I used were the same as in the original, and 3.0b uses the same paths again. Excellent program.
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Sat Feb 11, 2006 6:12 pm    Post subject:

nice to read that someone still uses it :-)

let me know if you have problems or want something implemented ...

salu2

vale
Nimo
Tux's lil' helper


Joined: 23 Nov 2003
Posts: 111

Posted: Sun Sep 03, 2006 9:36 am    Post subject:

Would it be possible to add support for maildrop as the mailfilter instead of procmail?
_________________
//Nimo
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Sun Sep 03, 2006 10:55 am    Post subject:

hey,

bogotrainer does not care about your mailfilter. I just didn't mention other mailfilters in the documentation because I don't use them.
So it should work as described on the Bogofilter homepage: http://bogofilter.sourceforge.net/man_page.shtml
Quote:

This one is for maildrop, it automatically defers the mail and retries later when the xfilter command fails, use this in your ~/.mailfilter:
Code:

xfilter "bogofilter -u -e -p"
if (/^X-Bogosity: Spam, tests=bogofilter/)
{
  to "spam-bogofilter"
}


Change "spam-bogofilter" to your spam directory!

good luck

bubbas
Nimo
Tux's lil' helper


Joined: 23 Nov 2003
Posts: 111

Posted: Sun Sep 03, 2006 12:19 pm    Post subject:

But procmail is called twice inside md_trainer.py. What are the right arguments to pass to maildrop there?
_________________
//Nimo
bubbas
n00b


Joined: 29 Dec 2003
Posts: 36
Location: Germany

Posted: Mon Sep 04, 2006 11:18 pm    Post subject:

hey,

Yes, sorry! You are absolutely right. So much time has passed since I wrote bogotrainer....

I've looked at it. Unfortunately I'm far away from my Gentoo machine at the moment and I have never used maildrop.
I think the following should do it, as far as I understand the maildrop man page:

Edit your ~/.mailfilter as described in my post above,

then in the md_trainer.py file replace both lines:
Code:
os.system("/usr/bin/procmail -d $USER < "+msgpath_esc)


with
Code:
os.system("/usr/bin/maildrop -d $USER < "+msgpath_esc)


I hope that does the job. Test it with some mails in your spam and ham folders ...
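
For anyone who prefers not to hard-code the delivery agent, the replacement above could be generalised along the following lines. This is only a sketch under assumptions: both procmail and maildrop accept "-d user" for delivery, the function names are hypothetical, and nothing here is taken from md_trainer.py. Using an argument list with subprocess instead of a shell string also means the message path needs no escaping.

```python
import subprocess


def delivery_command(agent, user):
    """Build the argv for a delivery agent called as `agent -d user`.

    `agent` might be "/usr/bin/procmail" or "/usr/bin/maildrop".
    """
    return [agent, "-d", user]


def deliver(msgpath, command):
    """Feed the message file to `command` on stdin and return its exit status."""
    with open(msgpath, "rb") as message:
        return subprocess.run(command, stdin=message).returncode
```

A call such as deliver(path, delivery_command("/usr/bin/maildrop", "alice")) would then replace each os.system line; whether maildrop permits -d for the calling user depends on the local setup.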

I will add support and test it next month!

good luck :)

vale
Decibels
Veteran


Joined: 16 Aug 2002
Posts: 1623
Location: U.S.A.

Posted: Mon Sep 04, 2006 11:45 pm    Post subject:

I've been using bogofilter for a long time with KMail. It works pretty well. I have never had a false positive yet. Still, I have the spam sent to trash, just to take a look and make sure. I've been thinking about just deleting it now that it has worked well for so long.

I also set up a cronjob to train it, so I just add new spam it misses once in a while to the isspam folder and don't have to worry about training, because it's done once a day by the cronjob.
_________________
“Support bacteria – they’re the only culture some people have.”

– Steven Wright