Spam/Ham

Stefan Froehlich stefan at ffa-it.com.au
Sun Dec 30 01:42:49 CET 2012


Personally, I don't like the way of handling spam and ham that is 
described in the documentation. No user is willing to sort real spam 
into a spam folder and real ham into a ham folder.
I'd like to implement another approach that is easier for the user: 
learn all mails in the user's spam folder as spam and all mails not in 
the spam folder as ham. I think this is the natural, intuitive way for 
a user.
I created a sieve filter that moves all mails tagged with X-Spam-Flag: 
YES to the user's spam folder. If a user sees a message in the spam 
folder that he thinks is not spam, he simply moves it out of there to 
the inbox (or a subfolder). All such messages should be learned as ham.
I started writing a bash script to handle all of this. To avoid 
relearning messages, the idea is to insert an additional X- header into 
each mail, say X-<ServerName>-Learned-As:, whose possible values are 
spam or ham. A message in the spam folder that already carries 
"X-<ServerName>-Learned-As: spam" should not be learned again (sa-learn 
would ignore it anyway, but skipping it saves time). To do this I need 
to access the mail files somehow. Currently I do this directly on the 
file system.
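
To make the marker idea concrete, here is a rough sketch in Python 
rather than bash. It is only an illustration under assumptions: the 
header name X-Example-Learned-As stands in for the unspecified 
X-<ServerName>-Learned-As, and the helper names are made up. It works 
on raw message bytes and deliberately does not write anything back to 
the mail store.

```python
import email
from pathlib import Path

# Hypothetical header name; the real one would embed the server name.
LEARNED_HEADER = "X-Example-Learned-As"


def learned_as(raw: bytes):
    """Return 'spam' or 'ham' if the message carries the marker, else None."""
    msg = email.message_from_bytes(raw)
    value = msg.get(LEARNED_HEADER)
    return str(value).strip().lower() if value else None


def needs_learning(raw: bytes, folder_class: str) -> bool:
    """True if the message has not yet been learned as folder_class."""
    return learned_as(raw) != folder_class


def mark_learned(raw: bytes, folder_class: str) -> bytes:
    """Return a copy of the message whose marker is set to folder_class."""
    msg = email.message_from_bytes(raw)
    del msg[LEARNED_HEADER]          # drop any stale marker (no-op if absent)
    msg[LEARNED_HEADER] = folder_class
    return msg.as_bytes()


def sa_learn_command(path: Path, folder_class: str) -> list:
    """Build the sa-learn call for one file; running it is left to the caller."""
    return ["sa-learn", "--" + folder_class, str(path)]
```

A message moved from the spam folder to the inbox would still carry 
"spam" as its marker, so needs_learning(raw, "ham") returns True and 
the message gets relearned as ham.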

I have now run into several problems (everything runs on Debian Wheezy):
1) A folder does not necessarily contain only valid, undeleted mail 
files. If a user moves some mails out of the spam folder, the mail 
files are still present in the spam folder's directory. I can't see how 
to distinguish between real mail files and those that have already been 
deleted but not yet removed from the filesystem.

2) If I change a file at the filesystem level, how can I let Cyrus know 
so that it is aware of the change?

3) I somehow corrupted my spam folder. The standard installation of 
Kolab doesn't install the reconstruct binary, so I was unable to recover 
this folder. Where do I find the reconstruct binary?

4) To increase performance I'd like to react to a user's move request 
instead of scanning all mail folders. Is there a way to run a script 
when a user is about to move a message, or has just moved one?

Kind regards, Stefan Fröhlich
42 ;-)



