End user false positive spamassassin administration using webmail?

Thu Nov 8 12:46:50 CET 2007

Hi,

We currently run a Kolab server (2.0.4) in a mixed-host environment
(Windows XP, Linux and Macs, using POP3 and IMAP with a variety of mail
clients).  I'm looking into SpamAssassin options at the moment.  I
understand that the problem below is not Kolab-specific, but I think it's
likely that other Kolab users have run into something similar.  I'm sorry
if this is a little long-winded, but I want to cover everything I have
considered so far.

Our webmail server is on a different box and subnet from our main mail
server.  It is basically a web-based IMAP client.  Most of our users
download their mail via POP3, leaving a month's worth of email on the
server so they can access it via webmail while off site.  There is a long
history of POP3 use, and migrating all the Outlook users (approx 90 users
in total) to IMAP would be a long and laborious process.

I hope to use the webmail IMAP client as a user spam administration page.
The plan is that I would set up a folder for each individual, e.g.
'myspam', and divert into it the messages that Kolab's implementation of
SpamAssassin flags as spam.  For the foreseeable future, users could log
onto webmail and move any false positives back into their inbox for
download.

The difficulty with this is that moving a message back by hand does not
teach SpamAssassin that the message was not spam.

The way I see it I have the following options:

Don't teach SpamAssassin about false positives.

Rely on me running sa-learn --ham on just my false positives (this assumes I
get the same kind of email as the sales department, for example, which I
don't).

Run sa-learn --ham on users' inboxes at some point during the day, in the
hope that there are no false negatives left in them and that each user has
already checked their spam folder via webmail.

Add a button to the webmail interface that says 'is_not_spam' and:

1) Find some way of calling SpamAssassin remotely from the webmail server for
specific messages on the mail server (I'm not going to do this - too many
security concerns).

2) Run a copy of SpamAssassin on the webmail machine, and find a way to update
the 'master' SpamAssassin database from the webmail server.
This is awkward, because the webmail solution loads emails into a MySQL
database when users log in and caches them there, so I'd be running sa-learn
against modified copies of the messages in a DB, or I'm back to number 1.

3) Have an IMAP folder called 'notspam' that the 'is_not_spam' button moves
messages into.  This could be polled every 5 minutes and any messages found
would be passed to sa-learn.
The difficulty here is that users want their email right away.  I guarantee
you that if someone is waiting on a message that was incorrectly moved to the
spam folder, by the time they realise it has been diverted they will not be
prepared to wait 5 minutes.  Also, I haven't found a straightforward way of
automatically moving those messages from the 'notspam' folder back into the
inbox for POP3 download.  The move would have to be done via IMAP, or the
cyrus.* index files would be inconsistent until I could run a cyrreconstruct.
Asking users to move messages manually from the 'notspam' folder into the
inbox would basically make this whole function useless.  People will just
move the messages straight back into the inbox and bypass the 'is_not_spam'
option altogether.
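
For option 3, the poll, learn and move-back step could look roughly like the
sketch below.  It is only an illustration under assumptions: the function
names and the idea of handing in an already logged-in imaplib connection are
hypothetical, the folder names are the ones from the plan above, and it
assumes sa-learn is on the PATH of the account that owns the Bayes database
(the path may need adjusting on a Kolab install).  The messages are copied
with IMAP COPY, so Cyrus updates its own index files and no reconstruct
should be needed.

```python
import imaplib      # used by the caller to build the logged-in connection
import subprocess

NOTSPAM_FOLDER = "notspam"  # folder the 'is_not_spam' button moves mail into
INBOX_FOLDER = "INBOX"      # folder the POP3 clients download from


def learn_ham(raw_message: bytes) -> bool:
    """Feed one raw message to `sa-learn --ham` on stdin; True on exit code 0."""
    result = subprocess.run(
        ["sa-learn", "--ham"], input=raw_message, capture_output=True
    )
    return result.returncode == 0


def release_false_positives(conn, learn=learn_ham) -> int:
    """Learn each message in NOTSPAM_FOLDER as ham, then move it to the inbox.

    `conn` is assumed to be an authenticated imaplib.IMAP4 (or IMAP4_SSL)
    object for one user.  Returns the number of messages moved.
    """
    moved = 0
    status, _ = conn.select(NOTSPAM_FOLDER)
    if status != "OK":
        return moved
    status, data = conn.search(None, "ALL")
    if status != "OK" or not data or not data[0]:
        return moved
    for num in data[0].split():
        # BODY.PEEK[] fetches the whole message without setting \Seen.
        status, parts = conn.fetch(num, "(BODY.PEEK[])")
        if status != "OK" or not parts or not isinstance(parts[0], tuple):
            continue
        # The result of learning is ignored on purpose: a failed sa-learn
        # run should not hold the user's mail back.
        learn(parts[0][1])
        status, _ = conn.copy(num, INBOX_FOLDER)
        if status == "OK":
            # Only flag the original for deletion once the copy succeeded.
            conn.store(num, "+FLAGS", "\\Deleted")
            moved += 1
    conn.expunge()
    return moved
```

Something like this would be run from cron for each user.  It does not fix
the waiting problem by itself: the delay is still whatever the polling
interval is.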

I'm sure lots of other people are in this situation.  I'd welcome any 
thoughts.  

I will also set up a shared 'ourspam' folder that I can moderate as per the 
wiki suggestions, specifically for sa-learn --spam.

Regards,

Simon