spam filters as a general sorter

Hui Zhou zhouhui at
Sat Mar 12 06:17:47 PST 2005

I have been reading Paul Graham's essays on spam filters and am amazed
at the effectiveness of his statistical filters.

I haven't encountered a big spam problem (I guess I am not popular
enough yet). However, I do have a huge amount of mail coming into my
mailboxes: tons from mailing lists, and quite a few from my banks, my
universities, my friends, and a bunch of opt-in promotions, alerts,
etc. Most of it doesn't qualify as spam, but a large percentage of it
I don't want to read promptly, and some portion I only read from time
to time and skip most of the time.

My current strategy is to use procmail to sort my mail into different
mailboxes (over a dozen atm and growing). However, it still annoys me
because, for example, my most often read mailbox, lfschat, still
contains only a very small portion of mail that I am really
interested in reading.

So while reading Paul's essay, I got this idea: apply the statistical
filter to all my mail, sorting it not into just two categories but
into several, such as Spam, Interesting, Advertisement, AccountUpdate,
StrangeLogEventsAndAlerts, PrivateMustRead, MildInterest,
LeastInterest, etc.
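
A minimal sketch of that idea in Python, under some loud assumptions:
this is not the combining formula from Paul's essay, which only scores
a message as spam or not spam. It swaps in a plain multinomial naive
Bayes with add-one smoothing, since that extends to any number of
categories. The class name, the crude tokenizer, and the training
messages are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Crude tokenizer (an assumption): lowercase runs of letters,
    # digits, and a few symbols that often matter in mail.
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class MultiCategoryFilter:
    """Hypothetical multi-category sorter using naive Bayes."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # category -> token -> count
        self.message_counts = Counter()           # category -> messages seen
        self.vocabulary = set()                   # every token seen in training

    def train(self, category, text):
        tokens = tokenize(text)
        self.token_counts[category].update(tokens)
        self.message_counts[category] += 1
        self.vocabulary.update(tokens)

    def scores(self, text):
        # Returns a log-probability score per trained category.
        tokens = tokenize(text)
        total_messages = sum(self.message_counts.values())
        # The +1 reserves one slot for tokens never seen in training.
        vocab_size = len(self.vocabulary) + 1
        result = {}
        for category, n_messages in self.message_counts.items():
            counts = self.token_counts[category]
            total_tokens = sum(counts.values())
            log_prob = math.log(n_messages / total_messages)
            for token in tokens:
                # Add-one smoothing so an unseen token cannot zero out
                # a whole category.
                log_prob += math.log(
                    (counts[token] + 1) / (total_tokens + vocab_size)
                )
            result[category] = log_prob
        return result

    def classify(self, text):
        # Picks the highest-scoring category; needs at least one
        # trained message, otherwise max() raises ValueError.
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Toy usage with invented messages.
demo = MultiCategoryFilter()
demo.train("Spam", "cheap pills buy now limited offer")
demo.train("AccountUpdate", "your account statement is now available")
demo.train("Interesting", "patch for the kernel build script attached")
print(demo.classify("new statement for your account"))
```

With real mail, each category would be trained on the messages
already sitting in the corresponding mailbox, and working in log
space keeps long messages from underflowing to zero.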

Presumably the simple-minded token treatment in Paul's essay will not
be as effective on non-spam categories, but without actually trying
it out, who knows, it may amaze me.

Any comments? 


Hui Zhou

