SIEVE VS SPAM (AND OTHER TECHNICAL UPDATES)
It’s been a while since I’ve actually written anything for my own site. You’ve all been taken care of recently by those who can only be described as co-content directors, (r?)(s?)hollen and afischer. Given that, I decided to use my time before class this morning to update you on a few things I’ve found to do with myself lately.
First, mail. Since I’m no longer in the IT world and don’t run my own mail, I’ve found it difficult to kill all the spam I’d like to. Alec recently posted an article over at thened.net on this issue. Unfortunately, since I don’t run my own MXes anymore, his approach was more or less off-limits to me. Luckily I’ve been using fastmail to manage my vdov.net email, and at least they let me Sieve it up a bit. Sieve (RFC 3028) is an interesting little language used for mail processing on the server side. In the past, when I ran my own MXes, I used postfix, SpamAssassin and procmail to do most of my mail processing.
Although I’m really not that happy with Sieve, I did succeed in creating a reasonable rule set that blocks about 99% of the spam I receive. I also played around with rss2email, a nice little RSS application that is quite configurable and e-mails me new articles. I got this idea from Alec, who also uses rss2email. At the top of my Sieve script, I file away all my RSS articles into separate folders, and then I go after spam. I won’t recreate the entire thing below, but I’ll give you the highlights.
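One thing excerpts like these leave out is the `require` statement that has to open any Sieve script using extensions. A minimal sketch of what the actions and tests shown in this post would need, assuming the server advertises these capabilities (the actual script's list may well differ):

```sieve
# Assumed extension list, inferred only from the snippets in this post:
#   fileinto                    -> the fileinto action
#   reject                      -> the reject action
#   relational                  -> the :value "ge" match type
#   comparator-i;ascii-numeric  -> numeric comparison of the spam score
require ["fileinto", "reject", "relational", "comparator-i;ascii-numeric"];
```

Without the matching `require`, a conforming Sieve implementation should refuse the script rather than silently skip the rule.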
###STUFF I ACTUALLY WANT
if allof (
address :is "From" "feeds@vdov.net",
header :contains "from" ["Chem", "RSC", "Science", "Aston",
"Physics"]
) {
fileinto "INBOX.rss.chem";
stop;
}
There are more entries for all of my RSS categories, but they more or less look the same as the one above. Then I get rid of all messages over 10 megabytes, reject the most dangerous attachment types and completely block all mail from a series of spam-littered countries.
###BIG MESSAGES & ATTACHMENTS
if size :over 10000K {
reject "over my 10MB limit. please contact me about sending this.";
stop;
}
if header :contains "x-attached"
[".exe",".bat",".js",".com",".cmd",".ini", ...]
{
reject "i do not accept attachments of this type.";
stop;
}
###DOMAINS THAT SUCK
if header :contains ["from", "received"]
[".ru ",".jp ", ".kr ", ".pt ",".pl ",".at ",".cz ",
".ru>",".jp>", ".kr>", ".pt>", ".pl>",".at>",".cz>"]
{
reject "i do not accept mail from your country.";
stop;
}
Then I simply discard anything that mentions a domain that has historically sent me spam.
###GLOBAL DISCARD
if anyof (
header :contains ["from", "Received", "X-Sender", "Sender",
"To","CC","Subject","X-Mail-from"]
[ "123greetings", "allfreewebsite.com", ...],
header :value "ge" :comparator "i;ascii-numeric" ["X-Spam-score"]
["16"]
)
{
discard;
stop;
}
Etc., etc., etc. The last punch is to use SpamAssassin scores to grab anything that is suspected of being spam and put it in my spam folder. Above I’ve already discarded anything with a score of 16 or above; anything left that scores 5 or higher gets filed away instead. Since I’ve implemented these rules, I’ve gone from about 100 spam messages a day to no more than 1. A pretty nice 99% decrease.
###GLOBAL FILE
if header :value "ge" :comparator "i;ascii-numeric"
["X-Spam-score"] ["5"]
{
fileinto "INBOX.spam";
stop;
}
While this remains a strikingly inelegant solution, it does work, and I’ve been pretty happy with it so far.
The last and most important note of the day is that smashy is nearing 300 days of uptime. It will hit that mark late tomorrow morning. I thought this was something to really write home about until I logged into polar (at Bowdoin) yesterday and saw this …
polar> uptime
10:14 up 428 days, 15:52, 8 users, load average: 0.26, 0.14, 0.11
Damn.
you’ll also notice that i just put up a post preview feature. i’m not sure if i like it yet but i’ll let it stand for a day or two and see what people think. this is in lieu of me doing a proper site redesign, which i don’t have the time for quite yet.
ac;
I DO like the preview. Will we ever see a return of the categories section?
I really like the preview and it will give us much better data on the site views (although RSS feeds will still be a little weird on the stats). I would also like to see the categories back. For example, yesterday my friend wanted to see just the interviews and asked if there was an easy way without scrolling through everything.
PS: How does it determine the length of the preview?
generally: first paragraph. so, for those of you posters out there, that means that i’d prefer NO IMAGES in the top left corner unless your first paragraph is long enough to support it without looking ridiculous.
i probably wont put up the categories section again until i actually do the site redesign.
I like the preview as well. I’m not sure what this categories section is… putting the categories into sections, but I could be waaay off with that…
You could allow more articles to be on the front page with this design. It looks like there are the same number of articles now, but adding more to the front page wouldn’t look bad.
notice the new links ordering and page ordering, which is not supported in wordpress and was quite annoying to get working. shawna requested a reordering and i … well, i was bored waiting around for group meeting tonight.
no, you just like to make me happy.
ha. good one.
also during the course of doing this i got rid of that damn xhtml javascript nicetitle function which really is annoying with most browsers. if you scroll over a post title now you won’t be nearly as annoyed.
also, to update afischer about how it decides what to use as a preview. essentially, it picks the number of leading paragraphs whose total word count is closest to 50 words. that is, absolutevalue(#words-50) is minimized. it only considers whole paragraphs. so, it always includes the first paragraph, and MAYBE the 2nd if absolutevalue(#words-50) for the first two together is smaller than that for just the first paragraph.
this of course makes it display almost always just the first paragraph. it isn’t often that the 2nd paragraph is included, unless you’re writing pretty damn short paragraphs. i toyed with a 100 word target but i thought it included too much.
so, in conclusion, from now on … make your first paragraph count. it’s the thing people will see on the main page.
Word, so I am also assuming it will only show a picture if it is completely bounded by the first paragraph (or two very short ones).
no … not quite, that’s why we have to rethink pictures. you can only include them if your text in the first paragraph is long enough to wrap below the picture. otherwise, put it somewhere else, because it will be displayed and it will look weird.
Sorry, that is what I meant by “bounded”
umm … 300 days anyone?
300 days is a long time. I’m surprised you don’t have prosthetic hips by now.
I’m hungry.
and who doesnt love reality TV. I know vdov is a never ending ecstasy ride. But 8th & Ocean, how good looking are those people??????
If i knew one person that loves reality tv it would be Anthony Beardsworth Costa, vdov’s Ringmaster
oh wow. first rhollen: i’m pretty sure smashy does have prosthetic hips by now … if i shut it down, it would probably never be able to stand back up.
bgreenle: shut it.
all: i am also almost ready for a prosthetic hip … 23 in 5 days. nearly out of my early 20s …
oh i almost forgot … i passed my last cumulative exam. 5 in 8 tries. not exactly the most stellar record, especially compared to some in my group (5 in 3, 4 tries), but nothing to sneeze at. at least i’m done 2 tests before the end of my first year.
honestly who “sneezes at” anything these days.
oooooh !! commenting from my treo. I am very cool. I should say that the article stubs make it much easier to navigate vdov via hand-held device. Bravo. However, it is also true that the sidebar links present themselves at the bottom of the whole site and don’t look all that nice.
the CSS/site isnt designed for handheld devices … this is a static width page.
well, hey. get with the times.
also, 10x congrats on your cums. Can I inherit your other chances for when I’m in grad school?
Dddddaaammnnnn technology is really where it’s at. Think NASA will buy me a treo? I could “consult” for them and do cool web 2.0 things for them. Such as telecommute via an online desktop… very AJAXy
no.
I am so sorry you have to use Sieve.
i was waiting for alec to comment on this one. when i was writing this i contemplated saying “While this remains a strikingly inelegant solution, it does work, and I’ve been pretty happy with it so far. Alec I’m sure will make fun of me for this, but I don’t have another option right now.”
You’d, however, be proud of what I’m doing with emacs/viper/latex and C right now.
I don’t know of a good user-configurable server-side mail sorting thingie – sieve is the only game in town?
So costa, you are saying that you are proud of what you are doing with “emacs/viper/latex and C right now”? I am not sure what to think of this except that you must have quit IT to get started in the porn business. Anyways I hope all is going well.
haha … no, well, i did quit IT but i’m still doin calculation-based stuff … so i still get my fix in some way or another.