So I’ve just read this bug (way too late, I know..) and wondered why I think about these things a bit differently. Maybe it’s a good thing, maybe it’s bad, but this has been killing the badsite*foo.tld spam since day one for me, about 5 days now IIRC. The idea is that it spots any weirdness in a URL before the domain name terminator (or the end of the string if one is not present).
Adjust your score as you see fit. It will FP on IDNs and such.
Feel free to drop me your masses results for it in a comment.
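
For anyone who wants to play with the idea outside of a rule file, here is a rough Python sketch of it. This is not the actual rule, just an illustration: the function name `has_weird_host` and both patterns are made up for the example, and I'm assuming "weirdness" means anything other than ASCII letters, digits, dots and hyphens (plus an optional port) between the `://` and the first `/`, `?` or `#`, or the end of the string.

```python
import re

# Assumed pattern: a scheme, then everything up to the first '/', '?' or '#'
# (the "domain name terminator"), or to the end of the string if there is none.
HOST_RE = re.compile(r"^[a-z][a-z0-9+.-]*://([^/?#]*)", re.IGNORECASE | re.ASCII)

# Assumed definition of a "normal" host: ASCII letters, digits, dots, hyphens,
# optionally followed by a numeric port.
PLAIN_HOST_RE = re.compile(r"[a-z0-9.-]+(?::\d+)?", re.IGNORECASE | re.ASCII)


def has_weird_host(url: str) -> bool:
    """Return True if the host part of an absolute URL contains odd characters."""
    m = HOST_RE.match(url)
    if m is None:
        return False  # no scheme://, so there is no host part to inspect
    host = m.group(1)
    # Anything that is not a plain ASCII host (including an empty one) counts as weird.
    return PLAIN_HOST_RE.fullmatch(host) is None


print(has_weird_host("http://badsite*foo.tld/"))   # True
print(has_weird_host("http://badsite*foo.tld"))    # True (no terminator)
print(has_weird_host("http://example.com/a*b"))    # False (the * is after the host)
print(has_weird_host("http://bücher.example/"))    # True (the IDN false positive)
```

Note that this sketch also flags things like `user@host`, which you may or may not want.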


2 Comments Received
January 26th, 2007 @10:22 am
That 404’s. Why don’t you just paste it to the bug?
Also, yes, a bit late.
January 26th, 2007 @8:42 pm
Oops – missing a /. Sorry about that. The bug was closed, I think.