Oct 28, 2005

A Patch for SpamLookup 2.0 bundled with MT 3.2

I'm hooked on SpamLookup 2.0, of course, but its Keyword Filter doesn't recognize regular expressions that contain multibyte strings or the Unicode properties/blocks/scripts supported by Perl 5.8. That makes it hard to train SpamLookup to reject comment/trackback spam in foreign languages, especially Asian languages.

The following is a patch for enabling Unicode support in SpamLookup 2.0.

SpamLookup2.0-encode.patch

If you are using a Linux box, it is easy to apply. Just download or copy the patch into your MT directory and run:

patch -p0 < SpamLookup2.0-encode.patch

Once you have applied this patch, you can write Keyword Filter rules as regular expressions that take advantage of Unicode support.

For example, to reject comments/trackbacks that contain Hiragana strings, use:

/\p{Hiragana}+/

Or, to accept only Latin-1 comments/trackbacks, reject anything that contains a character outside the Latin-1 range:

/[^\x00-\xff]/
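
Why decoding matters for rules like these can be sketched outside of MT. The snippet below is only an illustration in Python, not part of the patch and not how SpamLookup is implemented. Python's built-in re module has no \p{Hiragana}, so the Hiragana block range U+3040 to U+309F stands in for it, and the names HIRAGANA, NON_LATIN1, as_bytes and as_chars are made up for this example. It assumes the comment text arrives as UTF-8 bytes.

```python
import re

# Stand-in for Perl's \p{Hiragana}: the Unicode Hiragana block (U+3040..U+309F).
# (Assumption for this sketch; Python's re does not support \p{...}.)
HIRAGANA = re.compile(r"[\u3040-\u309f]+")

# Matches any single character outside Latin-1 (code point above U+00FF).
NON_LATIN1 = re.compile(r"[^\x00-\xff]")

raw = "\u3053\u3093\u306b\u3061\u306f".encode("utf-8")  # Hiragana text as UTF-8 bytes

as_bytes = raw.decode("latin-1")  # undecoded view: every byte becomes one character
as_chars = raw.decode("utf-8")    # decoded view: bytes become real Unicode characters

# Without decoding, neither rule can see the Japanese text.
print(bool(HIRAGANA.search(as_bytes)))    # False
print(bool(NON_LATIN1.search(as_bytes)))  # False

# After decoding, both rules fire.
print(bool(HIRAGANA.search(as_chars)))    # True
print(bool(NON_LATIN1.search(as_chars)))  # True

# Plain Latin-1 text passes the non-Latin-1 check.
print(bool(NON_LATIN1.search("caf\u00e9")))  # False
```

The undecoded view contains only code points up to U+00FF, which is why a Unicode-aware pattern has nothing to match until the text is decoded into characters.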

Copyright 2012 Ogawa::Buzz | Powered by Blogger