Starbeamrainbowlabs


Fighting Spam on your blog

Since I wrote my own blog script from scratch, I have had to learn a lot about how spambots spam my site in order to implement measures to stop them. This post is a compilation of all the methods that I have discovered so far.

I have yet to rate the effectiveness of each of these measures, since at the time of writing I have only just finished rewiring the commenting script so that I can measure how well each of the methods described below works.

Method 1: Honeypots

If you don't take either an email address or a web address on your blog, try adding an email or website field and hiding it via CSS. The more complex, indirect, and obscure the CSS you hide it with, the better. Just make sure that it is actually hidden.

This blog uses a hidden website field along with a warning for users who see it due to poor browser support.
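The server-side half of a honeypot could be sketched roughly like this. The function name honeypot_tripped and the default field name "website" are made up for illustration, and this is not necessarily how this blog's own script does it.

```php
<?php
// Hypothetical helper: returns true when the hidden honeypot field was filled in.
// The default field name "website" is an assumption for this sketch.
function honeypot_tripped(array $post, string $field = "website"): bool {
    // Real visitors never see the field, so it should be missing or empty.
    $value = $post[$field] ?? "";
    // Anything that is not a blank string (including an array) counts as filled in.
    return !is_string($value) || trim($value) !== "";
}

// Possible usage inside a comment handler:
// if (honeypot_tripped($_POST)) { http_response_code(400); exit("Comment rejected."); }
```

Treating a whitespace-only value as empty is a design choice here: a browser that autofills a stray space should not get a real commenter rejected.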

Method 2: No super long comments

This isn't really a proper method, but I found that spam comments on my blog were generally really long. So I am imposing a 2000 character limit on comments. If people have more to say, then they can reply to their own comment, and use a service like pastebin or hastebin for code.
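A length check like this is only a few lines of PHP. The sketch below uses the 2000 character limit mentioned above. The constant and function names are assumptions for illustration.

```php
<?php
// The 2000 character limit comes from the post; the names are hypothetical.
const MAX_COMMENT_LENGTH = 2000;

function comment_too_long(string $comment, int $limit = MAX_COMMENT_LENGTH): bool {
    // Prefer mb_strlen so multi-byte characters count once each;
    // fall back to strlen (which counts bytes) if mbstring is not installed.
    $length = function_exists("mb_strlen")
        ? mb_strlen($comment, "UTF-8")
        : strlen($comment);
    return $length > $limit;
}
```

A comment of exactly 2000 characters passes. Only longer ones are rejected.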

Method 3: Keys

This is the really important one. I was finding that while the above 2 methods were stopping some of the spam, I was still getting some smart spambots with chrome/firefox-like user agent strings that, I can only surmise, knew how to tell whether a form control was hidden or not by reading the CSS of my website.

The hidden key field is basically a timestamp of when the page was served to the user by the server. In its simplest form, it can just be the output of PHP's time() function.

In this blog, however, the timestamp is run through a number of different functions, such as base64_encode() and strrev(). Pick a few string manipulation functions that are reversible.
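As a rough sketch, a reversible pair of functions built from base64_encode() and strrev() might look like the following. The names encode_key and decode_key are hypothetical, and the exact chain of functions this blog uses is not shown here. Note that this is obfuscation rather than encryption: anyone who knows the steps can undo them.

```php
<?php
// Hypothetical helpers; the real blog's chain of functions may differ.

// Turns a Unix timestamp into an obfuscated string for the hidden form field.
function encode_key(int $timestamp): string {
    return strrev(base64_encode((string)$timestamp));
}

// Reverses encode_key. Returns the timestamp, or null if the key is malformed.
function decode_key(string $key): ?int {
    // Strict mode makes base64_decode return false on invalid characters.
    $decoded = base64_decode(strrev($key), true);
    // Only accept a short run of digits, so junk input never becomes a timestamp.
    if ($decoded === false || preg_match('/\A[0-9]{1,12}\z/', $decoded) !== 1) {
        return null;
    }
    return (int)$decoded;
}

// Possible usage when rendering the form:
// echo '<input type="hidden" name="key" value="' . htmlspecialchars(encode_key(time())) . '">';
```

Returning null for a malformed key lets the comment handler treat a tampered or missing key the same way as an expired one.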

This timestamp can then be analysed by the server when the comment is submitted. If the timestamp is too far in the past (say, more than 24 hours old), or under 10 seconds old, then the comment is rejected. Spambots will either fetch and cache your page for longer than 24 hours, or they will fetch your page and post a comment immediately. Since I set this blog to reject comments posted within 10 seconds of loading the page, I haven't had a single spam comment :)
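The age check itself could be sketched like this, using the 10 second and 24 hour thresholds from above. The constant and function names are assumptions, and the current time is passed in as a parameter purely to make the function easy to test.

```php
<?php
// Thresholds taken from the post; the names are hypothetical.
const MIN_AGE_SECONDS = 10;           // faster than this looks like a bot
const MAX_AGE_SECONDS = 24 * 60 * 60; // older than this looks like a cached page

// $served_at is the decoded timestamp from the hidden key field.
function key_age_acceptable(int $served_at, int $now): bool {
    $age = $now - $served_at;
    // A timestamp from the future gives a negative age and is rejected too.
    return $age >= MIN_AGE_SECONDS && $age <= MAX_AGE_SECONDS;
}

// Possible usage in a comment handler, once the key has been decoded:
// if (!key_age_acceptable($served_at, time())) { exit("Comment rejected."); }
```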

Summary

So there you go: 2 1/2 methods to banish spam on your blog - for now. The real secret here is to log as much information about your commenters as possible (in my case I have been capturing the contents of $_POST, $_GET, and $_SERVER) and then work your way through it, comparing the requests of legitimate commenters and spammers. The above are simply exploits of the differences I found (with some help from Google). If you can think of any more tricks, please post a comment below!
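For anyone wanting to try the logging approach, here is one possible sketch that writes each comment request as a line of JSON. The function name build_log_entry and the log path are made up for illustration, and this is not necessarily how this blog stores its logs.

```php
<?php
// Hypothetical helper: builds one JSON line describing a comment request.
function build_log_entry(array $post, array $get, array $server, int $now): string {
    $entry = [
        "time" => $now,
        "post" => $post,
        "get" => $get,
        "server" => $server,
    ];
    // JSON_PARTIAL_OUTPUT_ON_ERROR keeps the entry even if a value is not valid UTF-8.
    return json_encode($entry, JSON_PARTIAL_OUTPUT_ON_ERROR) . "\n";
}

// Possible usage (the log path is an assumption):
// file_put_contents("/path/to/comments.log",
//     build_log_entry($_POST, $_GET, $_SERVER, time()), FILE_APPEND | LOCK_EX);
```

One entry per line makes it easy to sort comments into spam and legitimate afterwards and compare the two sets. Keep the log file outside the web root, since it will contain whatever your commenters submitted.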


Art by Mythdael