
Why rel="nofollow"? I want rel="spammer"!

Google and a bunch of the blog vendors have introduced a way of anesthetizing URLs in blog comments so that they don't add PageRank. Just put rel="nofollow" in your link, and it won't count. (See, for instance, Google leads the industry in fighting comment spam and Support for NoFollow.)
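As a rough illustration of what "just put rel="nofollow" in your link" means for blog software, here is a minimal Python sketch. The function name render_comment_link is hypothetical and not taken from any blog package; it only shows the attribute being attached when a commenter's link is rendered.

```python
from html import escape

def render_comment_link(url: str, text: str, rel: str = "nofollow") -> str:
    """Render a commenter-supplied link as an anchor tag carrying a rel value.

    Hypothetical helper for illustration; real blog software does this in
    its own templating layer.
    """
    # Escape everything that came from the commenter so it cannot break the markup.
    return (
        f'<a href="{escape(url, quote=True)}" '
        f'rel="{escape(rel, quote=True)}">{escape(text)}</a>'
    )

print(render_comment_link("http://example.com/", "my site"))
# <a href="http://example.com/" rel="nofollow">my site</a>
```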

The idea is that it will discourage comment spammers, because for any software they spam that supports the scheme, the spammer won't get any PageRank benefit. However, as Sunir Shah points out, and Mark Pilgrim predicted for a similar scheme, the cost of spamming blogs is so low that spammers will still just spray and pray, with a decreasing but still non-zero return. This happens now, for instance, when my middle-aged MT 2.661 blog gets comment-spammed even though all the outgoing comment links go through a redirect script which kills any PageRank value for the link.

No, what I want is rel="spammer". I want to be able to make a long blacklist with links that *subtract* PageRank for anybody stupid enough to spam me.

Say each link is a vote, in an election for which sites are best. rel="nofollow" is like abstaining; rel="spammer" is a big fat no vote. Which do you think would be more powerful?
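To make the election analogy concrete, here is a toy tally in Python. It is only a sketch of the voting idea: real PageRank is computed iteratively over the whole link graph, and the weights below (+1, 0, -1) and the example hostnames are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical vote weights: an ordinary link is a yes vote,
# rel="nofollow" abstains, rel="spammer" is a no vote.
VOTE_WEIGHT = {"follow": 1, "nofollow": 0, "spammer": -1}

def tally(links):
    """Sum the votes each target receives from a list of (target, rel) links."""
    scores = defaultdict(int)
    for target, rel in links:
        scores[target] += VOTE_WEIGHT[rel]
    return dict(scores)

links = [
    ("good.example", "follow"),
    ("good.example", "follow"),
    ("pills.example", "nofollow"),  # abstention: no effect on the score
    ("pills.example", "spammer"),   # no vote: pushes the score down
    ("pills.example", "spammer"),
]
print(tally(links))
# {'good.example': 2, 'pills.example': -2}
```

With only rel="nofollow" available, pills.example would sit at zero; the no votes are what push it below sites nobody has linked to at all.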

Sure, people would try to abuse the heck out of rel="spammer". Microsoft zealots (say) might link to Apple.com with rel="spammer". Apple enthusiasts could do the same back to Microsoft.com.

So what? I say that would be noise that Google (or some even smarter search engine) could filter out with fancier link- and page-ranking algorithms. There'd be a heck of a lot of useful signal, too, and I think Google (or whoever) could distinguish between smart and dumb blacklisters.

rel="spammer" would work proactively to squelch comment spam and -- what I really care about -- wiki spam.

TrackBack

Listed below are links to weblogs that reference Why rel="nofollow"? I want rel="spammer"!:

» a better google band-aid? from WatermelonPunch, the Blog - Sideblog
Peter Kaminski: Why rel="nofollow"? I want rel="spammer"! [Read More]


Comments

People abusing that system are no problem; there's a limit to zealotry. Spammers abusing that system, however, are a problem: They would simply program their bots to poison the whole Web with zillions of rel="spammer" links to perfectly legitimate sites.

Thanks for the comment, Matthias.

A smart ranking algorithm wouldn't have problems with rel="spammer" poisoning. It would have a reasonably large whitelist of legitimate sites; if a few over-zealous rel="spammer" links to those sites occur on a page, it could decide the whole page is suspect, and discard ranking information for all the links on the page, and perhaps even related pages or all pages from that domain.
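Here is a minimal Python sketch of that whitelist check, under stated assumptions: the vote weights, the whitelist contents, the hostnames, and the max_bad_votes threshold are all hypothetical, and this is not a description of how Google or any real search engine ranks pages. It only shows the "discard the whole page" rule.

```python
# Hypothetical vote weights; not any search engine's actual formula.
VOTE_WEIGHT = {"follow": 1, "nofollow": 0, "spammer": -1}

def score_with_whitelist(pages, whitelist, max_bad_votes=0):
    """Tally link votes, ignoring every link on a page that is judged suspect.

    pages maps a page URL to its list of (target, rel) links. A page is
    suspect when more than max_bad_votes of its rel="spammer" links point
    at whitelisted sites.
    """
    scores = {}
    for page, links in pages.items():
        bad_votes = sum(
            1 for target, rel in links
            if rel == "spammer" and target in whitelist
        )
        if bad_votes > max_bad_votes:
            continue  # suspect page: discard all of its ranking information
        for target, rel in links:
            scores[target] = scores.get(target, 0) + VOTE_WEIGHT[rel]
    return scores

whitelist = {"apple.example", "microsoft.example"}
pages = {
    "blog.example/post": [("pills.example", "spammer"), ("apple.example", "follow")],
    "farm.example/1": [("apple.example", "spammer"), ("pills.example", "follow")],
}
print(score_with_whitelist(pages, whitelist))
# {'pills.example': -1, 'apple.example': 1}
```

The link farm's page is dropped entirely, so both its attack on the whitelisted site and its boost for the spammer's site are ignored. Extending the penalty to related pages or the whole domain would need page-to-domain grouping that this sketch leaves out.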

So the issue for the blog or wiki maintainers becomes retaining control of all the rel="spammer" links on their pages -- but they should have that anyway. The maintainer would have to be careful when the rel="spammer" tag is applied -- it couldn't be for all links in all comments, but only applied to spam links after automated or even manual spam detection.

One problem I see is that you would have to determine spam at the link level, unlike most plugins which do so at the post level. Why? Because if I'm a spammer I'd just add a link to the site I was spamming as well as my own site. If you flag the post and all its links as spam, you will also hurt your own site.

You could have a filter that doesn't apply it to your own site. That is easy enough, but you would also have to check a massive online whitelist. If I were a spammer, I would deter Google from implementing the standard by putting links to legitimate sites in my post as well.

Obviously a site admin could distinguish the links judiciously, but a site admin who sees a spam-filled post is going to want to delete it.

Brad, yes. You do need to handle spam at the link level. Your spam tool could aggregate a bunch of links for you to look at and then you'd have to select the ones that get the rel="spammer" tag.

But I don't think that's unreasonable -- that's the way MT-Blacklist adds URLs from a spam message to your blacklist, and similar to what I've done on my wiki, where a couple of spammers have tried to keep themselves out of the blacklist by mixing their links with a bunch of legitimate ones.
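A link-level decision like the one discussed above might look like the following Python sketch. The function rel_for_link, the blacklist contents, and the hostnames are hypothetical and are not how MT-Blacklist works internally; defaulting every other comment link to rel="nofollow" is also just an assumption.

```python
from urllib.parse import urlparse

def rel_for_link(url, blacklist, own_host):
    """Pick a rel value for one comment link, decided per link rather than per post.

    Hypothetical policy: links to your own site get no rel value, links to
    hosts the maintainer has already blacklisted get "spammer", and
    everything else gets "nofollow".
    """
    host = urlparse(url).hostname or ""
    if host == own_host:
        return None  # never penalize your own site
    if host in blacklist:
        return "spammer"
    return "nofollow"

# A spam comment that mixes the spammer's link with legitimate ones.
comment_links = [
    "http://myblog.example/about",
    "http://pills.example/buy",
    "http://news.example/story",
]
blacklist = {"pills.example"}  # filled in after automated or manual review
print([rel_for_link(u, blacklist, "myblog.example") for u in comment_links])
# [None, 'spammer', 'nofollow']
```

Only the link the maintainer has confirmed as spam gets the negative vote, so padding a comment with legitimate links doesn't drag those sites down.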

Score! Just put this on my site, and hope that with widespread use it will become an effective tool. This is the good fight, though: maintaining the free flow of information on the internet by preventing spam from making it too annoying to be useful.

Hmmm... This is NOT the fix I would've hoped for from Google. (But as I'm sure you know, Mr. Kaminski, I'm generally expecting to be disappointed by Google, on any given subject at any given time. hehe.)

I think this is a case of throwing the baby out with the bathwater.

Your rel=spammer thing is kind of along the lines of what I would've hoped for. (Though damn, you beat me to coming up with a funnier take on it. haha.)

If Google wants so badly to cooperate with blogging system programmer type people... (not sure if that sounds right)... Why couldn't Google just tap into the blacklist lists, and work with those to lower the Google ranking of spammer sites?

I mean, Matthias is right, that kind of thing is exploitable.
Kind of like this blog I visit (Cider Press Hill), their automatic blacklist that came with the blogging software had blacklisted a blogger (Adam Kalsey) who blogs a lot about anti-spam issues. I immediately realized this was an exploitation - that some spammers didn't like Kalsey's negative press on them...
But I agree that Google, of all web entities, should be able to come up with a work-around for that kind of thing!

I'm with Matthias on this one - I don't want comment spammers coming to my site to raise their own PageRank and I don't want them coming to my site to lower their competition's PageRank. Both disrupt the comments on my site.

Also, what happens if someone comment spams my site without me knowing? Right now I get increased PageRank. With rel="nofollow" I get no change in my PageRank. With rel="spammer" I get labeled a spammer and my PageRank goes down.

Do you want Google having to decide what is a legitimate site or not? Comment spammers discuss this stuff a lot, they'd figure out pretty quickly which sites are on the list and avoid linking to them.

"Do you want Google having to decide what is a legitimate site or not?"

I think they already do.

And I think abandoned blogs ought to have their page rank nerfed.

I guess I'm with Kalsey on this one:
http://kalsey.com/2004/07/new_comment_spam_technique/
"In the war on spam if you are not for us; if you choose to look the other way and allow spammers to use your site; if you feel that keeping your site free from spam is too much trouble — you are against us."

hehe.

After all, how totally fantastic could information be on an abandoned web site anyway?

(I'm saying this as more of a search engine user than someone who cares about my own site's ranking... which is ridiculously high actually.)

rel = spammer? Let me make a link farm linking to this site using that tag, and you can tell me if you think it's a good idea.

Hi anon@anon.com (if that's even your real email address... :-)!

As I wrote in my post, I think it would be reasonably straightforward for a service with a good purview of the web (such as Google) to distinguish between rel=spammer linkfarms that identify real spammers, and bogus rel=spammer linkfarms that point to legitimate sites.

If the service does it right, rel=spammer abusers should end up with negative scores. By making a link farm linking to this site using the rel=spammer tag, you might even end up boosting my PageRank by trying to abuse the tag.

So, thanks for the feedback -- but I still think rel=spammer is a good idea.

Well I think 'smartass' is verging on making a good point there. Google would have to tread very carefully implementing any mechanism which *lowers* pagerank, whether it's rel=spammer links, using external spammer blacklists, or setting honey-pot traps, because the bad guys will not only abuse this in order to make the scheme a failure, they will abuse this to their own benefit, by lowering the pagerank of all their competition.

There's really no easy solution for Google to implement, because if they implement anything with teeth (a noticeable drop in rank for spammers), then the spammers will immediately swap tactics and begin fake spam attacks linking to their competition. It would be easy, even now, to get your competitors listed as spammers, but currently there's little motivation to do so.
