Commentary
The Newest Front in the Online Wars: Splogs
by J. Scott Johnson, Tuesday, September 20, 2005, 6:00 AM

You would have to be a denizen of the blog world to know what a "splog" is today, but it won't be long before the word enters your vocabulary alongside spam, click fraud, and spyware.

A splog is a spam blog--that is, a fake blog created for the sole purpose of earning a high search engine "page rank," either to reap profits through ad clicks or to drive customers to an otherwise obscure e-commerce site. As with e-mail spam, it doesn't take a rocket scientist to create a splog: simple automated scripts or programs can build them by abusing blogging tools and services such as Movable Type, WordPress, or Google's Blogger.com and its Blogspot hosting.

To keep itself alive, a splog will crawl the Internet using directories, search engines, RSS feeds, etc., collecting information to give the appearance that a real person is adding content. In many cases, this involves automated "theft" of original and often copyrighted content from other authors, without their knowledge, permission, or even attribution.

Splogs come in many varieties, each with its own way of disguising itself as a real blog, but they commonly contain key search terms repeated dozens or even hundreds of times. One researcher tested a "dance teaching" spam blog and found the word "dance" 948 times on a single page of roughly 2,048 words. That means nearly half of the page (about 46 percent) was "dance." Splogs often send any human visitor to an entirely different site, either through clickable links or through the more annoying practice of automated redirects.
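
The keyword-repetition signal described above can be measured with a simple density check. The following is a minimal sketch, not any search company's actual method: the function names, the tokenizing rule, and the 25 percent threshold are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def keyword_density(text, top_n=1):
    """Return (word, count, share_of_page) for the most frequent words.

    Tokenizing on runs of letters, digits, and apostrophes is an
    assumption made for this sketch; a real system would likely also
    handle stop words and stemming.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return []
    total = len(words)
    counts = Counter(words)
    return [(word, count, count / total)
            for word, count in counts.most_common(top_n)]


def looks_stuffed(text, threshold=0.25):
    """Flag a page whose single most frequent word exceeds the threshold.

    The 0.25 default is a hypothetical cutoff, not a published figure.
    """
    top = keyword_density(text, top_n=1)
    return bool(top) and top[0][2] >= threshold
```

Applied to a page like the one in the example, with 948 occurrences of one word among 2,048 words, the density comes out to 948 / 2048, or about 0.46, well above the hypothetical cutoff.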

To give you an idea of the magnitude of the problem, in the United Kingdom there is a company with over 15,000 spam blogs at last count. There were well over 10,000 spam blogs on BlogSpot alone related to the Triple Crown horse races. Of course, each time a visitor clicks on a paid search term, the advertiser pays for it and the "splogger" gets a revenue share.

Sploggers try to defend their actions by saying it should not matter to the advertiser where users find ads, as long as they are clicked on. But most advertisers care a great deal about the environment in which their ads appear, and would be not only surprised by traffic from splogs but upset by most of it. It is the equivalent of having your ad sold into The New York Times, only to have it show up in some penny sheet in North Dakota.

From the reader's perspective, it is a serious issue as well, as most would prefer to read a story from the original source, not replicated out of context and certainly not with non sequitur search terms inserted randomly into its text. Even more troubling is the prospect of the "infoblog" overwhelming "individual" content to the point where the Blogosphere more closely resembles late-night TV than a forum for thoughtful discussion and exchange of information.

Blog search companies must maintain an aggressive stance on blog spam, and continue to hone their tools and techniques. Experience developing and deploying anti-spam tools for e-mail has made it clear that combating blog spam needs to be part of the search company's core, not an afterthought or add-on.

Building on years of experience with cataloging blog posts, Feedster has implemented and is continuing to refine an integrated, multi-layer approach for responding to spam quickly and accurately. A handful of the "tricks" of e-mail spam identification are applicable to the world of blogs, but the problem is compounded: sploggers don't have to use "e-mail blast" software, which leaves its unique "signature" on tens of millions of e-mails; they can use the "real" blogging software to make their blogs.

Beyond the obvious step of looking for "Viagra" and all its misspellings, all search engines should employ a sophisticated approach to detecting blogs that are reasonably certain to contain no original content or commentary, only material intended solely for consumption by search engine crawlers. Blacklists of known splog domains are a good start--for eliminating their content, as well as that of blogs that link to them to raise their mutual page rank. The way a blog is published can also tell you a lot about its source. For example, it wouldn't be surprising if every blog published using WordPress on the .info domain with Google AdSense turned out to be a splog. There is also the "human factor," both in terms of "I know spam when I see it" and in quickly responding to the inevitable misclassification of a legitimate blog as spam.
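
To make the layers above concrete, here is a rough sketch of how a blacklist, a link check, a publishing-signal check, and a human override might be combined into one score. It is an illustration under stated assumptions, not Feedster's implementation: the `Blog` record, the `splog_score` function, the weights, and every domain name shown are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Blog:
    """Hypothetical record of what a crawler might know about a blog."""
    url: str
    platform: str = ""
    ad_networks: frozenset = frozenset()
    outbound_links: tuple = ()


def host(url):
    """Lower-cased hostname of a URL, or an empty string if it has none."""
    return (urlparse(url).hostname or "").lower()


def splog_score(blog, blacklist, reviewed_ok=frozenset()):
    """Return a score from 0.0 (no signal) to 1.0 (known splog).

    The weights 0.5 and 0.4 are arbitrary values chosen for this sketch.
    """
    h = host(blog.url)
    # Human factor: a reviewer's decision overrides every automated signal.
    if h in reviewed_ok:
        return 0.0
    # Layer 1: the blog itself is on the blacklist.
    if h in blacklist:
        return 1.0
    score = 0.0
    # Layer 2: the blog links to blacklisted domains.
    if any(host(link) in blacklist for link in blog.outbound_links):
        score += 0.5
    # Layer 3: publishing signals (the .info + WordPress + AdSense example).
    if (h.endswith(".info")
            and blog.platform.lower() == "wordpress"
            and "adsense" in blog.ad_networks):
        score += 0.4
    return min(score, 1.0)
```

Checking the reviewer list first means a blog that was wrongly flagged can be cleared by a person immediately, without waiting for the blacklist or the heuristics to be corrected.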

Can the war on splogs be won? No--there will always be those that try to "game" influential search engines. As with e-mail spam, there is a balance to strike between a pristinely spam-free result that discards some valuable content and an approach that eliminates the bulk of the spam but makes sure that you don't miss anything important. In this rapidly changing world, Feedster is fighting to make sure that you can find the tiny gems of meaningful, timely content without having them swept away with the splogs.

J. Scott Johnson is co-founder and chief technology officer of Feedster, Inc.

©2009 MediaPost Communications. All rights reserved.