Commentary

How To Automatically Reduce Duplicate Web Pages

BloomReach this week launched an algorithmic de-duplication technology that detects and reduces duplicate pages on Web sites. It does so without manual intervention, reducing about 95% of pages with duplicate content, increasing the signal-to-noise ratio and ensuring that the most relevant pages get found.

The platform, BloomReach Dynamic Duplication Reduction (DDR), is part of the BloomReach Web Relevance Engine through the BloomSearch Big Data Marketing Application. Companies without the platform need to use rel=canonical tags pointing to the canonical page. Without them, brands have duplicate content across the Web.

Joelle Kaufman, head of marketing at BloomReach, said companies sometimes want specific URLs for different affiliates so they can track traffic. That's good for tracking, but not if all of those pages get indexed, because that wastes crawl and cloud quota from search engines.
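For sites handling this by hand, the manual rel=canonical approach can be sketched roughly as follows. This is a minimal illustration in Python, not BloomReach's method: the tracking parameter names in TRACKING_PARAMS, the example.com URL, and the helper names canonical_url and canonical_link_tag are all assumptions made for the example. It strips affiliate-style query parameters from a URL and emits the tag that points crawlers at the single canonical page.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters; a real site would maintain its own list.
TRACKING_PARAMS = {"affiliate", "aff_id", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url):
    """Return the URL with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    # Keep only query parameters that are not used purely for tracking.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page served at this URL."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


if __name__ == "__main__":
    # Two affiliate variants of the same page resolve to one canonical URL.
    print(canonical_link_tag("http://www.example.com/scarves?affiliate=abc&color=red"))
    print(canonical_link_tag("http://www.example.com/scarves?affiliate=xyz&color=red"))
```

Each affiliate URL keeps working for tracking, while both variants declare the same canonical page, so only one of them needs to be indexed.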

There's not enough space in the computing world to store and index every page, so everyone gets a quota based on relevant content, best practices, and more, Kaufman said. If the pages don't index in engines, they don't exist in natural search results.

Duplicated content on Web sites remains a common problem that contributes to lost revenue. "So, why would you want to waste any of that quota on duplicate pages?" she said. "We consolidate all links to point to the canonical page without requiring marketers to do anything. It doesn't mean the other pages don't exist. It just means the Web crawlers don't index them."

Mobile works similarly, but it becomes irrelevant faster because of the size of the screen and the tolerance of the user.

Kaufman said BloomReach DDR also allows client SEO resources to spend more time creating compelling pages, writing new content, ensuring that tags are descriptive and accurate and, when desired, adding rel=canonical tags.

1 comment about "How To Automatically Reduce Duplicate Web Pages".
  1. les madras from Campbell Commerce, October 9, 2012 at 12:41 a.m.

    This looks like a solution looking for a problem. But it did get me curious as to what BloomReach was all about and I found them to be another SEO outfit creating spam web pages to hijack users from search engines. Like this one for "Yellow white scarf"
    http://www.modcloth.com/th/yellow-white-scarf. What is a "yellow white scarf?" Evidently any color but yellow or white.

    Another technology that the world could do without!
