Commentary

Real-Time Search Fills Internet With Garbage

There's no such thing as real-time search. Without the context behind the status updates on Twitter and Facebook, the characters and words strung together in semi-quasi-sentences reflect a bunch of data points -- or garbage in an endless chain of gibberish.

Now that Google, Microsoft, Yahoo and others have figured out a way to make the Internet come alive as events happen, the next challenge will become filtering the garbage that real-time search creates.

The most interesting part of real-time search will materialize when engines develop a method to track behavior and serve up ads based on the intent people expose in real-time search queries. Being able to link ads to that moment in time will give brands incredible insight into consumers and another measurement to add to the marketing funnel.

In a chat room poll asking how search engines will filter the garbage coming from Twitter tweets and Facebook status updates in real-time search, one SEO professional who answers to the screen name "Contempt" tells me "there will always be a way around it, always a way to abuse it, always a way to get in it. And, if it makes money, people will find a way to abuse it." Another SEO expert went as far as calling real-time search "bullshit."

Don't get me wrong. Catching a glimpse of important news trending on Twitter about the release of Apple's tablet or an earthquake in Haiti can provide the first look into breaking news, but people who search the Internet for information want to trust the data they sift through, especially the mounds of tweets sent through Google Alerts.

David Dague, Localeze vice president of marketing, tells me search engines have identified, but not solved, the problem of filtering the garbage to find the "value" from Twitter tweets and Facebook posts. "Trying to verify the source of the information is the first challenge," he says.

The debate rages on about the value of real-time search. Some argue that it adds character and human perspective to a topic, while others believe it fills the Internet with false facts and garbage.

Mark Drummond, chief executive officer at Wowd, a real-time search engine whose company mantra is "discovering what's popular on the Web right now," agrees the issue has two sides.

"First, what are you searching through when you issue a search query: tweets, blog posts, the entire Web? If you're only searching through the space of tweets, then you're pretty well guaranteed to get a bunch of phatic social commentary that just happens to match your search query," he says. "If instead you're searching through the entire Web, and all the pages that the global Web contains, then you've got a much better chance of finding some quality content."

The second issue related to real-time search "garbage" is ranking, Drummond explains. Even in real-time search, it's important to actually rank content according to quality and relevance to the query. And while time is an important factor of such a ranking calculation, it's not the only factor.

That's fine, but as one panelist during Tuesday's OMMA Social panel "Debate: Is Real-Time Search for Real? Or Just Another Tweet in the Pan?" put it, only Google knows where to find the end of the Internet.

Moderating the panel, Uwe Hook, Direct Partners senior vice president, told attendees that more people died on Twitter in 2009 than actually did in real life, demonstrating the point that one shouldn't trust all the information in tweets.

I agree with Gregory Keller, vice president of product management at Lijit, which focuses on social graphs, who says Twitter is a black hole that consumes time. The OMMA Social panelist told attendees that the microblogging site doesn't generate information, but rather data that half the time amounts to a nonsensical mess cluttering the Internet.

"There's no such thing as real-time search," Keller says, calling real-time search engines "content generators" that produce data; only when a human can synthesize a post does that data become useful.

Twitter's 140-character messages provide a window into the information, but can't possibly share it all. The site has somewhere between 25 million and 30 million messages streaming daily, according to Gerry Campbell, chief executive officer at real-time search engine Collecta, citing a variety of sources. "Marissa Mayer of Google says there's somewhere around a billion-plus new pieces of real-time information being published every day," he adds. "It's important to keep those things in perspective."

Campbell believes Twitter provides society with an incredible tool to access instant information, but you can't always express everything in 140 characters. The industry will need filters to sift through the garbage generated from real-time search, something that Ben Carlson, president at Fizziology, knows all too well.

Watching real-time search chatter works for Carlson. His company, which monitors social media metrics for the entertainment industry, uses real-time search to monitor buzz on movies. "Just looking at the real-time feeds of everything being tweeted about is kind of meaningless," he says. "You have to just look for opinions, and how those opinions get shaped and change over time."

4 comments about "Real-Time Search Fills Internet With Garbage".
  1. Nelson Yuen from Stereotypical Mid Sized Services Corp., January 27, 2010 at 4:51 p.m.

    I think users in the early stages of "real-time" search will distinguish for themselves whether they want social media information, real-time information, or the web.

    With that in mind... I'll jokingly pose a scenario.

    The key to making query results relevant to the end user is to know the intent beforehand. What better way to do it than to ask?

    I think in the early stages of "real-time," the engine shouldn't be very pro-active. Instead, a very soothing "Lion King" voice should ask me what my intent was.

    If an earthquake happens in California today, and I make a search query for "CALIFORNIA EARTHQUAKE," maybe the search engine should come up with 2 sets of data and ask me:

    Are you looking for an earthquake that happened today?

    Are you looking for earthquakes that have happened in California?

    None of the above.

    Then filter real time information according to my answer.

    (Just a joke.)

  2. Abhijit Sahay from TipTop Technologies, January 27, 2010 at 8:20 p.m.

    I agree with Carlson: opinions of real people are what make real-time data so useful. Keller should see through the engines that he (rightly) calls "content generators" to the real generators of content -- the people. Those who wonder why one should care about the opinions of people we don't know should ask themselves if they never went to a movie just because others said it was good. Tools to deal with opinions are coming up (see www.feeltiptop.com); it is only a matter of time before people see the value that is literally streaming by them.

  3. David Shor from Prove, January 28, 2010 at 12:15 a.m.

    Hint to us digerati: It's not about a one-size fits all filter like Google. It's about us having our own filter set that lets in only those things we are open to.

    The '90s "agent" software had it right.

    The '10s software will reintroduce the personalized agent but also have a section where "discovery" will take place.

    Who's building one?

  4. Jerry Foster from Energraphics, January 29, 2010 at 4:49 a.m.

    It depends. If there is a trending topic you feel is interesting, you place a tab in your browser opened to search.twitter.com to monitor it. You can beat Google to the latest that way. I don't use blog search engines anymore because I assume all interesting new blog postings will be tweeted (false assumption probably, but bloggers need to get with the program and micro-blog announcements of new blog posts).
