Commentary

SAS Supercharges Sentiment Analysis

One of the most interesting things about social media, in my humble op-ed, is the way enterprising nerds have begun to apply quantitative analysis to the uber-complex world of human sentiment -- basically marrying math and emotions. The most recent example I came across is a new social media analytics tool from SAS. But first I must digress about why I find this stuff so fascinating.

Obviously people have always been interested in what other people think, and it's not just a human trait: social dynamics of one sort or another are evident in pretty much every animal species, and like all natural phenomena these dynamics are governed by laws which can be described mathematically. Humans are no different, although we do throw a big wild card into the game: language, which allows us to make a far wider range of statements -- observations, exclamations, abstract theorizing, counterfactual speculation, questions, insults, etc. Through writing, language also allows us to record our thoughts for the ages. This is especially true with the Internet, where even offhand comments on marginal matters can live forever.

I am always amazed when I think about the amount of verbiage and related content crawled by Google's "spiders" every day, and it's even more incredible when you realize that the vast body of text constituting the Internet is constantly growing and changing. On one hand, it's a vast archive of statements about just about every topic imaginable, which avoids some of the pitfalls of "analog" sentiment tracking tools like opinion polls and focus groups -- including the inevitable bias introduced by the mere presence of a questioner. On the other hand, the vast majority of these statements are frankly irrelevant to any specific marketing goal, and there's no "card catalog" of the kind you'd find in an actual curated archive. To make things even more difficult, relevant data is further obscured by things like fake blogs.

But the biggest barrier to understanding what people are saying, in my humble op-ed, is probably the mind-bending variety of ways that language is used. Not least, this includes mistakes like misspellings, which have given rise to a whole branch of search engine marketing, but also the correct use of ambiguous words. Take almost any word and you can adduce multiple meanings. For example, how about "word" itself? It can have the meaning I just used -- the basic unit of language -- but it could also be an ironic old-school exclamation of approval, as in "word!" In a Bible chat group it could be shorthand for the Word of God. It can refer to a promise -- "I give you my word" -- or two conflicting assertions about reality: "It's your word against his."

Given all this subtle variation and ambiguity, I'm continually amazed at how good social media analytics tools are at distinguishing the actual intent and meaning of words used by individuals online -- and more to the point, how quickly they continue to improve these capabilities. Which brings me to the new "data mining" tool unveiled by SAS today. SAS says the tool, Social Media Analytics, is based on a natural-language analysis engine created by Teragram, a company which SAS bought in 2008 and which specializes in "linguistic technologies" and artificial intelligence. I don't know much about Teragram, but from its location (Cambridge, MA) I'm guessing some sizeable brains from Harvard and/or MIT are involved.

According to a blog post at The New York Times, analysts who tried the new SAS analytics tool praised its superior ability to divine sentiments based on online statements. Specifically, they said the SAS tool comes closer to approximating a human reader's ability to discern whether a blog post or tweet, for example, is positive or negative. Where most sentiment analysis software is 70% accurate on this front, the NYT quotes one analyst who used the SAS analytics tool as saying it's about 92% accurate.
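For the curious, here is a minimal sketch of what an accuracy figure like this could mean in practice. It assumes "accuracy" is simply the share of posts where the tool's label matches a human reader's label; neither the NYT post nor SAS spells out how the 70% and 92% figures were measured, and the function name and sample labels below are invented for illustration.

```python
def accuracy(tool_labels, human_labels):
    """Share of posts where the tool's label agrees with the human reader's."""
    if len(tool_labels) != len(human_labels):
        raise ValueError("need exactly one tool label per human label")
    if not human_labels:
        raise ValueError("need at least one labeled post")
    matches = sum(t == h for t, h in zip(tool_labels, human_labels))
    return matches / len(human_labels)


# Hypothetical data: ten posts labeled by a human reader and by a tool.
human = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "pos", "neg", "pos"]
tool = ["pos", "neg", "pos", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]

print(f"{accuracy(tool, human):.0%}")  # the tool disagrees on one post in ten
```

On this reading, the gap between 70% and 92% is the difference between mislabeling roughly three posts in ten and fewer than one in ten.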

But that's just the beginning, because once marketers have a firm grasp on sentiment, they can begin zeroing in on the most influential individuals discussing their brands (both positively and negatively) in an attempt to exert some beneficial control over the online discourse. This is accomplished by tracking, for example, blog links and re-tweets -- something social media analysts could already do, but which should be more productive now that they're armed with more accurate data about the opinions being expressed through these channels.
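To make the idea concrete, here is a rough sketch of how links and re-tweets might be combined with sentiment labels to surface the loudest voices on each side. This is not how the SAS tool works, which the article does not describe; the record fields (`author`, `sentiment`, `retweets`, `inbound_links`) and the naive "reach" score that simply adds the two counts are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def rank_influencers(mentions, top_n=3):
    """Rank authors by total reach, separately for each sentiment label.

    Reach is a deliberately naive score (assumed here): re-tweets plus
    inbound blog links, summed over all of an author's mentions.
    """
    reach = defaultdict(Counter)
    for m in mentions:
        reach[m["sentiment"]][m["author"]] += m["retweets"] + m["inbound_links"]
    return {label: counts.most_common(top_n) for label, counts in reach.items()}


# Hypothetical brand mentions that have already been labeled for sentiment.
mentions = [
    {"author": "alice", "sentiment": "pos", "retweets": 40, "inbound_links": 3},
    {"author": "bob", "sentiment": "neg", "retweets": 120, "inbound_links": 9},
    {"author": "alice", "sentiment": "pos", "retweets": 15, "inbound_links": 0},
    {"author": "carol", "sentiment": "neg", "retweets": 2, "inbound_links": 1},
]

for label, ranked in rank_influencers(mentions).items():
    print(label, ranked)
```

The ranking is only as good as the sentiment labels feeding it, which is why more accurate labels should make this kind of tracking more productive.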

2 comments about "SAS Supercharges Sentiment Analysis".
  1. Joe Buhler from buhlerworks, April 12, 2010 at 8:33 p.m.

    The more tools like this that are developed, the better. They can only help make social web marketing more effective and useful. The often-asked question of ROI will also be easier to answer, which will make spending on these activities easier to get from the ever-stingy CFO crowd.

  2. Dave Linabury, April 20, 2010 at 12:47 p.m.

    Automated sentiment at 92%? Hard to believe, although it would depend greatly on the subject matter. Some words can be extremely difficult to parse, as the same word can have opposing meanings in different industries.

    Here are some examples of opposing words for sentiment analysis I use frequently in my presentations:

    1) TOUGH: If you make jeans, tough is a positive word. If you cook steaks, not such a good word.

    2) THIN: Great if you're talking about a smartphone. Terrible if you're talking about hotel walls.

    3) SH*T: If someone says "your product is sh*t", that's obviously quite negative. But change it to "your product is THE sh*t" and suddenly it's highly positive.

    You can see that even common words we use every day can be tricky for a machine. Context is helpful, so is knowing what industry you are discussing.
