What is unstructured data in today's world? It is images/objects, text and other data types that are not part of the typical database. An email is considered unstructured data: although the message itself is stored in a database system (Lotus, MS Exchange), the body of the message is free-form, without any structure to it. A raw document is another piece of unstructured data. Translated to components that mean something to marketers, sources of unstructured data include video, audio, blogs, Twitter, PowerPoint presentations, and more.
The unstructured vs. structured conversation within the context of marketing has two sides:
1. Unstructured data is a consumer challenge that businesses must solve. For years, we have searched through unstructured information to retrieve results. Recall the high school paper you wrote 20 years ago and how you searched for library books by typing in topical phrases. This has become second nature today through search engines. The technology has advanced dramatically, so that you can retrieve all sorts of content (from PowerPoints, video, audio, articles, etc.).
Yet many studies indicate we only use simple methods to leverage search. The average search term is 1.5-2.5 words, and only 10% of searches leverage Boolean operators ("and", "or", "not"). Sadly, according to research by the U.S. National Institute of Standards and Technology, when 2.5 search words are used, only 23-30% of documents returned are deemed relevant to the query. The challenge is that, while data is increasing significantly, search is typically considered an independent request rather than a contextual request. Younger consumers are much more sophisticated in efficient search, and will pressure businesses to keep pace.
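For readers who want to see why Boolean operators matter, here is a minimal sketch in Python. It is an illustration only, not how any real search engine works: the three "documents", their names, and the helper functions `search_any` and `search_boolean` are all made up for this example.

```python
# Hypothetical mini-corpus; names and text are invented for illustration.
DOCS = {
    "deck": "quarterly marketing plan for video campaigns",
    "memo": "video production budget and audio licensing",
    "post": "marketing blog post about email newsletters",
}


def tokens(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def search_any(words):
    """Plain keyword search: match documents containing ANY of the words."""
    wanted = set(words)
    return sorted(name for name, text in DOCS.items() if tokens(text) & wanted)


def search_boolean(all_of=(), none_of=()):
    """Boolean search: every all_of word must appear (AND),
    and no none_of word may appear (NOT)."""
    results = []
    for name, text in DOCS.items():
        found = tokens(text)
        if set(all_of) <= found and not (set(none_of) & found):
            results.append(name)
    return sorted(results)


broad = search_any(["marketing", "video"])                          # two loose keywords
narrow = search_boolean(all_of=["marketing", "video"])              # marketing AND video
excluded = search_boolean(all_of=["marketing"], none_of=["video"])  # marketing NOT video

print(broad, narrow, excluded)
```

In this toy setup the two loose keywords match all three documents, while the "and" query returns only the one document that mentions both words. That is the kind of narrowing most searchers never ask for.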
You might be thinking: this doesn't mean much to me, I don't run search. Yet the symptoms have an effect on overall marketing. Consumers are talking, expressing and generating content in new forms at a much more rapid rate than in the past. The core platforms we have in place to capture these moments of truth and derive context from this information are becoming amazingly complex to make decisions around. Businesses will be challenged to choose where in the enterprise to solve the problem in order to get the greatest return on the effort. They will be challenged to understand which unstructured sources are valuable and which are just expenses.
2. Unstructured data will redefine research, product development, and many aspects of direct marketing. Unstructured data won't solve anything for the entire marketing ecosystem anytime soon. The shear mass of data, decay rate of data, and applied context are critical to understanding what part of the data to use. One such area that will thrive is product research. All observation-based research has bias, based on the shear nature of how research is administered. We now have focus groups that are unfiltered, unbiased, and that can't be influenced by structured field research methodology.
It took 10 years for online surveys to "arrive" as a viable method of field research. I believe the future of unstructured data for marketers will drive up through research and through programs designed to aggregate comments, thoughts, opinions and expressions in a very contextual way -- and will accelerate much faster than the adoption of online surveying.
Today, a lot of data (from documents to community forums) lives inside a company, and a great deal lives in the public and private domains of the social ecosystem. I believe the brands that will emerge will harness the "private" domain concept and build their own research, customer service, and infrastructure that will leverage very context-oriented values for the consumer. They will have better search capabilities, more contextual interactions across the enterprise, and better migration from product to product through well-structured and timed product releases. And they will tap into the billions of dollars companies spend in researching new product innovations -- with a more efficient methodology for the go-to-market process. Companies will need to make sound decisions around data integration and what areas of unstructured data should be their core competencies.
We live in a linear world today. The dimensional enterprise had better get prepared for this soon, because it will be how we operate in the future. If you are only thinking about unstructured data in terms of a Tweet and how you might leverage that for response-based or direct marketing efforts, take a step back and think about how you will engage and sell in the future, and how content and context will drive that. Think video and documents, not "comments."
Interesting post. I often get asked about what the next thing will be in social media and social media ROI. This is definitely one of the upcoming trends.
Text analysis is only scratching the surface. There is much more to come.
Beware the homophone. You use "shear" twice, when I'm sure you really mean "sheer".