Marketers may not think of IBM as driving the next vision for search services, including advertising and marketing, but the cognitive technology behind the next generation of search engines sits in Big Blue's grasp.
IBM has released several applications that will help move search services into the next phase, as voice search replaces typing keywords into a search box and a need expressed at a specific moment replaces intent. Humans will express a need, and devices ranging from cars to household appliances like refrigerators will respond in a more natural way, according to IBM VP of Watson Core Technology Jerome Pesenti. Many of these platforms will allow consumers to interact with Internet-connected devices.
As people ask new questions, systems, including search engines, will identify the question as new, analyze the language, and learn from previous responses, Pesenti said. He referred to Watson's debut on Jeopardy! after raising the question of whether computers will become smarter than humans. Some facial and image recognition systems today have an error rate similar to that of humans, which was not possible five to 10 years ago. The error rate for computers is 8%, recently down from 12%, compared with a human error rate of 5%.
Computers are good at narrow tasks, such as natural language, speech and visual recognition, and questions and answers in search, but the technology doesn't match the general intelligence of humans. In 10 to 20 years, computers will assist in specific tasks rather than replace humans, and humans will more readily have the tools to develop many applications for cognitive tasks such as self-driving, diagnostics, and speech.
IBM has released several application programming interfaces (APIs) into general availability -- IBM Watson Language Translation, IBM Speech to Text, and IBM Text to Speech -- to support developers as part of the Watson development platform's expansion efforts. More than 100 companies, such as USA Bank, use these APIs.
The APIs can translate news, patents, or conversational documents across several languages; produce transcripts from speech in multimedia files or conversational streams, capturing large amounts of information for a variety of business uses; and make Web, mobile, and Internet of Things applications speak with a consistent voice across all compatible platforms.