Marchex Speech Analytics and IBM Watson were put through a series of tests to compare their capabilities directly. To keep the comparison fair, both systems were scored on the same two key metrics of speech technology performance: RAW word error rate and perceived word error rate.
The English language can be tricky. Words like "alright" and "all right," depending on the use, can trip up an automated analytics platform.
Speech Analytics aims to help marketers optimize media spend by giving them a visual representation of the purchase funnel, built from a transcription of the audio conversation between a consumer and a company's customer service representative. It analyzes a variety of interactions, such as lost opportunities, calls where the customer failed to complete the transaction, and high-intent calls, to determine which traffic sources are driving sales.
RAW word error rate counts the words that are inserted, deleted and substituted in a machine transcription, relative to a human reference, to gauge the transcription's overall accuracy. Perceived word error rate first normalizes words and spellings that are effectively the same but that the RAW measure would count as errors.
For instance, if a human transcription records a word as "alright" but the machine transcription shows "all right," the RAW measure counts that as an error while the perceived measure does not.
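The difference between the two measures can be illustrated with a short sketch. This is not Marchex's implementation: it assumes the conventional word error rate definition (substitutions, deletions and insertions divided by the number of words in the human reference), and the function names and the one-entry equivalence table are hypothetical, chosen only to mirror the "alright" example.

```python
import re

def word_edit_distance(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the reference into the hypothesis (Levenshtein)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def raw_word_error_rate(reference, hypothesis):
    """Edits divided by the number of words in the human reference."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference transcript is empty")
    return word_edit_distance(reference, hypothesis) / n

# Hypothetical equivalence table: variant spelling -> canonical spelling.
EQUIVALENTS = {"all right": "alright"}

def normalize(text):
    """Map spellings treated as equivalent onto one canonical form."""
    for variant, canonical in EQUIVALENTS.items():
        text = re.sub(r"\b" + re.escape(variant) + r"\b", canonical, text)
    return text

def perceived_word_error_rate(reference, hypothesis):
    """RAW word error rate computed after normalizing both transcripts."""
    return raw_word_error_rate(normalize(reference), normalize(hypothesis))

human = "alright I will call back"
machine = "all right I will call back"
print(raw_word_error_rate(human, machine))        # 0.4
print(perceived_word_error_rate(human, machine))  # 0.0
```

In this toy case the RAW measure charges two edits against a five-word reference (one substitution and one insertion), while the perceived measure sees two identical transcripts once "all right" is folded into "alright."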
Marchex achieved a RAW word error rate of 15.7%, compared with Watson's RAW word error rate of 21.1%. Similarly, Marchex demonstrated a perceived word error rate of 14.6%, compared with Watson at 19.4%.
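To put those figures side by side, the gap can be expressed both in percentage points and relative to Watson's rate. The sketch below uses only the four reported numbers; the helper name is hypothetical, and the relative figure is simple arithmetic on the published rates rather than a result reported by Marchex.

```python
def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline

# Reported word error rates, in percent.
reported = {
    "RAW": {"Marchex": 15.7, "Watson": 21.1},
    "perceived": {"Marchex": 14.6, "Watson": 19.4},
}

for metric, rates in reported.items():
    gap_points = round(rates["Watson"] - rates["Marchex"], 1)
    rel = relative_reduction(rates["Watson"], rates["Marchex"])
    print(f"{metric}: {gap_points} points lower, {rel:.1%} relative reduction")
```

On both metrics the gap works out to about five percentage points, or roughly a quarter fewer word errors than Watson.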
Jason Flaks, senior director of product engineering, Speech Analytics at Marchex, said about 7,000 previously unseen spoken utterances from real phone calls -- approximately five straight hours of random conversation -- were submitted to multiple speech recognition systems, including Marchex Speech Analytics and IBM Watson. The transcribed output from each system was then compared with a human-transcribed source, and the industry-standard word error rate metrics were calculated from the differences.
It didn't matter how many times Marchex performed the test, Flaks said. The test returned the same results every time with no error. "We have run the test dozens of times," he said, adding that the tests were intended to be random.