Google said Thursday that it has moved its automatic speech-recognition and closed-captioning technology out of beta, making it available to the entire YouTube community. The company first announced the auto-captioning technology in November, with the aim of making video clips more accessible to the hearing impaired and easier to find for anyone, or any search engine, looking for videos online.
Auto-captioning combines some of the speech-to-text algorithms found in Google's Voice Search to generate video captions automatically when a viewer requests them. A "request processing" button on uncaptioned videos lets any video owner ask for auto-captions sooner, since it takes time to process all of the available video.
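For readers who want to check whether a particular video already has auto-generated caption tracks, the sketch below shows one way to do it. It is an illustration only: it assumes the modern YouTube Data API v3, which did not exist when this article was written, and uses placeholder values for the video ID and OAuth token.

    # A minimal sketch, assuming the YouTube Data API v3 (which postdates this
    # article) and an OAuth token authorized for YouTube. It lists a video's
    # caption tracks and flags the auto-generated ("ASR") ones.
    import requests

    VIDEO_ID = "VIDEO_ID_HERE"          # placeholder
    ACCESS_TOKEN = "OAUTH_TOKEN_HERE"   # placeholder

    resp = requests.get(
        "https://www.googleapis.com/youtube/v3/captions",
        params={"part": "snippet", "videoId": VIDEO_ID},
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    )
    resp.raise_for_status()

    for track in resp.json().get("items", []):
        snippet = track["snippet"]
        # Auto-generated tracks are reported with trackKind "ASR"
        # (automatic speech recognition).
        auto_generated = snippet["trackKind"].lower() == "asr"
        print(f"language={snippet['language']} auto_generated={auto_generated}")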
Google plans to broaden the feature to more languages in the coming months. Still, automated captioning is not a perfect science, and the owner of the video needs to check that the captions are accurate, explains Hiroto Tokusei, a YouTube product manager.
Speech-recognition technology has been around for about 50 years, yet optimizing videos for search engines has always proved challenging. aimClear founder Marty Weintraub says Google has crawled captions and pulled the text into its organic SERPs (search engine results pages) in the past, and Chase Norlin, chief executive officer at AlphaBird, an online video syndication company, sees no reason for Google to stop now.
Norlin explains that the auto-captioning technology provides more
metadata for search engines to discover and index the clips. "I won't be surprised if they find a way to index ratings to provide even more metadata as people increasingly access these videos from
their TV," he says, telling me about his recent Apple TV purchase and how he watches YouTube videos on the big screen.
"The more metadata associated with the video, the easier it will become for
engines to grab the content and for people searching to find it," Norlin says.
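To make Norlin's point concrete, caption text can be surfaced to crawlers as structured metadata on the page that embeds the video. The snippet below is a hypothetical illustration, not something described in the article: it builds a schema.org VideoObject record whose transcript field carries caption-derived text, ready to be emitted as JSON-LD.

    # Hypothetical illustration: expose caption-derived text as structured
    # metadata (a schema.org VideoObject, emitted as JSON-LD) so search
    # engines have more than the raw video file to index.
    import json

    def video_metadata(title, description, transcript, embed_url):
        """Build a schema.org VideoObject dict; 'transcript' holds caption text."""
        return {
            "@context": "https://schema.org",
            "@type": "VideoObject",
            "name": title,
            "description": description,
            "embedUrl": embed_url,
            "transcript": transcript,
        }

    # Placeholder values for illustration only.
    jsonld = json.dumps(video_metadata(
        title="Auto-captioning demo",
        description="A short clip with automatically generated captions.",
        transcript="Hello, and welcome to this demonstration of automatic captions...",
        embed_url="https://www.youtube.com/embed/VIDEO_ID_HERE",
    ), indent=2)

    # 'jsonld' would be placed inside a <script type="application/ld+json"> tag.
    print(jsonld)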
While YouTube launched the service as a test last year, it wasn't the first to introduce this type of technology. In
May 2009, the Technology, Entertainment and Design Conference, better known as TED, introduced the TED Open Translation Project, which
brings information to the non-English-speaking world by offering subtitles, interactive transcripts and the ability for volunteers to translate any talk.
Working with the multilingual subtitling
service dotSUB, TED implemented a system to coordinate and automate translations of individual videos. Spanish is the language with the most translated talks to date.
An example of auto-captioning: [embedded video]