Twitter Combines Human Factor, New Tech To Refine Real-Time Search
Twitter has integrated human intelligence into its search pipeline to help evaluate search results, such as what a trending hashtag means and how specific topics relate to a searcher's intent.
The real-time computation system supports search to help identify keywords at the moment they begin to trend -- something missing in the world of real-time results. Making this possible are several technologies, such as Scalding, Storm and FlockDB.
Human intervention, coupled with these technologies, will help the company understand what people mean rather than only what they type. Machines will interpret intent through words and timing, and search marketers will work with artificial intelligence.
The process begins with a technology called Storm, which processes streams of new data in real time as they arrive, rather than collecting them into batches to be processed later.
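To make the stream-versus-batch distinction concrete, here is a minimal Python sketch. It does not use Storm or any Twitter code; the names StreamingCounter and batch_count are hypothetical and exist only for illustration. The streaming version has an up-to-date count after every single event, while the batch version produces counts only once the whole batch has been collected.

```python
from collections import Counter


class StreamingCounter:
    """Hypothetical illustration: keeps query counts current after every event."""

    def __init__(self):
        self.counts = Counter()

    def on_event(self, query):
        # Update the running total immediately and return the fresh count.
        self.counts[query] += 1
        return self.counts[query]


def batch_count(queries):
    """Hypothetical illustration: counts only after all queries are collected."""
    return Counter(queries)


if __name__ == "__main__":
    stream = StreamingCounter()
    incoming = ["#sometag", "weather", "#sometag"]
    for q in incoming:
        # The count is usable here, while events are still arriving.
        print(q, stream.on_event(q))
    # The batch result is available only after the loop above would have ended.
    print(batch_count(incoming))
```

Both approaches end with the same totals; the difference is when the totals become available, which is what matters for spotting a query as it starts to trend.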
Behind the scenes, a Storm topology detects when a query becomes sufficiently popular, then automatically sends it to a Thrift API, which dispatches the query to Amazon's Mechanical Turk service and polls the platform for a response. There, the queries are passed to human evaluators for analysis.
The resulting information is pushed to Twitter's back-end systems, so that the next time a user searches for that query, the company's machine-learning models can make use of the additional information.
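The flow described above (detect a popular query, hand it to human judges, poll for the answer, store the result) can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not Twitter's implementation: PopularityDetector, FakeJudgingService, annotate_popular_queries, the threshold and window values, and the "placeholder-label" answer are all hypothetical, and the judging service is an in-process stand-in rather than a call to Thrift or Mechanical Turk.

```python
from collections import defaultdict, deque


class PopularityDetector:
    """Hypothetical: flags a query once, when it is seen `threshold` times
    within a sliding window of `window_seconds`."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.timestamps = defaultdict(deque)
        self.flagged = set()

    def observe(self, query, now):
        times = self.timestamps[query]
        times.append(now)
        # Discard sightings that have fallen out of the window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold and query not in self.flagged:
            self.flagged.add(query)
            return True
        return False


class FakeJudgingService:
    """Hypothetical stand-in for a crowdsourcing service: a task's answer
    becomes available after a fixed number of polls."""

    def __init__(self, polls_until_ready=2):
        self.polls_until_ready = polls_until_ready
        self.tasks = {}
        self.next_id = 0

    def submit(self, query):
        task_id = self.next_id
        self.next_id += 1
        self.tasks[task_id] = {"query": query, "polls": 0}
        return task_id

    def poll(self, task_id):
        task = self.tasks[task_id]
        task["polls"] += 1
        if task["polls"] < self.polls_until_ready:
            return None  # Judges have not answered yet.
        return {"query": task["query"], "category": "placeholder-label"}


def annotate_popular_queries(events, detector, service, annotations, max_polls=10):
    """Run (timestamp, query) events through detect -> dispatch -> poll -> store."""
    for timestamp, query in events:
        if not detector.observe(query, timestamp):
            continue
        task_id = service.submit(query)
        for _ in range(max_polls):
            result = service.poll(task_id)
            if result is not None:
                # Stored so that later searches for this query can use it.
                annotations[query] = result["category"]
                break
    return annotations


if __name__ == "__main__":
    events = [(0, "#sometag"), (1, "#sometag"), (2, "weather"), (3, "#sometag")]
    store = annotate_popular_queries(
        events, PopularityDetector(threshold=3, window_seconds=60),
        FakeJudgingService(), {})
    print(store)
```

A production system would poll asynchronously rather than block on each task as this loop does; the sketch keeps everything sequential so the four stages stay easy to follow.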
Twitter engineers Edwin Chen and Alpa Jain say the company relies on crowdsourcing through what they describe as a "small custom pool of Mechanical Turk judges to ensure high quality."
Twitter has considered other possibilities for crowdsourcing, such as using in-house judges, applying the standard worker filters that Amazon provides, or going through an outside company like CrowdFlower. The two explain how Mechanical Turk judges work best.