Yahoo is forming a spinoff with venture firm Benchmark Capital, based on the Web portal's pioneering development work with the open-source Hadoop software, used for managing massive amounts of data.
The technology has gained ground with the expansion of cloud computing and has been adopted by major Internet and high-tech companies including AOL, Amazon, Facebook, Netflix and IBM.
Hadoop is intended to allow organizations to more efficiently and cost-effectively store, process and analyze the ever-growing volume of data being created and collected online each day. In partnership with Benchmark, Yahoo aims to commercialize its Hadoop technology in the form of a new company, called HortonWorks, that will provide support and services to Hadoop users.
"This investment will enable Apache Hadoop to meet the growing market demand and become the big data management and analysis platform of choice for the industry," the companies stated, without disclosing the amount of initial funding behind the project.
After funding early prototypes of Hadoop in 2005, Yahoo began using the technology in its data centers, and now applies it to the pricing and forecasting of its online ad system, which delivers billions of ads daily, per The New York Times. Hadoop runs on 42,000 Yahoo servers; every four days, the company's servers store the data equivalent of the entire Library of Congress.
HortonWorks promises to help companies connect thousands of servers to process and analyze data at supercomputing speed. However, tech blog GigaOM points out that the fledgling company will compete with Cloudera and EMC, which are already offering commercial Hadoop-based products and services. Cloudera itself has raised $36 million to date.
"HortonWorks will have to ensure it advances Hadoop development across industry lines and not just in a manner optimized for Yahoo's Web scale needs if it wants to gain adoption," noted GigaOM, which broke news of the Hadoop spinoff Monday before Yahoo's formal announcement today.
The new company will be headed by Eric Baldeschwieler, formerly vice president of software engineering for the Hadoop team at Yahoo, and will launch with a small group of key Hadoop contributors. Yahoo will be its first customer.
Hadoop creator Doug Cutting named the software after his son's toy elephant. Continuing that theme, HortonWorks takes its name from the Dr. Seuss children's story "Horton Hears a Who!" Hadoop started as an open-source project of the nonprofit Apache Software Foundation.