IBM Builds Search Engine, Redefines Analytics Tools

IBM has created a search tool that allows North Carolina State University to crawl through massive amounts of Web data on blogs, forums, reports, industry-related news portals and government Web sites. Much like a search engine bot, the tool gathers data and produces a short list of potential investors for projects.
NC State's Office of Technology Transfer manages more than 3,000 technologies invented by students, faculty and staff. The seven-member staff typically searches the Internet manually, looking for potential investors who can help bring technologies to market.
"The analytics and the language tools take a user-defined set of criteria and searches the Web," explains Billy Houghteling, director of NC State's Office of Technology Transfer. "For both pilots, we identified the sites and resources the tools needed to crawl. Both searched more than 1.4 million sites to find contacts. I can't fathom how long it would take for a member of my staff to do that type of exercise."
Historically, it would take between two and four months to identify a short list of potential investors. IBM's newly defined analytics "search engine" cuts that down to between 10 days and a couple of weeks. While the analytics tools validated the process, they also identified many new possible partners.
Developed in IBM Labs, the analytics technology used in the pilot (BigSheets and Content Analyzer in the IBM Cognos analytics suite) crawls the Web and mines large amounts of unstructured data. The analysis, based on factors such as business relevancy, government policies, market needs and trends, shortens a time-intensive and inefficient process.
While BigSheets, built on Hadoop technology, supports high-level ad hoc exploration of very large data sets, Content Analyzer provides sophisticated data analysis. Both tools offer a Web-based interface, but BigSheets provides a visualization feature highlighting the relationships between the data. Simply put, BigSheets keeps the data in its original format; Content Analyzer (ICA) creates a data index while it scans the information.
Those using BigSheets point the tool toward the Web sites or data sources they want to mine and allow the application to collect the information. They can then explore the data much as they would in a spreadsheet. Both tools crawl Web sites to find and collect data, but Content Analyzer indexes data and BigSheets parses it, storing bits and bytes in their original form.
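To make the distinction between keeping data in its original form and indexing it concrete, here is a minimal Python sketch. It is not how BigSheets or Content Analyzer are implemented; the page contents, URLs and function names (`store_raw`, `build_index`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical crawled pages: URL -> page text (illustrative data only).
PAGES = {
    "https://example.com/a": "Solar battery startup seeks investors",
    "https://example.com/b": "Battery research report",
}

def store_raw(pages):
    """Keep each page exactly as collected, for later ad hoc exploration."""
    return dict(pages)

def build_index(pages):
    """Build a simple inverted index: lowercase term -> set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index

raw = store_raw(PAGES)
index = build_index(PAGES)

# Indexed lookup: the term was mapped to its pages at crawl time.
print(sorted(index["battery"]))

# Raw exploration: scan the stored originals at query time.
print([url for url, text in raw.items() if "investors" in text.lower()])
```

The raw store preserves everything but requires a scan for each question, while the index answers term lookups directly at the cost of deciding up front how the text is broken into terms.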
Chris Spencer, emerging technology strategist at IBM, made it clear that both tools follow the search and index guidelines set out in each Web site's robots.txt directives, so the tools are "friendly" crawlers that follow the rules set by Web site owners.
NC State has full use of the tool as it evaluates it, Spencer says. At the request of NC State, IBM continues to work with the university to determine other uses beyond requirements from the Office of Technology Transfer.
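As a rough illustration of what honoring robots.txt means in practice, the sketch below uses Python's standard `urllib.robotparser` module to filter a list of candidate URLs before fetching them. It is not IBM's crawler code; the robots.txt contents, the `example-crawler` user agent and the helper names are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a site owner (illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def build_parser(robots_txt):
    """Parse robots.txt text into a rule set without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def allowed_urls(parser, user_agent, urls):
    """Return only the URLs the site's rules permit this user agent to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]

parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/reports/2010.html",
    "https://example.com/private/contacts.html",
]

# A "friendly" crawler would fetch only what survives this filter.
print(allowed_urls(parser, "example-crawler", candidates))
```

A real crawler would typically load each site's own robots.txt (for instance with `set_url` and `read`) and cache the parsed rules per host rather than embedding them as a string.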


This looks like a piece of technology that could be useful to those doing family history research, criminal searches or defense work: crawl the Web for names, looking for contextual information that relates to those names.