Open-Source Lucene Fits Site Search Alternative

The free open-source toolkit Apache Lucene has been making its way into more Web content management systems, according to recent research from CMS Watch. Published findings in "The Web CMS Report 2009" suggest that 40% of manufacturers supporting CMS now bundle the app into their platform.

Prompting the surge has been the need for more advanced plug-and-play features in Web content management platforms, and growing awareness that Google Search Appliance and other hosted search engines that spider Web pages have limitations that are solved through open-source apps, according to Tony Byrne, founder of CMS Watch, Silver Spring, Md. "A search engine that understands the backend of your content repository can yield a much richer search experience for users as long as it has been carefully integrated," he said.

By understanding the structure of the Web CMS repository, the search engine can access more relevant data and deliver better results more easily than a third-party platform. Direct access to metadata can support an advanced feature known as "faceted search" or "guided navigation," which allows searchers to drill down through results. And if the CMS platform integrates optional Lucene modules, the search engine can support the kind of advanced features that Google has trained users to expect, such as spell-checking and file filters.
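To make the idea of faceted search concrete, here is a minimal sketch in plain Python. It does not use Lucene or any CMS vendor's API; the sample documents, the metadata fields ("type", "section") and the helper names are all hypothetical, chosen only to show how metadata lets a searcher narrow results step by step.

```python
from collections import Counter

# Hypothetical search results carrying metadata, as a CMS repository might store them.
DOCS = [
    {"title": "Annual report", "type": "pdf", "section": "investors"},
    {"title": "Press kit", "type": "pdf", "section": "news"},
    {"title": "Product launch", "type": "html", "section": "news"},
    {"title": "Earnings call", "type": "html", "section": "investors"},
]


def facet_counts(docs, field):
    """Count how many documents fall under each value of a metadata field."""
    return Counter(doc[field] for doc in docs)


def drill_down(docs, field, value):
    """Keep only the documents whose metadata field matches the chosen facet value."""
    return [doc for doc in docs if doc[field] == value]


# A searcher picks the "news" facet, then sees the remaining results grouped by type.
news = drill_down(DOCS, "section", "news")
news_by_type = facet_counts(news, "type")
```

A crawler that sees only rendered pages would have to guess at fields like these, which is the gap the article says repository-aware search can close.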

The downside to using code from free open-source projects to build applications has historically been the lack of a development roadmap that keeps customers informed about when they can expect updates. Major open-source projects such as Lucene are more stable because many companies rely on them, Byrne said. Another issue that consumers must consider is that Lucene is a toolkit, rather than a finished search project, so customers will need to rely on support from the Web content management companies that embed site search into their platforms.

The findings suggest that most companies aren't concerned that Lucene is open source because they rely on support from third-party Web content management platform providers such as Vignette and Interwoven that help companies manage information posted to Web sites.

Free site search engines don't necessarily work best for all Web sites. But when Web content management companies take time to integrate the proper features and filters -- for example, those that index Adobe documents -- open-source add-ons such as Lucene can prove more cost-effective than site search tools from Autonomy, Google or Microsoft, Byrne said.
