Web IR & IE



Freely available IR & IE software:
  • Lucene - The Apache Lucene project develops open-source search software: Lucene Java provides Java-based indexing and search technology; Nutch builds on Lucene Java to provide web search application software; and Lucene4c is a C-based search engine compatible with Lucene Java, built on the Apache Portable Runtime.
  • Google API - use Google's indexing & search services to build your own application. See also the related book Google Hacks by Tara Calishain & Rael Dornfest.
  • MG - the system described in the Managing Gigabytes book
  • Glimpse & Webglimpse - indexer, spider, and manager for web files to be searched
  • ht://Dig - a complete WWW indexing and searching system for a small domain or intranet
  • SWISH-E - indexing collections of Web pages or other files
  • WIRE - a Web Information Retrieval Environment (GPL, in C/C++)

Freely available text/web file collections: