Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.
| Tags | Information Management, Internet, Web, Indexing/Search, Software Development, Libraries, Java Libraries, Text Processing, Indexing |
|---|---|
| Licenses | MPL |
| Implementation | Java |
Recent releases


Changes: This is a substantial update that adds support for Hadoop, primarily a Hadoop MapReduce indexing system that allows large collections of documents to be indexed in a highly distributed fashion. It also includes various minor improvements, such as better support for the IIT CDIP1 (TREC Legal track) collection, and various bug fixes. This is intended to be the final release in the 2.x series.


Changes: This is a minor update that contains some bug fixes and minor improvements. Support for indexing various test collections (CLEF and TREC Legal track) has been improved, and the settings of some applications, such as Desktop Search and Interactive Terrier, have been made more flexible. This release includes a filesystem abstraction layer, which allows various types of files to be accessed through a uniform API. For example, indexing an HTTP Web page is as easy as indexing a local document. A notable indexing bug affecting only the Windows platform was also resolved.


Changes: This is a major update that integrates a new, faster indexing architecture contributed by the University of A Coruña (Spain). Other enhancements include the new Divergence From Randomness weighting model (DFRee) from Fondazione Ugo Bordoni (Italy), which provides robust performance without the need for any parameter tuning.


Changes: Minor update. Mostly bugfixes. Some minor code enhancements, plus the inclusion of a test harness. Snowball stemmers were added to boost support for languages other than English.


Changes: This is a major update with improvements in indexing and retrieval functionality, including faster indexing and retrieval, and new retrieval models (including Divergence From Randomness and language modeling models). It supports much larger collections of documents, including the TREC GOV2 collection (25M documents), merging of indices, and multilingual and non-English collections of documents. The documentation has been vastly improved.