Harvest is a system for collecting information and making it searchable through a Web interface. It can collect information over HTTP, FTP, and NNTP, and from local files. Supported formats include HTML, DVI, PS, full text, mail, man pages, news, troff, WordPerfect, C sources, and many more. Harvest's modular design makes it easy to add support for new formats.
| Tags | Text Processing Indexing Internet Web Indexing/Search Z39.50 |
|---|---|
| Operating Systems | Unix |
| Implementation | C Perl |
Recent releases


Changes: This release features search time display in zquery.pl and updated components. It is a further step toward replacing the default full text engine with Index Data's Zebra.


Changes: This release improves the integration of Zebra as a full text engine and simplifies broker administration and the creation of new gatherers.


Changes: This release features a Russian translation of the user's manual, bug fixes in the PowerPoint summarizer, an improved SOIF to XML filter, a fix for a crash bug in Essence, and an improved user interface. The broker can now hold duplicate data under different URIs.


Changes: This release adds Dutch, French, Italian, and Swedish user interfaces and support for PowerPoint presentations. It also features portability improvements for FreeBSD and fixes compilation issues with GCC 3.3.1.


Changes: This release adds a Dutch user interface and addresses GCC 3.3.1 compilation problems.