Managing Gigabytes for Java

MG4J is a highly customizable, high-performance, full-text Java search engine for large document collections. It provides state-of-the-art features (such as BM25/BM25F scoring) and new research algorithms.
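For readers unfamiliar with the BM25 scoring mentioned above, the following is a minimal, self-contained sketch of the classic Okapi BM25 formula. It does not use MG4J's API: the class name `Bm25Sketch`, its methods, the toy corpus, and the parameter values `k1 = 1.2` and `b = 0.75` (common defaults) are all assumptions made for illustration. The IDF shown is the smoothed, non-negative variant.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of Okapi BM25; not part of MG4J.
public class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (assumed default)
    static final double B = 0.75;  // document-length normalization (assumed default)

    // Smoothed, non-negative IDF: ln(1 + (N - df + 0.5) / (df + 0.5)).
    static double idf(int numDocs, int docFreq) {
        return Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Scores one tokenized document against a tokenized query.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(1.0);
        double total = 0.0;
        for (String term : query) {
            long tf = doc.stream().filter(term::equals).count();
            if (tf == 0) {
                continue; // a term absent from the document contributes nothing
            }
            int df = (int) corpus.stream().filter(d -> d.contains(term)).count();
            double lengthNorm = K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf(corpus.size(), df) * tf * (K1 + 1.0) / (tf + lengthNorm);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
            Arrays.asList("java", "search", "engine"),
            Arrays.asList("java", "index"),
            Arrays.asList("cooking", "recipes"));
        List<String> query = Arrays.asList("search");
        for (List<String> doc : corpus) {
            System.out.println(doc + " -> " + score(query, doc, corpus));
        }
    }
}
```

Only the first toy document contains the query term, so it is the only one with a non-zero score. A real engine would read term and document frequencies from an inverted index rather than rescanning the corpus for every query.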

Tags: Internet, Web, Indexing/Search, Text Processing, Indexing, Software Development, Libraries, Java Libraries
Licenses: LGPL
Operating Systems: OS Independent
Implementation: Java

Recent releases

  • 06 Jun 2009 23:36

    Changes: Major improvements were made to indexing. Lightweight compressed-collection construction was added. A skipping system with variable quanta was added. Memory mapping is used for large indices. Many bugs were fixed.

    • 29 Feb 2008 10:23

    Changes: All new stemmers from Snowball were generating empty strings, causing major indexing problems. This has been fixed.

    • 06 Jul 2007 10:35

    Changes: This release has a new, high-performance index format, several optimizations, indices with arbitrary payloads (dates, integers, etc.), faster minimal perfect hashing, new operators, and better algorithms.

    • 08 Jan 2007 10:32

    Changes: Index writing and query resolution are significantly faster. A few bugs were fixed.

    • 19 Sep 2006 09:23

    Changes: A few important bugs that appeared in 1.1, as well as a very old one, have been fixed.
