Xapian is a search engine library, scalable to collections containing hundreds of millions of documents. It's written in C++ with bindings for Perl, Python, PHP, Java, Tcl, C#, and Ruby. It is a highly adaptable toolkit that allows developers to easily add advanced indexing and search facilities to their own applications. It supports the Probabilistic Information Retrieval model and also a rich set of boolean query operators. Omega is a Web search application built upon the Xapian library. It can index a Web server's document tree (including HTML, PDF, OpenOffice, MS Word/Excel/PowerPoint/Works, WordPerfect, RTF, PS, etc.), or data exported from arbitrary sources (e.g. SQL databases).
| Tags | Information Management, Internet, Web, Indexing/Search, Software Development, Libraries, Perl Modules, php classes, Python Modules, Text Processing, Indexing |
|---|---|
| Licenses | GPL |
| Operating Systems | POSIX Windows Unix |
| Implementation | C++ Perl PHP Python Java Tcl Ruby C# |
Recent releases


Changes: This release includes several performance improvements in the matcher, adds a new "valuenumeric" action to scriptindex for indexing for numeric sorting, and works around mod_python bugs which caused lock-ups when using the Python bindings under mod_python.
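To illustrate why numeric sorting needs a dedicated indexing action: values stored as plain decimal strings sort as text, so "10" comes before "9". The sketch below shows one common way around this, an order-preserving byte encoding of a 64-bit IEEE 754 double. The function names `encode_sortable` and `check_ordering` are hypothetical, and this is not claimed to be the encoding that scriptindex or Xapian actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper, NOT Xapian's own encoding: maps a double to an
// 8-byte string whose bytewise (lexicographic) order matches numeric order.
// Assumes 64-bit IEEE 754 doubles; NaN is not handled, and -0.0 sorts
// just before +0.0.
std::string encode_sortable(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes 64-bit doubles");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);

    const std::uint64_t sign = std::uint64_t{1} << 63;
    if (bits & sign) {
        bits = ~bits;   // negative: invert so larger magnitudes sort first
    } else {
        bits |= sign;   // non-negative: set top bit so it sorts after negatives
    }

    // Emit big-endian so the most significant byte is compared first.
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<char>((bits >> (56 - 8 * i)) & 0xFF);
    }
    return out;
}

// Small self-check contrasting text order with the encoded order.
void check_ordering() {
    // Plain decimal strings compare as text, which is wrong for numbers.
    assert(std::string("10") < std::string("9"));
    // The encoded forms compare in numeric order.
    assert(encode_sortable(9.0) < encode_sortable(10.0));
    assert(encode_sortable(-2.5) < encode_sortable(-1.0));
    assert(encode_sortable(-1.0) < encode_sortable(0.5));
}
```

Calling `check_ordering()` from a test or from `main` exercises both cases. Storing such an encoded string in a document value is what would let a sort on that value behave numerically.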


Changes: WritableDatabase::remove_spelling() now works properly. The QueryParser now handles scripts which use NON_SPACING_MARK Unicode characters (such as Arabic) better. The distribution of OP_NEAR and OP_PHRASE over a non-leaf subquery is improved. The database locking code no longer leaves a zombie child process when the database is already locked.


Changes: This release fixes several bugs and adds support to Omega for indexing MS Office 2007 formats and XPS files.


Changes: This release fixes a possible case of database corruption if the disk fills up while writing out changes. The lockfile for a flint database is now created using the umask setting. Previously, it wasn't possible to open a flint database for update if it was owned by another user, even with sufficient permissions via "group" or "other". Composing an OP_NEAR query with two non-term subqueries now throws UnimplementedError instead of AssertionError.


Changes: Spelling correction is now even faster. (A 15% speedup was measured.) Two bugs caused by excess precision on x86 Linux have been fixed. Query::MatchAll now gives equal weights to all documents. A crash while compacting the spelling table has been fixed. The copydatabase example now copies user metadata too. The omega CGI binary now catches and reports std::exception.
Recent comments
Re: 1.0.1 ABI breakage
Argh. Keep forgetting double quotes. The links were:
http://sisyphus.ru/srpm/xapian-core/changelog
http://recoll.org/
Re: 1.0.1 ABI breakage
> If you wish to communicate with the
> Xapian team, please use our mailing
> lists. I've only just noticed your
> comment here!
I thought of this as a fm-related thing, not a bugreport of sorts... Will dup this message by email.
Heh, and I only learned about release announce on the wiki (going from freshmeat notification to the download page or so) in quite some time too :-) And really couldn't find the exact link at that moment. Probably my [short-sighted] fault.
Here's how 1.0.x were noted in ALT Linux package's changelog.
> We don't have "Changes", but the first
> entry in the NEWS file for 1.0.1
> discusses them. It is also the first
> thing mentioned in the
> 1.0.1 release overview.
Er... I meant freshmeat's "Changes" field and would refer to "ChangeLog" for a file proper, sorry for wrong brevity :-)
> While we'd intended to avoid ABI bumps within
> the 1.0.x series (which apart from 1.0.1
> we've achieved I'm happy to say), it's
> not like users had long expectations of
> ABI stability from Xapian just after
> 1.0.0 was released. If we had to make
> an incompatible ABI change in 1.0.x now,
> I'd certainly agree that it would be
> more noteworthy.
Yeah, and some projects would call that 1.0.1 an 1.1.0 just for that reason -- and feel quite okay about the numbering which *might* suggest some more outstanding feature change but *should* at least hint that something important changed.
Anyways, I'm sorry you spent so much time replying in detail, the whole "issue" wasn't worth it.
> If we'd had a double-free bug
> which might affect real code, that would
> certainly justify more publicity.
Ah. And I might have been too jumpy due to the need to change ABI in a stable distribution just frozen (or released?) back then. Fortunately the only client app (recoll) was packaged by me too so there was little sync problem :-)
Olly, thank you for both detailed explanation of the particular cause and *very* decent software I use weekly to daily with much delight, and also for being one of my personally favorite upstreams!
Re: 1.0.1 ABI breakage
If you wish to communicate with the Xapian team, please use our mailing lists (http://xapian.org/lists). I've only just noticed your comment here!
> Probably both double free introduced in
> 1.0.0 and ABI change while fixing it are
> worth being mentioned in Changes
We don't have "Changes", but the first entry in the NEWS file for 1.0.1 discusses them. It is also the first thing mentioned in the 1.0.1 release overview (http://trac.xapian.org/wiki/ReleaseOverview/1.0.1).
> and a bold typeface on xapian.org
That would be rather disproportionate I think. Taking the two issues separately:
Prior to 1.0.0, we used to bump the ABI whenever it was expedient. While we'd intended to avoid ABI bumps within the 1.0.x series (which apart from 1.0.1 we've achieved I'm happy to say), it's not like users had long expectations of ABI stability from Xapian just after 1.0.0 was released. If we had to make an incompatible ABI change in 1.0.x now, I'd certainly agree that it would be more noteworthy.
The double-free bug is unfortunate, but copying Xapian::Error objects isn't a pattern which naturally occurs in user code (or in the library code either) - you just throw and catch them. Copying is mostly allowed just for consistency with other Xapian classes. I don't believe that any users would actually have been affected by this problem, and it was spotted while I was reading the source code rather than manifesting in an application. But it's not the sort of bug that we were comfortable leaving in place. If we'd had a double-free bug which might affect real code, that would certainly justify more publicity.
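As a rough illustration of the throw-and-catch pattern described above, here is a minimal sketch using a hypothetical stand-in class, `DemoError`, rather than the real Xapian::Error. It counts how often user code triggers the copy constructor, so the two catch styles can be compared. The class, its copy counter, and the helper functions are assumptions made for this example and say nothing about how Xapian implements its error classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a library exception class; NOT Xapian::Error.
class DemoError {
public:
    explicit DemoError(const std::string& msg) : msg_(msg) {}
    // Count copies so the two catch styles below can be compared.
    DemoError(const DemoError& other) : msg_(other.msg_) { ++copies; }
    const std::string& get_msg() const { return msg_; }
    static int copies;
private:
    std::string msg_;
};
int DemoError::copies = 0;

// The usual pattern: throw by value, catch by const reference.
// The handler works on the exception object itself, so the catch
// clause makes no copy.
std::string caught_by_reference() {
    try {
        throw DemoError("database locked");
    } catch (const DemoError& e) {
        return e.get_msg();
    }
}

int copies_when_caught_by_reference() {
    DemoError::copies = 0;
    caught_by_reference();
    return DemoError::copies;
}

// The less common pattern: catching by value asks for a copy of the
// exception object (a compiler is permitted, but not required, to elide it).
int copies_when_caught_by_value() {
    DemoError::copies = 0;
    try {
        throw DemoError("database locked");
    } catch (DemoError e) {
        (void)e;  // silence unused-variable warnings
    }
    return DemoError::copies;
}

void check_demo() {
    assert(caught_by_reference() == "database locked");
    assert(copies_when_caught_by_value() >= copies_when_caught_by_reference());
}
```

On typical compilers the by-reference path is expected to report no copies at all, which is why a bug confined to the copy path would rarely be reached by code that only throws and catches.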
> I've rather occasionally edited an URL to
> browse release notes for 1.0.1.
There's no need to edit URLs - they're all linked to from ReleaseNotes (http://trac.xapian.org/wiki/ReleaseNotes) on the wiki (which is linked to from the download page (http://xapian.org/download)).
1.0.1 ABI breakage
Probably both double free introduced in 1.0.0 and ABI change while fixing it are worth being mentioned in Changes and a bold typeface on xapian.org -- I've rather occasionally edited an URL to browse release notes for 1.0.1.
Guess such a high quality indexer and search engine (we mainly use it with recoll.org) does deserve proper handling of even unfortunate news.