GNU libextractor

libextractor is a library for extracting meta-data from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain meta-data about files. It includes a command-line tool and bindings for Java (JNI) and Python.
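
The extensible design described above can be illustrated with a small sketch. This is not libextractor's API: the names `PLUGINS`, `register`, `extract_mime`, `extract_png`, and `extract` are hypothetical, and the example only shows the general idea of independent extractors that each contribute keyword/value pairs for the file types they recognize.

```python
from typing import Callable, Dict, List, Tuple

Metadata = List[Tuple[str, str]]
Plugin = Callable[[bytes], Metadata]

# Hypothetical registry (not part of libextractor): plugin name -> extractor.
PLUGINS: Dict[str, Plugin] = {}


def register(name: str) -> Callable[[Plugin], Plugin]:
    """Decorator that adds an extractor function to the registry."""
    def decorator(func: Plugin) -> Plugin:
        PLUGINS[name] = func
        return func
    return decorator


@register("mime")
def extract_mime(data: bytes) -> Metadata:
    """Guess a MIME type from well-known magic bytes."""
    if data.startswith(b"%PDF-"):
        return [("mimetype", "application/pdf")]
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return [("mimetype", "image/png")]
    return []


@register("png")
def extract_png(data: bytes) -> Metadata:
    """Read width and height from the IHDR chunk of a PNG file."""
    if not data.startswith(b"\x89PNG\r\n\x1a\n") or len(data) < 24:
        return []
    if data[12:16] != b"IHDR":
        return []
    width = int.from_bytes(data[16:20], "big")
    height = int.from_bytes(data[20:24], "big")
    return [("size", f"{width}x{height}")]


def extract(data: bytes) -> Metadata:
    """Run every registered plugin and collect whatever each one finds."""
    results: Metadata = []
    for plugin in PLUGINS.values():
        results.extend(plugin(data))
    return results
```

In a design like this, supporting a new file type means registering one more function; a file that only the generic plugin recognizes yields nothing beyond its MIME type.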

Tags: Software Development, Libraries, Internet, Web, Indexing/Search, Communications, File Sharing, Text Processing, Indexing
Licenses: GPL
Operating Systems: POSIX, BSD, Unix, Linux, Windows, Mac OS X
Implementation: C

Recent releases

  • 05 Jul 2009 08:35

Changes: This release adds support for librpm 4.7 and uses an external version of libexiv2 for improved and more up-to-date EXIV2 support.

  • 21 Feb 2009 00:39

Changes: This release fixes minor bugs in several plugins and in the build system. It uses libtool 2.x, which helps fix some issues with multiple threads loading and unloading certain plugins concurrently.

  • 03 Nov 2008 05:38

Changes: This release adds support for the S3M, XM, and IT file formats. RPM support now requires librpm. Crashes in the OpenOffice and TIFF plugins were fixed.

  • 14 Jul 2008 03:56

Changes: This release fixes locale paths (for translations). It also ensures that plugin loading and unloading are thread-safe. Some linkage errors on OpenBSD were resolved. An experimental thumbnail extractor based on ffmpeg was added (but is not enabled by default due to security concerns).

  • 27 Apr 2008 08:16

Changes: This release fixes a security issue recently found in the XPDF code (which is not enabled by default).

Recent comments

02 Feb 2008 05:00 grothoff

Re: online demo not working
There are two PDF plugins, one that is quite simplistic and another one based on code from xpdf (which has a bad security track record). Depending on which one I happen to enable on the website (options to configure), you get more or less information for PDF files.

> When I upload dmca.pdf all it gives me is mimetype. Am I missing something?

24 Jan 2008 15:40 baloney

online demo not working
When I upload dmca.pdf all it gives me is mimetype. Am I missing something?

14 Aug 2005 21:25 grothoff

Re: Also Requires gobject-2.0
Note that as of 0.5.3 LE still needs gobject-2.0 but the
ordinary shared version will do fine now.

27 Jan 2005 10:15 grothoff

Re: Also Requires gobject-2.0
Well, gobject-2.0 is part of glib, so it is listed as a
dependency. What is more tricky is that we need the
static, relocatable version of the library -- but try to specify
that on freshmeat :-).

27 Jan 2005 10:07 dforce

Also Requires gobject-2.0
Can't seem to get the OLE2 libraries to compile, make complains:

/usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.3/../../../../i686-pc-linux-gnu/bin/ld: cannot find -lgobject-2.0

Oh, and you may want to include these dependencies within either the README or INSTALL files.
