HTML::TableExtract

HTML::TableExtract is a Perl module that simplifies the extraction of information from tables within HTML documents. Tables, no matter how nested or clustered, can be targeted symbolically with column headers or by more specific depth and count information.
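
As a rough illustration of the two targeting styles described above, the sketch below runs both against a small made-up document. The sample HTML, the header names, and the variable names are assumptions for illustration only. The `headers`, `depth`, and `count` constructor options and the `parse`, `tables`, `rows`, and `coords` methods are taken from the module's documented interface.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample: an outer layout table wrapping one data table.
my $html = <<'HTML';
<html><body>
<table>
  <tr><td>
    <table>
      <tr><th>Date</th><th>Price</th><th>Volume</th></tr>
      <tr><td>2006-02-24</td><td>10.50</td><td>1200</td></tr>
      <tr><td>2006-02-25</td><td>10.75</td><td>900</td></tr>
    </table>
  </td></tr>
</table>
</body></html>
HTML

# Style 1: target symbolically by column headers. Only the named
# columns are kept, and the header row itself is not returned.
my $by_header = HTML::TableExtract->new(headers => ['Date', 'Price']);
$by_header->parse($html);

my @header_rows;
for my $table ($by_header->tables) {
    push @header_rows, $table->rows;
}

# Style 2: target by position. Depth 0 is the outermost level, so the
# nested data table sits at depth 1; count 0 is the first table there.
my $by_position = HTML::TableExtract->new(depth => 1, count => 0);
$by_position->parse($html);

my @nested_rows;
for my $table ($by_position->tables) {
    my ($depth, $count) = $table->coords;
    print "Matched table at depth $depth, count $count\n";
    push @nested_rows, $table->rows;
}

# Print the header-matched rows as comma-separated values.
print join(',', @$_), "\n" for @header_rows;
```

Header matching is usually the more robust choice when a page's layout changes often, since depth and count values shift whenever wrapper tables are added or removed.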

Tags: Internet, Web, Indexing/Search, Software Development, Libraries, Perl Modules, Text Processing, Markup, HTML/XHTML
Licenses: Artistic, GPL
Operating Systems: OS Independent
Implementation: Perl


Recent releases

  •  25 Feb 2006 01:01

Changes: A subtable slicing bug and an hrow() attachment bug were fixed. Tests were added.

Changes: Element interactions in TREE() mode were tightened up when examining rows, columns, cells, etc.; earlier code ran into trouble when dereferencing scalars versus objects. The space() method of HTML::TableExtract::Table (H::TE::T) has been documented, and tests have been added for it. POD tests have been added. There are documentation updates and fixes.

Changes: Tables can now be selected by table tag attributes. The lineage() method now returns row and column information as well as depth and count for each ancestor. This is a potential backwards incompatibility: entries are now four-element arrays rather than two-element arrays. Header matching and column retention enhancements were made. Old-style procedures were deprecated in preparation for their conversion to methods. Various bugs were fixed.
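
The attribute selection and the extended lineage() entries mentioned in this release note could be exercised roughly as follows. This is a minimal sketch, not the author's own example: the sample HTML and the `id` value `prices` are made up, and the `attribs` constructor option is assumed to behave as described in the module's documentation (matching only tables whose tag carries the given attributes).

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample: only the inner table carries id="prices".
my $html = <<'HTML';
<html><body>
<table>
  <tr><td>
    <table id="prices">
      <tr><th>Date</th><th>Price</th></tr>
      <tr><td>2006-02-25</td><td>10.75</td></tr>
    </table>
  </td></tr>
</table>
</body></html>
HTML

# Select tables by attributes on the <table> tag.
my $te = HTML::TableExtract->new(attribs => { id => 'prices' });
$te->parse($html);

my @tables = $te->tables;
for my $table (@tables) {
    my ($depth, $count) = $table->coords;
    print "Matched table at depth $depth, count $count\n";

    # Each lineage entry is an array reference holding
    # depth, count, row, and column, per the release note above.
    for my $ancestor ($table->lineage) {
        my ($a_depth, $a_count, $a_row, $a_col) = @$ancestor;
        print "  lineage entry: depth $a_depth, count $a_count, ",
              "row $a_row, column $a_col\n";
    }

    # No headers were requested, so the header row is returned too.
    print join(',', @$_), "\n" for $table->rows;
}
```

Code written against the older two-element lineage entries would need to be updated to account for the two extra values.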
