Smart Cache Loader

Smart Cache Loader is a highly configurable Web grabber with special support for Smart Cache.

Tags: Internet, Web, Site Management, Link Checking, Indexing/Search, web crawler
Licenses: GPL
Operating Systems: OS Independent
Implementation: Java
Translations: English


Recent releases

  • 10 Aug 2007 11:51

Changes: Saving pages to local disk now works, and the program now uses the Host: header in outgoing requests for better virtual server support. HTML entities are decoded before extracting links, and gzip-encoded pages are requested from the server.
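As a rough illustration of the request changes described above, the following sketch builds a request that carries a Host: header and asks for a gzip-encoded response. It is not taken from the program's source; the class name `RequestSketch`, the method `buildGet`, the HTTP/1.1 request line, and the `Connection: close` header are all assumptions made for the example.

```java
// Hypothetical sketch, not the project's actual code.
final class RequestSketch {

    /**
     * Builds the text of a GET request for the given host and path.
     * The Host header lets a server hosting several virtual sites pick
     * the right one; Accept-Encoding asks for a gzip-compressed body.
     */
    static String buildGet(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"   // protocol version is an assumption
             + "Host: " + host + "\r\n"
             + "Accept-Encoding: gzip\r\n"
             + "Connection: close\r\n"           // assumption: one request per connection
             + "\r\n";                           // blank line ends the header block
    }
}
```

A client that sends `Accept-Encoding: gzip` also has to be ready to decompress the response body, for example with `java.util.zip.GZIPInputStream`, when the server answers with `Content-Encoding: gzip`.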

  • 08 Aug 2007 13:37

Changes: The crawler can now extract links from page content using regular expressions (with an optional replacement for URL rewriting). The crawler can now log depth for easier debugging, and tracking of known URLs can be set to one of two modes (the first saves memory, the second saves CPU time).
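The idea of regular-expression link extraction with a replacement can be sketched as follows. This is only an illustration under assumptions: the class `RegexLinkSketch`, the method `extract`, and the sample pattern are hypothetical and do not reflect the program's real configuration syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the project's actual code.
final class RegexLinkSketch {

    /**
     * Finds every match of regex in content and rewrites each match
     * with the replacement string ($1, $2, ... refer to capture groups).
     */
    static List<String> extract(String content, String regex, String replacement) {
        Pattern pattern = Pattern.compile(regex);
        List<String> urls = new ArrayList<>();
        Matcher found = pattern.matcher(content);
        while (found.find()) {
            // Re-apply the pattern to the matched text alone so that only
            // the rewritten URL is kept, not the surrounding page content.
            urls.add(pattern.matcher(found.group()).replaceFirst(replacement));
        }
        return urls;
    }
}
```

For instance, with the pattern `page\.php\?id=(\d+)` and the replacement `/page/$1.html`, a page containing `page.php?id=7` would yield the URL `/page/7.html`.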

  • 25 Jul 2007 06:53

Changes: Support was added for escaping "&" and "," in URLs. The delay parameter can now take time units, as in 1.3s and 2h. A new per-site parameter, "crawltime", which also works on the command line, was added to limit the time spent crawling a site.
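A delay value with a time unit, such as 1.3s or 2h, could be converted to milliseconds along these lines. The sketch is an assumption-laden illustration: only the "s" and "h" suffixes appear in the release notes, so the "ms" and "m" suffixes, the treatment of a bare number as seconds, and the names `DelaySketch` and `parseMillis` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the project's actual code.
final class DelaySketch {

    // A number with an optional fraction, followed by an optional unit suffix.
    private static final Pattern DELAY =
            Pattern.compile("(\\d+(?:\\.\\d+)?)(ms|s|m|h)?");

    /** Converts a value such as "1.3s" or "2h" to milliseconds. */
    static long parseMillis(String text) {
        Matcher m = DELAY.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad delay: " + text);
        }
        double value = Double.parseDouble(m.group(1));
        // Assumption: a number without a suffix is read as seconds.
        String unit = m.group(2) == null ? "s" : m.group(2);
        long factor;
        switch (unit) {
            case "ms": factor = 1L; break;
            case "s":  factor = 1_000L; break;
            case "m":  factor = 60_000L; break;
            default:   factor = 3_600_000L; break; // "h"
        }
        return Math.round(value * factor);
    }
}
```

Under these assumptions, `DelaySketch.parseMillis("1.3s")` returns 1300 and `DelaySketch.parseMillis("2h")` returns 7200000.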

  • 15 Apr 2007 06:35

Changes: Support was added for crawling delays. Links of the reject type are now logged, which is useful for extracting URLs from a site. A crash that occurred when no default masks were used was fixed.

  • 13 Apr 2007 10:49

Changes: This release fixes various crashes. The documentation has been converted to Docbook and updated. A .jar file is distributed instead of .class files.
