urlwatch

urlwatch is a script that watches URLs and notifies you by email when their content changes. Each notification includes the URL that changed and a unified diff of the change. The script runs from a single directory, so nothing needs to be installed; state files are kept in that same directory. A filter hook function lets you strip parts of a page that change on every fetch. It is typically run as a cron job.
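The filter-then-diff idea described above can be sketched roughly as follows. This is an illustration only, not urlwatch's actual code: the function names `strip_volatile` and `describe_change`, and the "Generated at:" line being filtered, are assumptions made up for the example.

```python
import difflib
import re


def strip_volatile(url, text):
    """Hypothetical filter hook: remove content that changes on every fetch.

    Assumption for this sketch: the page contains a line such as
    "Generated at: 10:00" that should not trigger a notification.
    """
    return re.sub(r"(?m)^Generated at:.*\n?", "", text)


def describe_change(url, old_text, new_text):
    """Return a unified diff of the filtered texts, or "" if nothing changed."""
    old = strip_volatile(url, old_text).splitlines(keepends=True)
    new = strip_volatile(url, new_text).splitlines(keepends=True)
    diff = difflib.unified_diff(
        old, new, fromfile=url + " (old)", tofile=url + " (new)"
    )
    return "".join(diff)
```

An empty return value means the page is unchanged after filtering, so no notification would be needed; a non-empty value is the diff text that a notification could carry.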

Tags: Internet, Web, Dynamic Content, Indexing/Search, Site Management, Link Checking, Text Processing, Filters, Markup
Operating Systems: OS Independent
Implementation: Python

Recent releases

  • 03 Jan 2009 18:22

Changes: This version lets you convert the HTML of Web pages to plain text using Lynx (via "-dump"), html2text, or a regular expression that simply strips all HTML tags. This feature has to be enabled on a per-URL basis in the user-defined hooks.
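The simplest of the three options, stripping tags with a regular expression, might look something like the sketch below. The function name `strip_tags` and the extra cleanup steps (dropping script/style blocks, decoding entities, collapsing whitespace) are assumptions for illustration and may differ from what urlwatch itself does.

```python
import html
import re


def strip_tags(markup):
    """Crude HTML-to-text conversion using regular expressions only."""
    # Remove <script> and <style> blocks including their contents.
    text = re.sub(r"(?is)<(script|style)\b.*?</\1\s*>", "", markup)
    # Remove every remaining tag.
    text = re.sub(r"(?s)<[^>]*>", "", text)
    # Decode entities such as &amp; into plain characters.
    text = html.unescape(text)
    # Collapse runs of whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

A regular expression does not understand HTML structure, so this approach is only suitable for diffing purposes; Lynx or html2text will generally give more readable output.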

  • 23 Dec 2008 14:17

Changes: This release adds support for Python 2.6 and above by using the hashlib module instead of the (deprecated) sha module for generating hashes. Python versions before 2.5 are still supported and will use the sha module for generating hashes, just like the previous versions.
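A common way to write this kind of fallback is a guarded import, sketched below. The helper name `content_hash` is hypothetical, and the use of SHA-1 specifically is an assumption based on the old sha module implementing SHA-1; the release note does not say which hashlib function urlwatch calls.

```python
try:
    # hashlib is available on newer interpreters.
    from hashlib import sha1
except ImportError:
    # Older interpreters without hashlib: fall back to the legacy sha module.
    import sha
    sha1 = sha.new


def content_hash(data):
    """Return the hex digest of a byte string, whichever module is in use."""
    return sha1(data).hexdigest()
```

Either branch yields an object with a `hexdigest()` method, so the rest of the script does not need to know which module was imported.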

  • 19 Nov 2008 08:07

Changes: Support for system-wide installation was added. The ~/.urlwatch/ directory is used for user settings. The BSD license is used. A setup.py script was added. Command-line options and verbose logging mode were added. Example files are copied on first start. A Unix manual page was added.

  • 14 Nov 2008 14:39

Changes: This release adds support for cleaning bad HTML (long lines, etc.) with python-utidylib (W3C's HTMLTidy) and adds a module and support for converting iCalendar (*.ics) files to plaintext for easy-to-use iCalendar watching.

  • 16 May 2008 11:21

Changes: This version adds support for sending a correct User-agent header to the remote HTTP server.
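Setting an explicit User-agent header on a request could be done along these lines. This sketch uses Python 3's `urllib.request` rather than the urllib2 module of the Python 2 era, and both the `build_request` helper and the user-agent string are made-up placeholders, not the values urlwatch sends.

```python
import urllib.request

# Hypothetical identifier; the real string used by urlwatch is not given here.
USER_AGENT = "urlwatch-example/0.0"


def build_request(url):
    """Create a request object that carries an explicit User-agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

Passing the returned object to `urllib.request.urlopen` would then send the header along with the request.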
