apt-proxy is a simple script to build up a Debian HTTP mirror based on requests which pass through the proxy. It's great for multiple Debian machines behind a slow link.
| Tags | Internet Proxy Servers Software Distribution Tools Installation/Setup |
|---|---|
| Licenses | GPL |
| Operating Systems | POSIX |
| Implementation | Unix Shell |
Recent releases


Changes: Support for HTTP/FTP backends was added, and there is a new script, apt-proxy-import, for importing existing .debs into apt-proxy's cache.


Changes: More efficient updating of package files for HTTP/FTP, and fixes for two lockfile problems.


Changes: Support for HTTP/FTP backends, and a new script (apt-proxy-import) to import existing .debs into the apt-proxy archive.


Changes: This release adds support for apt-file. It is installable on Debian Potato again. Older cache file versions could be purged later than configured. There are minor packaging fixes.


Changes: More advanced cache management with MAX_VERSIONS and file corruption detection, new log and config file command line options, bugfixes to streaming code and cache management, out-of-the-box readiness (user, cache directory, and logfile are created during first installation), and documentation updates.
Recent comments
apt-proxy is alive!
After a long quiet period, apt-proxy has woken up again. There have been many fixes and
improvements, so if you haven't tried it for a while maybe you should have another look :)
HTTP/FTP backend support is scheduled for 1.3.0, and is already in testing.
Re: Why use this ?
I could see this being useful for large M$ Windows client network environments. Since Windows typically upgrades itself using Windows Update (http://windowsupdate.microsoft.com), this same system could be used to cache the packages from the M$ site.
squid for caching debs...
I've found that squid, at least version 2.2.STABLE5, is not that good for caching large files like debs. The problem is that large downloads often fail before completion, and squid doesn't seem to use "resume" style requests on its retries; it just starts all over again.
The apt client is smart enough to do a "resume" request when squid finally gives up, but depending on how you configure squid, it either starts all over again from the beginning, or it only fetches part of the file that it then doesn't cache. For a 16M deb that can equate to multiple almost complete download attempts that are not even cached when it finally completes, if ever.
Also, squid's expiry model seems to be tuned for multiple small, frequently accessed, and frequently changing objects, not large, rarely accessed and never changing objects. Its pseudo-LRU expiry seems to be expiring the debs in my cache in favor of smaller objects with disturbing regularity.
apt-proxy is cool because it uses rsync to do the fetches, which in my experience is faster and more reliable than http for large downloads, and can do resumed fetches and delta-updates for objects that only change a little bit (i.e. Packages files). It also builds a mirror directory structure on demand that can be browsed/exported using other tools (giving ftp/http/whatever access to the same file repository). This makes it perfect for on-demand building and/or maintaining a Debian mirror site.
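To make the rsync point concrete, here is a minimal sketch of fetching one file into a cache that mirrors the backend's directory layout. It is not apt-proxy's actual code: `CACHE_DIR`, `BACKEND`, the mirror host name, and the `fetch_to_mirror` helper are all made up for illustration. Only the rsync options are real.

```shell
#!/bin/sh
# Sketch only: CACHE_DIR, BACKEND and fetch_to_mirror are hypothetical names,
# and mirror.example.org is a placeholder, not a real backend.
CACHE_DIR="${CACHE_DIR:-/var/cache/apt-proxy-sketch}"
BACKEND="${BACKEND:-rsync://mirror.example.org/debian}"

# Fetch one file, keeping the backend's path layout under CACHE_DIR.
fetch_to_mirror() {
    rel_path="$1"                      # e.g. dists/stable/main/binary-i386/Packages
    dest="$CACHE_DIR/$rel_path"
    mkdir -p "$(dirname "$dest")" || return 1
    # --partial keeps an interrupted download so a retry can build on it.
    # When $dest already exists, rsync's delta algorithm sends only the changed blocks.
    # --times preserves modification times so unchanged files can be skipped.
    rsync --partial --times --timeout=60 "$BACKEND/$rel_path" "$dest"
}

# Example: fetch_to_mirror dists/stable/main/binary-i386/Packages
```

Because the files land under the same relative paths as on the backend, the cache directory can be re-exported as is by an ftp, http, or rsync server.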
Re: Why use this ?
Mainly because squid will download the whole Packages.gz file every time it changes, whereas we only transfer the diffs (rsync). The auto-clean (only if a newer package) feature, fallback backends, and the fact that the cache layout maps 1:1 with the backend(s) (re-export cache via NFS/rsync/ftp) also help.
That said, if you've already got squid up and running, it might be easier.
Rusty.
Why use this ?
I configured all the machines on my net to use the SQUID proxy on my DMZ for both FTP and HTTP. Since they all apt-get from the same Debian mirrors (give or take a handful of unofficial archives), SQUID handles all the caching with very little tweaking (maximum file size, more time until cached entries become stale, etc.). I found that method a more intuitive way to solve the resource-sharing problem.
Several reasons make it especially efficient:
- packages in the cache have a limited lifespan. Therefore, building a mirror out of requests is only valid until the next package upgrade.
- a typical set-up only selects a fraction of the available packages, even less for a small number of supported hosts.
- using a general purpose caching program such as SQUID limits the additional complexity, and users do not have to change a line in their setup, provided they configured the proxy environment variable right.
As usual, there's more than one way to do it!
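For readers wondering what "configured the proxy environment variable" might look like, here is a rough sketch. The host name is a placeholder, the `use_proxy` helper is made up, and port 3128 is only assumed because it is squid's usual default. apt's HTTP fetcher reads the standard `http_proxy` variable.

```shell
#!/bin/sh
# Sketch only: proxy.example.lan is a placeholder and use_proxy is a hypothetical helper.
use_proxy() {
    proxy_host="$1"
    proxy_port="${2:-3128}"   # assumed: squid's usual default port
    http_proxy="http://$proxy_host:$proxy_port/"
    ftp_proxy="$http_proxy"
    export http_proxy ftp_proxy
}

# Example: use_proxy proxy.example.lan && apt-get update
```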