About:
grzip is a high-performance file compressor based
on the Burrows-Wheeler Transform, the Schindler Transform,
Move-To-Front, and Weighted Frequency Counting. It
uses the Block-Sorting Lossless Data Compression
Algorithm, which has received considerable
attention in recent years for both its simplicity
and effectiveness. This implementation has a
compression rate of 2.234 bps on the Calgary
Corpus (14 files) without preprocessing filters.
This is essentially an adaptation/extension of
GRZipII by Ilya Grebnov.
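Two of the stages named above, the Burrows-Wheeler Transform and Move-To-Front, can be illustrated with a minimal sketch. This is not grzip's code: the function names are made up for illustration, the rotation sort is the naive textbook version (far too slow for real block sizes), and the Schindler Transform and Weighted Frequency Counting stages are not shown.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column."""
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation they denote (illustration only).
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in order)
    primary = order.index(0)  # row of the sorted matrix holding the original text
    return last, primary


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Rebuild the original text from the last column and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value maps first-column rows to last-column rows.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-To-Front: replace each byte by its current rank, then move it to rank 0."""
    alphabet = list(range(256))
    ranks = []
    for b in data:
        idx = alphabet.index(b)
        ranks.append(idx)
        alphabet.insert(0, alphabet.pop(idx))
    return ranks


def mtf_decode(ranks: list[int]) -> bytes:
    """Inverse of mtf_encode."""
    alphabet = list(range(256))
    out = bytearray()
    for idx in ranks:
        b = alphabet.pop(idx)
        out.append(b)
        alphabet.insert(0, b)
    return bytes(out)


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    print(last, primary, mtf_encode(last))
```

The transform groups equal bytes together, so the Move-To-Front output contains many small ranks and zeros, which a later entropy-coding stage can exploit.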
Author:
Jean-Pierre Demailly <demailly (at) fourier (dot) ujf (dash) grenoble (dot) fr>
Homepage:
http://magicssoft.ru/?folder=projects&page=GRZipII
Tar/BZ2:
ftp://ftp.ac-grenoble.fr/ge/compression/grzip-0.3.0.tar.bz2
Dependencies:
No dependencies filed
» Rating:
(not rated)
» Vitality: 0.00% (Rank 16026)
» Popularity: 0.58% (Rank 10305)

Record hits: 6,899
URL hits: 1,576
Subscribers: 14
Comments
Not 64-bit safe, and horrible coding...
by Tobias Klausmann - Jan 2nd 2007 06:23:46
First off: use the FM download links; the package linked from the
homepage doesn't include the same files.
Both packages also have these troubles (which stem from identical or
very similar source code):
- decompression on 64-bit systems doesn't work at all (I tested a variety
of files and none of them could be unpacked: CRC errors).
- the source code is very hairy. Try compiling it with -Wall instead of
the author's -Wno-error (sic!). No wonder it's not 64-bit safe.
- the makefile is broken. Hard-coded CPU-specific optimization
(-march=pentium) and over-the-top optimization (-O7) are both bad ideas. As
is -ffast-math.
In benchmarks (take those with a grain of salt, the binary is broken!),
I found that compression is significantly slower than bzip2 (which isn't
fast, either). And the resulting files weren't noticeably smaller than when
compressed with gzip or bzip2.
In short: if you're on a 64-bit machine, avoid.
Word of caution: once you compress a file, the uncompressed file gets
deleted (bzip2 and gzip do the same). Unfortunately, with that, your file
is gone forever as you can't decompress it!
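The risk described above can be reduced by checking the round trip before the original is removed. The following is a minimal sketch of that idea, using Python's standard `bz2` module as a stand-in compressor (grzip itself is not invoked here, and `compress_verified` is a hypothetical helper name). It reads the whole file into memory, which is fine for a sketch but not for very large files.

```python
import bz2
import os
import tempfile


def compress_verified(path: str) -> str:
    """Compress `path` to `path + '.bz2'`; delete the original only after a verified round trip."""
    with open(path, "rb") as f:
        original = f.read()

    out_path = path + ".bz2"
    with open(out_path, "wb") as f:
        f.write(bz2.compress(original))

    # Read the archive back from disk and decompress it before trusting it.
    with open(out_path, "rb") as f:
        restored = bz2.decompress(f.read())

    if restored != original:
        os.remove(out_path)  # keep the original, discard the bad archive
        raise RuntimeError(f"round-trip check failed for {path}; original kept")

    os.remove(path)
    return out_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "essay.txt")
        with open(sample, "wb") as f:
            f.write(b"some text worth keeping\n" * 100)
        print(compress_verified(sample))
```

The same pattern applies to any compressor: compress to a new file, decompress that file, compare against the source, and only then delete.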
Re: Not 64-bit safe, and horrible coding...
by Jean-Pierre Demailly - Jan 2nd 2007 11:05:12
Thanks to Tobias Klausmann for his detailed comments. First of all, I have
to say that most of the coding is not mine - Ilya Grebnov is the author of
99% of the code. My early tests were made on a 32-bit Intel machine, and I
made no attempt to make the code more portable.
The sole purpose of this FM announcement was to get Ilya Grebnov's efforts
more widely known, as I think grzip is still a very promising program in
spite of its obvious current limitations (that's why it's still only at
version 0.2.5!)
(1) As far as rate of compression is concerned, grzip *always* compresses
better than other similar open source compressors (at least in all the
tests I made...)
(2) As far as execution time is concerned, grzip is sometimes slower than
bzip2 by maybe 50%, sometimes faster by a similar margin, depending on the
nature of the files. An interesting case is tar archives of recent Linux
kernels: while compressing better than bzip2 (the gain is almost 6
MBytes!), grzip also requires less time in that particular case.
- Getting 64-bit safe code is probably only a matter of getting people
working on it - one of the additional reasons I felt it useful to advertise
this program.
All in all, I still believe that grzip could be turned into a very useful
general purpose compressor after a few iterations.
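Claims about size and speed like the ones exchanged above are easy to check on one's own files. Below is a minimal sketch of such a comparison using only Python's standard library. It covers `zlib` (the DEFLATE algorithm used by gzip, without the gzip file header) and `bz2`; grzip is not included because it has no standard-library binding. The `benchmark` helper is a hypothetical name, and the third value it reports is bits per input byte, which assumes that the "bps" figure in the project description is meant in that sense.

```python
import bz2
import time
import zlib


def benchmark(data: bytes) -> dict:
    """Return {codec name: (compressed size, seconds, bits per input byte)}."""
    codecs = {
        "zlib": lambda d: zlib.compress(d, 9),  # DEFLATE at maximum level
        "bz2": lambda d: bz2.compress(d, 9),    # bzip2 with 900k blocks
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        bits_per_byte = 8 * len(packed) / len(data) if data else 0.0
        results[name] = (len(packed), elapsed, bits_per_byte)
    return results


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 20000
    for name, (size, seconds, bpb) in benchmark(sample).items():
        print(f"{name}: {size} bytes, {seconds:.3f} s, {bpb:.3f} bits/byte")
```

Single timings are noisy, so results on real data should be averaged over several runs and several file types before drawing conclusions.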
Re: Not 64-bit safe, and horrible coding...
by Tobias Klausmann - Jan 2nd 2007 11:15:21
Thing is, the program's status is marked as "5 - Production/Stable". And
that it definitely isn't.
I do see room for improvement in the area of file compressors. Both
bzip2 and gzip are nice attempts at the problem -- but they aren't anywhere
near the leading edge of compression research/information theory.
My comment was mainly intended to keep users from compressing all the
essays and texts they've written, only to find out they're gone
because they happen to use a 64-bit machine and OS. Sure, one should
test-drive programs before entrusting them with data (and they should have
backups).
Re: Not 64-bit safe, and horrible coding...
by Jean-Pierre Demailly - Jan 2nd 2007 13:28:19
> Thing is, the program's status is marked as "5 - Production/Stable".
> And that it definitely isn't.
OK, that was a wrong interpretation of mine - my tests didn't show any
problem on x86 (at least after I corrected one rather dull segfault
problem) - and comments from testers on the Internet, mostly on Windows,
didn't mention any issues. In any case, I have recategorized 'grzip' as
alpha since it is still a work in progress. On x86 it nevertheless seems
more reliable than that label suggests.
Re: Not 64-bit safe, and horrible coding...
by Tobias Klausmann - Jan 2nd 2007 13:35:05
Thanks. That (and our discussion here) should provide enough hints to
anyone who's willing to test drive.
Unfortunately my C knowledge is so bad that I wouldn't improve a single
thing if I laid hands on it. Otherwise, I'd be glad to do some polishing.