Download the .exe file, or the source at the bottom of the page.
Updates
Version .85 ( Aug. 19, 2006 ) I finally did some more work on Jeffrey. I was a bit embarassed to have the old code on my website, so I spent a day deciphering it and rewriting it. Then the next day I tried a few optimizations, and on my computer it can be up to 50% faster compressing for some files.
Version .81 ( Sep. 10, 2003 ) Source and .exe are updated to fix a bug that could cause data mangling.
Version .80 ( Early 2002 ) Jeffrey P. first put online
Algorithms
When I started on Jeffrey P., I used Limpel-Ziv-Welch compression. The program ran about twice as fast
as it does now, but the compression sizes were disappointing (always noticeably larger than
Winzip sizes, which I believe uses a Limpel-Ziv algorithm and Huffman encoding.)
Plus, I had read about the LZW algorithm being copyrighted, so I a decided to avoid it, even though no one would
care if a little freeware program used it.
The compression algorithm currently used is the Burrow-Wheeler Transformation (also called Block-Sorting) with Arithmetic Compression (and Move to Front and Run-Length Encoding between the two.) The BWT doesn't actually compress the data, but it sorts it in a way where it can be easily compressed by other algorithms. On certain data sets, especially text files, this can provide much better compression than Winzip.
Compression Size Comparisons:
Jeffrey compresses larger files the best usually, especially larger text files. On some other file types or small files, the compression ratios are similar or even very slightly worse than Winzip. "bible.txt" Original : 4,047,392 bytes Jeffrey P. Chillingsworth : 910,099 Winzip : 1,176,617 "koran.txt" Original : 1,133,653 bytes Jeffrey P. Chillingsworth : 319,923 Winzip : 423,303 "civ2\*.wav" Original : 8,887,222 bytes Jeffrey P. Chillingsworth : 4,048,995 Winzip : 5,085,351
There is a downside to using these techniques to achieve better compression though. It takes much longer to compress a file than Winzip. I'd guess maybe 4 times longer. Compression is faster for small files that require less sorting, but certain types of data may result in extremely slow compression. Decompression is faster than compression, because there is no sorting required for undoing the BWT, but the arithmetic compression is a bit slow to decompress.
Jeffrey uses block sizes of up to 900K (files larger than this are split into multiple blocks, and compressed a block at a time.) Since sorting takes the longest time, a Bucket Sort is performed first to divide the data into smaller sets in linear time, then a quicksort is performed on each of the smaller sets.
Using Jeffrey P. Chillingsworth:
Sample Command Lines: Let's say the file book1.txt is in the current directory. (And so is jeff.exe, or jeff.exe is in a path such as c:\windows, that is always checked.)
jeff book1.txt book1
will compress the file book.txt to the archive file book1.jef. If the archive name already exists, you
will be asked whether you want to add to it or overwrite it.
The percentage displayed is the percentage of the size of the compressed file
compared to the original file. So the lower
the percentage, the greater the compression.
jeff l book1
Will list the files in book1.jef
jeff u book1
will uncompress the archive book1.jef. Since it stores filenames, book1.txt will uncompress
You can use wildcards for the name of the compressed file. To compress all the txt files in the current directory to the archive text, type jeff *.txt text
If you just run jeff.exe without any command line options, Jeffrey will ask you what you want him to do.
Download:
Windows console EXE: JeffExe85.zip (46 K)
Source Code:
JeffSrc85.zip (11 K)
jeff.cpp - The code for the command line and text interfaces, and also writing to files.
bwt.cpp - The Burrow Wheelers Transformation. (Encoding and Decoding)
rle.cpp - Run Length Encoding and Decoding
mtf.cpp - Move to Front Encoding and Decoding
ari.cpp - Arithmetic Encoding and Decoding
Be warned though, that the source code isn't especially clear, and there are few explanatory comments.
Update: v.85 code is now partially clear =)
NOTE: I make no guarantees about this program. I'm not responsible for any data that won't uncompress properly, or misuse of the program. (There were a couple data mangling bugs when I was making the program, but they're fixed and it's worked fine for me. I've even used the code in other programming projects.)
This program was based on code from Mark Nelson's Data Compression page
Everything was written or completely rewritten by me, except for the arithmetic compression code, for
which I give credit completely to the original author(s).
Page by: jkreuzer