A program that compresses and decompresses ASCII files using canonical Huffman coding.
- In general, use the following to run the archiver program:

      $ make build \
          && make test \
          && make run
      $ cd build/src \
          && ./archiver --compress test.huff ../../tests/test_1.txt ../../tests/test_2.txt \
          && ./archiver --decompress test.huff

- For local development, you can try:

      $ make local-init && make conan-build

Usage:

- `./archiver -h` displays help for using the program.
- `./archiver -c archive_name file1 [file2 ...]` encodes the files `file1, file2, ...` and saves the result to the file `archive_name`.
- `./archiver -d archive_name` decodes the files from the archive `archive_name` and puts them in the current directory.

Nine-bit values are written in low-to-high bit order (analogous to little-endian, but for bits). That is, the bit corresponding to 2^0 comes first, followed by 2^1, and so on, up to the bit corresponding to 2^8.
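
The bit order described above can be illustrated with a minimal Python sketch. The function names `to_bits_lsb_first` and `from_bits_lsb_first` are hypothetical and not taken from the archiver's source; the sketch only produces a list of bits and makes no assumption about how the archiver packs those bits into bytes.

```python
from typing import List


def to_bits_lsb_first(value: int, width: int = 9) -> List[int]:
    """Return the bits of `value`, starting with the bit for 2^0."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"value does not fit in {width} bits")
    return [(value >> i) & 1 for i in range(width)]


def from_bits_lsb_first(bits: List[int]) -> int:
    """Inverse of to_bits_lsb_first: the first bit has weight 2^0."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, `to_bits_lsb_first(5)` yields `[1, 0, 1, 0, 0, 0, 0, 0, 0]`, because 5 = 2^0 + 2^2.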
The archive file has the following format:

- A 9-bit number indicating the number of characters in the alphabet, `SYMBOLS_COUNT`.
- Data block for recovering the canonical code:
  - `SYMBOLS_COUNT` values of 9 bits (alphabet characters in the order of canonical codes).
  - A list of `MAX_SYMBOL_CODE_SIZE` values of 9 bits, the `i`-th (when numbered from 0) element of which is the number of characters with the code length `i + 1`. `MAX_SYMBOL_CODE_SIZE`, the maximum code length in the current encoding, is not explicitly written to the file because it can be deduced from the available data.
- The encoded file name.
- The encoded service symbol `FILENAME_END`.
- The encoded content of the file.
- If there are additional files in the archive, the encoded service symbol `ONE_MORE_FILE` is used, and the encoding continues.
- The encoded service symbol `ARCHIVE_END`.
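
As an illustration of how `MAX_SYMBOL_CODE_SIZE` can be deduced and the canonical code recovered from the data block, here is a minimal Python sketch. It is not the archiver's actual implementation: the names `read_value` and `read_code_table` are hypothetical, the input is modelled as an iterator of individual bits, and the code assignment follows the usual canonical Huffman convention (the first code is all zeros, each following code is the previous one plus one, shifted left whenever the length grows), which is an assumption about this project.

```python
from typing import Dict, Iterator, List, Tuple


def read_value(bits: Iterator[int], width: int = 9) -> int:
    """Read one value whose least significant bit comes first."""
    return sum(next(bits) << i for i in range(width))


def read_code_table(bits: Iterator[int]) -> Dict[int, Tuple[int, int]]:
    """Parse the header and return {symbol: (code, code_length)}."""
    symbols_count = read_value(bits)
    symbols = [read_value(bits) for _ in range(symbols_count)]

    # MAX_SYMBOL_CODE_SIZE is not stored: keep reading per-length counts
    # until they account for every symbol in the alphabet.
    counts: List[int] = []
    while sum(counts) < symbols_count:
        counts.append(read_value(bits))

    # Assumed canonical convention: codes of equal length are consecutive
    # integers, and the running code is shifted left for each new length.
    table: Dict[int, Tuple[int, int]] = {}
    code = 0
    index = 0
    for length, count in enumerate(counts, start=1):
        for _ in range(count):
            table[symbols[index]] = (code, length)
            code += 1
            index += 1
        code <<= 1
    return table
```

With four symbols and per-length counts `1, 1, 2`, the loop stops after reading three counts, so the maximum code length is 3, and the symbols receive the codes `0`, `10`, `110`, and `111` in the order in which they appear in the data block.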