An attempt to compress enwik8, the first 100 MB of English Wikipedia, using LZW (Lempel–Ziv–Welch) and BZip2-like algorithms with variable-length encoding.
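As a hedged illustration of the LZW side, the core dictionary-building loop typically looks like the sketch below; the function name, the use of 32-bit codes, and the omission of the bit-packing step are assumptions for illustration, not the repository's encoder.cpp.

```cpp
// A minimal LZW dictionary-building loop with notes on variable-width codes.
// Illustrative sketch only, not the repository's encoder.cpp.
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

std::vector<uint32_t> lzwEncode(const std::string& input) {
    // Seed the dictionary with all 256 single-byte strings.
    std::unordered_map<std::string, uint32_t> dict;
    for (int c = 0; c < 256; ++c)
        dict[std::string(1, static_cast<char>(c))] = static_cast<uint32_t>(c);

    std::vector<uint32_t> codes;
    std::string current;                        // longest phrase matched so far
    for (char ch : input) {
        std::string extended = current + ch;
        if (dict.count(extended)) {
            current = extended;                 // keep extending the match
        } else {
            uint32_t nextCode = static_cast<uint32_t>(dict.size());
            codes.push_back(dict[current]);     // emit code for the matched phrase
            dict[extended] = nextCode;          // register the new phrase
            current = std::string(1, ch);
        }
    }
    if (!current.empty())
        codes.push_back(dict[current]);

    // Variable-length encoding: a code only needs ceil(log2(dict.size())) bits,
    // so the output bit width grows with the dictionary; the bit-packing that
    // writes these codes to disk is omitted here.
    return codes;
}
```

In common variable-width LZW implementations (Unix compress, the GIF format), codes for byte data start at 9 bits and widen by one bit each time the dictionary size crosses a power of two.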
- LZW:
  - Compression ratio: 2.905
  - Compressed file size: 32 MB
- BZip2-Like (see the move-to-front sketch below):
  - Compression ratio: 3.855
  - Compressed file size: 24 MB
- Compression
  - Open a terminal in the directory containing the code.
  - Build the encoder binary:
    g++ -o encoder.exe encoder.cpp
  - Run the encoder:
    ./encoder.exe
- Decompression
  - Open a terminal in the directory containing the code.
  - Build the decoder binary:
    g++ -o decoder.exe decoder.cpp
  - Run the decoder:
    ./decoder.exe
- decoder.cpp is a decoder for the BZip2-like algorithm.
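Decompression reverses each stage of the pipeline. As a hedged counterpart to the forward sketch above, the inverse of the move-to-front stage can look like the code below; again, the names are illustrative and not taken from decoder.cpp.

```cpp
// Minimal inverse move-to-front sketch, mirroring the forward stage above;
// illustrative only, not the project's decoder.cpp.
#include <cstdint>
#include <numeric>
#include <vector>

std::vector<uint8_t> inverseMoveToFront(const std::vector<uint8_t>& indices) {
    std::vector<uint8_t> alphabet(256);
    std::iota(alphabet.begin(), alphabet.end(), 0);   // 0, 1, ..., 255

    std::vector<uint8_t> out;
    out.reserve(indices.size());
    for (uint8_t idx : indices) {
        uint8_t byte = alphabet[idx];                 // map index back to symbol
        out.push_back(byte);
        // Apply the same move-to-front update so encoder and decoder stay in sync.
        alphabet.erase(alphabet.begin() + idx);
        alphabet.insert(alphabet.begin(), byte);
    }
    return out;
}
```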