LZMA SDK 9.38
-------------

LZMA SDK provides the documentation, samples, header files,
libraries, and tools you need to develop applications that
use 7z / LZMA / LZMA2 / XZ compression.

LZMA is an improved version of the famous LZ77 compression algorithm.
It was improved to maximize the compression ratio while keeping
high decompression speed and low memory requirements for
decompression.

LZMA2 is an LZMA-based compression method. LZMA2 provides better
multithreading support for compression than LZMA and some other improvements.

7z is a file format for data compression and file archiving.
7z is the main file format for the 7-Zip compression program (www.7-zip.org).
The 7z format supports different compression methods: LZMA, LZMA2 and others.
7z also supports AES-256 based encryption.

XZ is a file format for data compression that uses LZMA2 compression.
The XZ format provides additional features: SHA/CRC check, filters for
improved compression ratio, splitting into blocks and streams.


LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)

You can copy, modify, distribute and perform LZMA SDK code, even for commercial purposes,
all without asking permission.

LZMA SDK code is compatible with open source licenses; for example, you can
include it in GNU GPL or GNU LGPL code.


LZMA SDK Contents
-----------------

Source code:

  - C / C++ / C# / Java - LZMA compression and decompression
  - C / C++             - LZMA2 compression and decompression
  - C / C++             - XZ compression and decompression
  - C                   - 7z decompression
  - C++                 - 7z compression and decompression
  - C                   - small SFXs for installers (7z decompression)
  - C++                 - SFXs and SFXs for installers (7z decompression)

Precompiled binaries:

  - console programs for lzma / 7z / xz compression and decompression
  - SFX modules for installers.


UNIX/Linux version
------------------
To compile the C++ version of file->file LZMA encoding, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

On some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static

You can also use p7zip (a port of 7-Zip for POSIX systems like Unix or Linux):

  http://p7zip.sourceforge.net/


Files
-----

DOC/7zC.txt                - 7z ANSI-C Decoder description
DOC/7zFormat.txt           - 7z Format description
DOC/installer.txt          - information about 7-Zip for installers
DOC/lzma.txt               - LZMA compression description
DOC/lzma-sdk.txt           - LZMA SDK description (this file)
DOC/lzma-history.txt       - history of LZMA SDK
DOC/lzma-specification.txt - Specification of LZMA
DOC/Methods.txt            - Compression method IDs for .7z

bin/installer/   - example script to create installer that uses SFX module,

bin/7zdec.exe    - simplified 7z archive decoder
bin/7zr.exe      - 7-Zip console program (reduced version)
bin/x64/7zr.exe  - 7-Zip console program (reduced version) (x64 version)
bin/lzma.exe     - file->file LZMA encoder/decoder for Windows
bin/7zS2.sfx     - small SFX module for installers (GUI version)
bin/7zS2con.sfx  - small SFX module for installers (Console version)
bin/7zSD.sfx     - SFX module for installers.


7zDec.exe
---------
7zDec.exe is a simplified 7z archive decoder.
It supports only the LZMA, LZMA2, and PPMd methods.
7zDec decodes a whole solid block from a 7z archive to RAM.
The RAM consumption can be high.




Source code structure
---------------------


Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)

C/   - C files (compression / decompression and other)
  Util/
    7z       - 7z decoder program (decoding 7z files)
    Lzma     - LZMA program (file->file LZMA encoder/decoder).
    LzmaLib  - LZMA library (.DLL for Windows)
    SfxSetup - small SFX module for installers

CPP/ - C++ files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles  - Modules that are bundles of other modules (files)

      Alone7z          - 7zr.exe: Standalone 7-Zip console program (reduced version)
      Format7zExtractR - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2.
      Format7zR        - 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      LzmaCon          - lzma.exe: LZMA compression/decompression
      LzmaSpec         - example code for LZMA Specification
      SFXCon           - 7zCon.sfx: Console 7z SFX module
      SFXSetup         - 7zS.sfx: 7z SFX module for installers
      SFXWin           - 7z.sfx: GUI 7z SFX module

    Common   - common files for 7-Zip

    Compress - files for compression/decompression

    Crypto   - files for encryption / decryption

    UI       - User Interface files

      Client7z    - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common      - Common UI files
      Console     - Code for console program (7z.exe)
      Explorer    - Some code from 7-Zip Shell extension
      FileManager - Some GUI code from 7-Zip File Manager
      GUI         - Some GUI code from 7-Zip


CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      LzmaAlone  - file->file LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)

Java/ - Java files
  SevenZip
    Compression - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)


Note:
  Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.
  7-Zip's source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on modern 2 GHz CPU
      - 1-2 MB/s on 200 MHz simple RISC CPU (ARM, MIPS, PowerPC)
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB
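
The decoder memory figure above (16 KB + DictionarySize) is simple enough to
check by hand, but a small helper makes it easy to budget RAM for a target
device. The sketch below is only an illustration of that rule of thumb; the
function name is hypothetical and it assumes that "KB" means 1024 bytes.

```python
KIB = 1024
MIB = 1024 * KIB


def decoder_memory_bytes(dictionary_size: int) -> int:
    """Estimate decoder RAM as 16 KB + DictionarySize (rule of thumb above).

    Assumption: 1 KB = 1024 bytes. The real figure depends on the build.
    """
    if dictionary_size < 0:
        raise ValueError("dictionary size must not be negative")
    return 16 * KIB + dictionary_size


# Print the estimate for a few dictionary sizes.
for size in (64 * KIB, 8 * MIB, 1024 * MIB):
    print(f"dictionary {size:>10} bytes -> decoder needs about "
          f"{decoder_memory_bytes(size)} bytes")
```

For a small embedded target the dictionary size dominates the estimate, so
choosing a smaller dictionary at compression time is the main way to reduce
decoder memory.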

The LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompression depends mostly on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
--------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with the LZMA method. The benchmark shows a rating in MIPS (million
     instructions per second). The rating value is calculated from the
     measured speed and is normalized with Intel's Core 2 results.
     The benchmark also checks for possible hardware errors (RAM
     errors in most cases). The benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     The default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by the LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives a gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          The lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0
          if you change the lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          The pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in table below).

               MF_ID     Memory          Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.
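
The dictionary size formula (DictionarySize = 2^N) and the match finder memory
table above can be combined into a quick estimate of encoder memory. The
sketch below simply copies the multipliers from the table; the function names
are hypothetical, and it assumes that "4MB" means 4 * 1024 * 1024 bytes.

```python
MIB = 1024 * 1024

# Multipliers copied from the match finder table above.
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}


def dictionary_size(n: int) -> int:
    """Return the dictionary size in bytes for switch -d{N}, i.e. 2^N."""
    if not 0 <= n <= 30:
        raise ValueError("N must be in the range [0, 30]")
    return 1 << n


def encoder_memory_bytes(n: int, match_finder: str = "bt4") -> int:
    """Estimate encoder memory as d * factor + 4 MB for the given match finder."""
    d = dictionary_size(n)
    return int(d * MATCH_FINDER_FACTOR[match_finder]) + 4 * MIB


# Compare the default dictionary (-d23) across all match finders.
for mf in MATCH_FINDER_FACTOR:
    print(f"-d23 -mf{mf}: about {encoder_memory_bytes(23, mf) / MIB:.1f} MB")
```

With the default settings (-d23, bt4) the estimate is 8 MB * 11.5 + 4 MB,
which is much more than the roughly 8 MB needed to decompress the same file.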

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          the eos marker, since the LZMA decoder knows the uncompressed size
          stored in the .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout
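
The -eos and -si descriptions above rely on the uncompressed size stored in
the .lzma file header. As an illustration, the sketch below reads that header.
It assumes the commonly documented 13-byte layout: one properties byte equal
to (pb * 5 + lp) * 9 + lc, a 32-bit little-endian dictionary size, and a
64-bit little-endian uncompressed size in which all bits set means "unknown,
an End Of Stream marker is expected". The names used here are hypothetical;
check DOC/lzma-specification.txt before relying on the details.

```python
import struct
from typing import NamedTuple, Optional

UNKNOWN_SIZE = 0xFFFFFFFFFFFFFFFF  # assumed marker for "size not stored"


class LzmaHeader(NamedTuple):
    lc: int
    lp: int
    pb: int
    dictionary_size: int
    uncompressed_size: Optional[int]  # None: unknown, expect an EOS marker


def parse_lzma_header(header: bytes) -> LzmaHeader:
    """Decode the first 13 bytes of a .lzma file (layout assumed as above)."""
    if len(header) < 13:
        raise ValueError("a .lzma header is 13 bytes long")
    props, dict_size, size = struct.unpack("<BIQ", header[:13])
    if props >= 9 * 5 * 5:
        raise ValueError("invalid properties byte")
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    return LzmaHeader(lc, lp, pb, dict_size,
                      None if size == UNKNOWN_SIZE else size)


# Hand-built header: lc=3, lp=0, pb=2, 8 MB dictionary, 1000 bytes of data.
example = bytes([(2 * 5 + 0) * 9 + 3]) + struct.pack("<IQ", 1 << 23, 1000)
print(parse_lzma_header(example))
```

A file written from stdin cannot store its size in advance, which is why the
size field is left unknown in that case and the End Of Stream marker is used
instead.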


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.
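
If the lzma executable is not at hand, the same kind of settings can be tried
from Python. The sketch below uses Python's standard lzma module, which wraps
liblzma from XZ Utils rather than the LZMA SDK, so it is only an approximate
counterpart of the three examples above. The helper names and the sample data
are hypothetical.

```python
import lzma


def compress_alone(data: bytes, **lzma1_options) -> bytes:
    """Compress to the legacy .lzma container with the given LZMA1 options."""
    filters = [{"id": lzma.FILTER_LZMA1, **lzma1_options}]
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)


def decompress_alone(blob: bytes) -> bytes:
    """Decompress data stored in the legacy .lzma container."""
    return lzma.decompress(blob, format=lzma.FORMAT_ALONE)


sample = bytes(range(256)) * 64  # 16 KB of made-up input

# Roughly example 1: 64 KB dictionary and 0 literal context bits.
small_dict = compress_alone(sample, dict_size=1 << 16, lc=0)

# Roughly example 2: settings aimed at 32-bit periodical data.
periodic = compress_alone(sample, lc=0, lp=2)

# Roughly example 3: decompression needs no switches.
restored = decompress_alone(small_dict)
print(len(sample), len(small_dict), len(periodic), restored == sample)
```

Options that are not given (for example the number of fast bytes) are taken
from liblzma's default preset, which may differ from the defaults of the lzma
executable described above.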


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compression, it's desirable
to have aligned data (if possible), and also to arrange the data
so that code is grouped in one place and data is grouped in another
(this is better than interleaving: code, data, code, data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find the C source code of such filters in the C/Bra*.* files.

You can check the compression ratio gain of these filters with the following
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing   = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

The compressing and decompressing speed of such filters is very high,
so they will not increase decompression time much.
Moreover, filtering reduces decompression time for LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.
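
The effect of such a filter can be seen on artificial input. The sketch below
is an illustration under several assumptions: it uses Python's standard lzma
module (liblzma from XZ Utils, not the LZMA SDK), it chains the ARM filter
with LZMA2 in the XZ format instead of LZMA in a 7z archive, and the input is
a made-up block of ARM BL instructions that all call the same address. The
helper names are hypothetical.

```python
import lzma
import struct


def synthetic_arm_calls(count: int, target: int = 0x1000) -> bytes:
    """Build `count` little-endian ARM BL instructions that all call `target`."""
    words = []
    for i in range(count):
        address = 4 * i
        # BL stores (target - (address + 8)) / 4 in a signed 24-bit field,
        # so every call site gets a different relative offset.
        offset = ((target - (address + 8)) >> 2) & 0xFFFFFF
        words.append(struct.pack("<I", 0xEB000000 | offset))
    return b"".join(words)


LZMA2 = {"id": lzma.FILTER_LZMA2, "preset": 6}


def compress_xz(data: bytes, use_arm_filter: bool) -> bytes:
    """Compressing = optional Filter_encoding followed by LZMA2 encoding."""
    chain = [{"id": lzma.FILTER_ARM}, LZMA2] if use_arm_filter else [LZMA2]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=chain)


code = synthetic_arm_calls(4000)
plain = compress_xz(code, use_arm_filter=False)
filtered = compress_xz(code, use_arm_filter=True)

# Decompressing = LZMA2 decoding followed by Filter_decoding (done by liblzma).
assert lzma.decompress(filtered) == code
print("input:", len(code), "no filter:", len(plain), "ARM filter:", len(filtered))
```

After filtering, every call to the same target carries the same absolute
address, so the compressor sees repeated words instead of a changing offset.
Real code mixes calls with other instructions, so the gain on real binaries
is much smaller than on this artificial input.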

For some ISAs (for example, for MIPS) it's impossible to get a gain from such a filter.


---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html