Huffman-Only Compression and Decompression#
The Intel® Query Processing Library (Intel® QPL) supports Huffman-Only compression and decompression.
Warning
The implementation of Huffman-only compression/decompression is in progress.
To use the hardware for non-DEFLATE usages, three advanced flags can be specified:
QPL Flag |
Operation |
Result |
---|---|---|
Compression or Decompression |
Huffman tokens are in Big Endian format |
|
Decompression |
Parse only Huffman tokens |
|
Compression |
Write no headers or EOBs |
|
Compression only |
Generate only literals |
The QPL_FLAG_GEN_LITERALS
can be used with any compression job.
If QPL_FLAG_HUFFMAN_BE
is specified, then QPL_FLAG_NO_HDRS
must be specified as well.
The only allowed combination of these are:
none
, QPL_FLAG_NO_HDRS
, QPL_FLAG_NO_HDRS
+ QPL_FLAG_HUFFMAN_BE
.
The QPL_FLAG_HUFFMAN_BE
flag tells the hardware that the Huffman tokens are in a
16-bit big-endian format.
When the QPL_FLAG_NO_HDRS
flag is used for decompress jobs, it should be used for
all jobs, although it is needed for the FIRST job and the LAST job
(i.e. the jobs with the QPL_FLAG_FIRST
and QPL_FLAG_LAST
flags set).
When a decompress job has QPL_FLAG_NO_HDRS
and QPL_FLAG_FIRST
set
, it instructs the driver to configure the hardware to expect the bit-stream to start with a
Huffman token rather than with a block header. It also requires a
decompress Huffman table to be provided, which is used to configure the
hardware appropriately.
When a decompress job has QPL_FLAG_NO_HDRS
and QPL_FLAG_LAST
set
, it instructs the driver to tell the hardware not to expect the stream to end with an EOB token.
The decompress Huffman table can be constructed by the application, or
it can be constructed by the auxiliary functions
qpl_huffman_table_init_with_triplets()
,
qpl_huffman_table_init_with_histogram()
or
qpl_huffman_table_init_with_other()
.
Refer to the Huffman Table Objects section for more information.
If QPL_FLAG_NO_HDRS
and QPL_FLAG_GEN_LITERALS
flags are used for decompression,
then the pointer to the decompression Huffman table must be non-null.
Either it must point to a reserved memory area where the table be created
in the case QPL_FLAG_DYNAMIC_HUFFMAN
, or to already created table otherwise.
When the QPL_FLAG_NO_HDRS
flag is used for compress jobs, it instructs the driver
not to write any block header or trailer (i.e. EOB tokens) to the
stream.
If QPL_FLAG_NO_HDRS
is used with QPL_FLAG_DYNAMIC_HUFFMAN
, then the entire file must be contained in the single block.
This means that both QPL_FLAG_FIRST
and QPL_FLAG_LAST
must
be specified. Also in this case the user must include a compress Huffman
table structure. This structure is to be overwritten with the generated
Huffman Table.
The QPL_FLAG_GEN_LITERALS
flag is only for compress jobs. This instructs the
hardware to generate only literal tokens and no match tokens. Currently,
the decompressor, when using the QPL_FLAG_NO_HDRS
flag, can only parse literal
tokens. So the compressor, when using QPL_FLAG_NO_HDRS
, must use the QPL_FLAG_GEN_LITERALS
flag, otherwise the result would not be decompressed with the Intel QPL.
If QPL_FLAG_GEN_LITERALS
flag is used for compression, then the pointer to the
compression Huffman table must be non-null. Either it must point to a reserved
memory area where the table be created in the case QPL_FLAG_DYNAMIC_HUFFMAN
, or to
already created table otherwise.
Since the Huffman-only compressed output does not have an EOB token, there is no guarantee that the stream ends
at a byte boundary. Therefore, the qpl_job.last_bit_offset
field in the compression job is written to indicate
the number of bits written in the last byte. Set qpl_job.ignore_end_bits
in the decompression job
according to qpl_job.last_bit_offset
in the compression job:
decompression_job_ptr->ignore_end_bits = (8 - compression_job_ptr->last_bit_offset) & 7;
However, if QPL_FLAG_HUFFMAN_BE
is specified, every 16-bit word is processed as a unit. Therefore,
the values in qpl_job.last_bit_offset
and qpl_job.ignore_end_bits
can be up to 15, instead of 7.
Set qpl_job.ignore_end_bits
in the decompression job as:
decompression_job_ptr->ignore_end_bits = (16 - compression_job_ptr->last_bit_offset) & 15;
Warning
The first generation of Intel® In-Memory Analytics Accelerator (Intel® IAA) hardware has a limitation
that does not allow ignore_end_bits to be greater than 7. Therefore, use the Software
Path if the Huffman-only decompression job has qpl_job.ignore_end_bits
set to a value greater than 7.
Big Endian 16 Format#
Normal DEFLATE streams are little-endian (LE). Tokens are written starting at bit-0 of each byte and extending from bit-7 of byte-0 to bit-0 of byte-1.
For example, if there are four 5-bit tokens, then would be written to the first 3 bytes as:
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
Bytes |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Bits |
… |
… |
5 |
5 |
5 |
5 |
5 |
4 |
4 |
4 |
4 |
4 |
3 |
3 |
3 |
3 |
3 |
2 |
2 |
2 |
2 |
2 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
Tokens |
The Huffman codes can be used either in non-bit-reversed, or bit-reversed form:
Non-bit-reversed |
Bit-reversed |
---|---|
000 |
000 |
001 |
100 |
0100 |
0010 |
0101 |
1010 |
0110 |
0110 |
0111 |
1110 |
1000 |
0001 |
1001 |
1001 |
10100 |
00101 |
The difference is:
the non-bit-reversed forms need to be parsed starting at the high-order bit,
the bit-reversed forms need to be parsed starting at the low-order bit.
Normal (LE) DEFLATE streams use the Bit-reversed form, as the tokens are parsed starting at bit-0.
In the Big-Endian-16 format, the tokens are written in each 16-bit Word, starting at the high-order bit:
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
Words |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
Bytes |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
7 |
6 |
5 |
4 |
3 |
2 |
1 |
0 |
Bits |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
1 |
1 |
1 |
2 |
2 |
2 |
2 |
2 |
3 |
3 |
3 |
3 |
3 |
4 |
4 |
4 |
4 |
4 |
5 |
5 |
5 |
5 |
5 |
… |
… |
Tokens |
Here, the non-bit-reversed form of the Huffman Tokens needs to be used.
When the data (while being read/written to the user’s buffer) is bit-reversed within every 16-bit word, after the bit reversal, it looks like:
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
Words |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
2 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
Bytes |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
Bits |
… |
… |
5 |
5 |
5 |
5 |
5 |
4 |
4 |
4 |
4 |
4 |
3 |
3 |
3 |
3 |
3 |
2 |
2 |
2 |
2 |
2 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
0 |
0 |
0 |
Tokens |
Attention
In the figure above, “Words”, “Bytes”, and “Bits” refer to the original value before the bit reversal.
After the bit-reversal, the tokens appear as if the input stream is encoded in LE format. To process BE16 data, all we need to bit-reverse each 16-bit word as we read it or write it, and otherwise pretend that it is LE data. Note that as we pretending that the data is LE, we need to use the bit-reversed form of the Huffman Codes as well.