Quick Start


Low-Level C API

We walk through an example to learn the basic workflow of the Intel® Query Processing Library (Intel® QPL) low-level C API. The central data structure of this API is qpl_job. To work with the Intel QPL low-level C API, an application needs to:

  1. Query the required memory size.

  2. Allocate memory according to the queried size.

  3. Initialize the job structure and fill in necessary parameters.

  4. Pass the job structure (along with the allocated memory) to Intel QPL.

  5. When the operations are finished, free the resources.

The example below compresses and decompresses data with Deflate dynamic Huffman encoding via the Intel QPL low-level C API. To understand the workflow, we focus only on the compression part here. See the comments after the code block.

 1/*******************************************************************************
 2 * Copyright (C) 2022 Intel Corporation
 3 *
 4 * SPDX-License-Identifier: MIT
 5 ******************************************************************************/
 6
 7//* [QPL_LOW_LEVEL_COMPRESSION_EXAMPLE] */
 8
 9#include <iostream>
10#include <vector>
11#include <memory>
12
13#include "qpl/qpl.h"
14constexpr const uint32_t source_size = 1000;
15
16auto main(int argc, char** argv) -> int {
17    std::cout << "Intel(R) Query Processing Library version is " << qpl_get_library_version() << ".\n";
18
19    // Default to Software Path
20    qpl_path_t execution_path = qpl_path_software;
21
22    // Source and output containers
23    std::vector<uint8_t> source(source_size, 5);
24    std::vector<uint8_t> destination(source_size / 2, 4);
25
26    std::unique_ptr<uint8_t[]> job_buffer;
27    uint32_t                   size = 0;
28
29    // Job initialization
30    qpl_status status = qpl_get_job_size(execution_path, &size);
31    if (status != QPL_STS_OK) {
32        std::cout << "An error " << status << " occurred while getting the job size.\n";
33        return 1;
34    }
35
36    job_buffer = std::make_unique<uint8_t[]>(size);
37    qpl_job *job = reinterpret_cast<qpl_job *>(job_buffer.get());
38
39    status = qpl_init_job(execution_path, job);
40    if (status != QPL_STS_OK) {
41        std::cout << "An error " << status << " occurred during job initialization.\n";
42        return 1;
43    }
44
45    // Performing a compression operation
46    job->op            = qpl_op_compress;
47    job->level         = qpl_default_level;
48    job->next_in_ptr   = source.data();
49    job->next_out_ptr  = destination.data();
50    job->available_in  = source_size;
51    job->available_out = static_cast<uint32_t>(destination.size());
52    job->flags         = QPL_FLAG_FIRST | QPL_FLAG_LAST | QPL_FLAG_DYNAMIC_HUFFMAN | QPL_FLAG_OMIT_VERIFY;
53
54    // Compression
55    status = qpl_execute_job(job);
56    if (status != QPL_STS_OK) {
57        std::cout << "An error " << status << " occurred during compression.\n";
58        return 1;
59    }
60
61    const uint32_t compressed_size = job->total_out;
62    status = qpl_fini_job(job);
63    if (status != QPL_STS_OK) {
64        std::cout << "An error " << status << " occurred during job finalization.\n";
65        return 1;
66    }
67
68    return 0;
69}
70
71//* [QPL_LOW_LEVEL_COMPRESSION_EXAMPLE] */

The application only needs to include one header file, qpl/qpl.h, which declares the prototypes of all the functions.

At line 30, we call qpl_get_job_size() to query the required memory size based on the specified execution path.

At lines 36-37, we allocate memory according to the returned value of size. Note that the value of size is greater than the size of the job structure qpl_job: the leading portion of the allocated memory stores the job structure, while the remaining portion is a buffer for internal usage.

At line 39, we call qpl_init_job() to initialize the job structure and buffer; we then fill in the necessary parameters at lines 46 to 52.

The job structure and the allocated buffer are passed to Intel QPL at line 55. After qpl_execute_job() completes successfully, we can retrieve the results stored in the job structure.

Finally, we call qpl_fini_job() at line 62 to free the resources.
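The decompression half of the example, omitted above, reuses the same job buffer and runs before qpl_fini_job() frees the resources. A sketch of that part, continuing from the variables in the example above (it assumes an output container such as std::vector<uint8_t> reference(source_size, 7); and is not compilable on its own without an Intel QPL installation):

```cpp
// Reinitialize the same job buffer for the decompression operation.
status = qpl_init_job(execution_path, job);
if (status != QPL_STS_OK) {
    std::cout << "An error " << status << " occurred during job initialization.\n";
    return 1;
}

// `destination` now holds `compressed_size` bytes of Deflate data;
// `reference` receives the decompressed output.
job->op            = qpl_op_decompress;
job->next_in_ptr   = destination.data();
job->next_out_ptr  = reference.data();
job->available_in  = compressed_size;
job->available_out = static_cast<uint32_t>(reference.size());
job->flags         = QPL_FLAG_FIRST | QPL_FLAG_LAST;

status = qpl_execute_job(job);
if (status != QPL_STS_OK) {
    std::cout << "An error " << status << " occurred during decompression.\n";
    return 1;
}
```

After this step, the first job->total_out bytes of reference should match the original source data.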

To build the library and all the examples, including the one above, follow the steps in Building the Library. The compiled examples will then be located in <qpl_library>/build/examples/low-level-api/.

Alternatively, to build compression_example.cpp individually against an existing Intel QPL installation, use:

g++ -I/<install_dir>/include -o compression_example compression_example.cpp /<install_dir>/lib64/libqpl.a -ldl

Attention

Intel QPL can also be used from C applications. This still requires the C++ runtime library to be installed on the system. You also need to add -lstdc++ if you are using the static library libqpl.a.

On Linux, if you installed Intel QPL system-wide, you can use the dynamic library to compile the examples with:

g++ -I/<install_dir>/include -o compression_example compression_example.cpp -lqpl

To build an example using pkg-config for the dynamic library, set PKG_CONFIG_PATH and compile the example using qpl.pc:

g++ `pkg-config --cflags --libs qpl` -o compression_example compression_example.cpp

To run the example on the Hardware Path (see Execution Paths), use:

./compression_example hardware_path

Attention

Either sudo privileges or elevated permissions are required to initialize an Intel QPL job with qpl_path_hardware.

Refer to the Accelerator Configuration section for more details about getting permissions.

Attention

With the Hardware Path, the user must either place the libaccel-config library in /usr/lib64/ or specify the location of libaccel-config in LD_LIBRARY_PATH for the dynamic loader to find it.

Attention

In the example above, we do not set the qpl_job.numa_id value, so the library will auto-detect the NUMA node of the calling process and use Intel® In-Memory Analytics Accelerator (Intel® IAA) device(s) located on the same node.

Alternatively, the user can set qpl_job.numa_id and a matching numactl policy to ensure that the calling process runs on the NUMA node specified by numa_id.

It is the user's responsibility to configure the accelerator and to ensure device availability on that NUMA node.

Refer to NUMA Support section for more details.
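For example, if qpl_job.numa_id is set to node 0, one way to pin the process to the matching node is a numactl invocation like the following (the node index is illustrative; pick the node where your Intel IAA device(s) are configured):

```shell
# Bind both CPU and memory of the process to NUMA node 0,
# then run the example on the Hardware Path.
numactl --cpunodebind=0 --membind=0 ./compression_example hardware_path
```
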

To run the example on the Software Path (see Execution Paths), use:

./compression_example software_path

To run the example on the Auto Path (see Execution Paths), use:

./compression_example auto_path

Attention

If Intel QPL is built with -DDYNAMIC_LOADING_LIBACCEL_CONFIG=OFF (see Available Build Options for details), replace -ldl with -laccel-config in the compilation command. The user must either place libaccel-config in /usr/lib64/ or specify the location of libaccel-config (for example, using LD_LIBRARY_PATH and LIBRARY_PATH).

Refer to Developer Guide for more information about Intel QPL low-level C API. For more examples, see Low-Level C API Examples.