Acceleration Driver
Intel® QAT can accelerate the following services:
Symmetric cryptography
Public key cryptography
Data compression/decompression
The Intel® QAT Endpoints are exposed as PCI devices. Applications running in user space typically access these services via the Intel® QAT APIs. Applications that run in the Linux* kernel can also access some services via the Linux* Kernel Cryptographic Framework (LKCF) API.
Controlling the Driver
Two methods are provided to manage the acceleration driver:
qat_service: Script to manage the Intel® QAT Endpoints.
adf_ctl: Utility for loading configuration files and sending events to the driver.
qat_service
The qat_service script is installed with the software package in the /etc/init.d/ directory. The script allows a user to start, stop, or query the status (up or down) of a single Intel® QAT Endpoint or all Intel® QAT Endpoints in the system.
qat_service Usage
To view all Intel® QAT Endpoints in the system, use:
service qat_service status
If, for example, there are two Intel® QAT Endpoints in the system, the output will be similar to the following:
qat_dev0 - type: c6xx, inst_id: 0, bsf: 06:00:0, #accel: 5 #engines: 10 state: up
qat_dev1 - type: c6xx, inst_id: 1, bsf: 83:00:0, #accel: 5 #engines: 10 state: up
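For scripted health checks, the status output can be turned into structured records. The following is a minimal Python sketch, assuming the output keeps the field layout shown in the sample above; the function name `parse_qat_status` and the regular expression are illustrative and not part of the Intel® QAT package.

```python
import re

# Pattern inferred from the sample status output; other driver versions
# may print additional or differently ordered fields.
STATUS_LINE = re.compile(
    r"(?P<name>qat_dev\d+) - type: (?P<type>\S+?), inst_id: (?P<inst_id>\d+), "
    r"bsf: (?P<bsf>\S+?), #accel: (?P<accel>\d+)\s+#engines: (?P<engines>\d+)\s+"
    r"state: (?P<state>\w+)"
)

def parse_qat_status(text):
    """Return one dict per Endpoint found in `service qat_service status` output."""
    devices = []
    for match in STATUS_LINE.finditer(text):
        device = match.groupdict()
        # Convert the numeric fields so callers can compare them as integers.
        for key in ("inst_id", "accel", "engines"):
            device[key] = int(device[key])
        devices.append(device)
    return devices

sample = (
    "qat_dev0 - type: c6xx, inst_id: 0, bsf: 06:00:0, #accel: 5 #engines: 10 state: up\n"
    "qat_dev1 - type: c6xx, inst_id: 1, bsf: 83:00:0, #accel: 5 #engines: 10 state: up\n"
)

for dev in parse_qat_status(sample):
    print(dev["name"], dev["bsf"], dev["state"])
```

On a real system the text would come from running the status command rather than from the hard-coded `sample` string.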
Other options are also available:
service qat_service start|stop|status|restart|shutdown
For a system with multiple Intel® QAT Endpoints, you can start, stop, or restart an individual device by passing its name, qat_dev<N>, as a parameter, for example:
service qat_service stop qat_dev0
service qat_service stop qat_dev1
The shutdown qualifier enables the user to bring down all Intel® QAT Endpoints and unload driver modules from the kernel. This contrasts with the stop qualifier, which brings down one or more Intel® QAT Endpoints, but does not unload kernel modules, so other Intel® QAT Endpoints can still run.
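The difference between stop and shutdown can be captured in a small wrapper that builds the command line. This is a hedged Python sketch: the helper `qat_service_cmd` is hypothetical, and rejecting a device name for shutdown is a design choice of this example (based on the statement that shutdown acts on all Endpoints), not documented behavior of the script.

```python
VALID_ACTIONS = ("start", "stop", "status", "restart", "shutdown")

def qat_service_cmd(action, device=None):
    """Build the argument list for a `service qat_service ...` invocation."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Assumption of this sketch: shutdown always targets every Endpoint
    # and unloads the kernel modules, so a device name makes no sense.
    if action == "shutdown" and device is not None:
        raise ValueError("shutdown applies to all Endpoints; omit the device")
    cmd = ["service", "qat_service", action]
    if device is not None:
        cmd.append(device)
    return cmd

print(" ".join(qat_service_cmd("stop", "qat_dev0")))
print(" ".join(qat_service_cmd("shutdown")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a host where the script is installed.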
adf_ctl
The adf_ctl user space utility is separate from the driver and provides a mechanism for:
Loading configuration file data to the kernel driver. The kernel space driver uses the data and also provides the data to the user space driver.
Sending events to the driver to bring devices up and down.
The adf_ctl utility provided with the Intel® QAT 2.0 driver can also be used to interface with Intel® QAT 1.6 and 1.7 devices.
adf_ctl Usage
To bring up, down, restart or reset device(s):
adf_ctl [-c|--config] [qat_dev] [up|down|restart|reset]
To print device(s) status:
adf_ctl [qat_dev] status
To use the specified configuration file:
-c (--config) [config/file/path]
Note
If no device (physical or virtual) is selected, this file is used against all existing devices.
Examples
To bring device 0 down:
adf_ctl qat_dev0 down
To load the device configuration from the default path (e.g., /etc/4xxx_dev1.conf), then bring device 1 up:
adf_ctl qat_dev1 up
To load the device configuration from the specified path /etc/user_4xxx_dev1.conf and bring device 1 up:
adf_ctl -c /etc/user_4xxx_dev1.conf qat_dev1 up
To restart all devices with default configuration files:
adf_ctl restart
To restart all devices with the specified configuration file /etc/user_4xxx_dev1.conf:
adf_ctl -c /etc/user_4xxx_dev1.conf restart
To restart device 0 with the specified configuration file ~/user_c4xxx_dev1.conf:
adf_ctl -c ~/user_c4xxx_dev1.conf qat_dev0 restart
To reset device 0:
adf_ctl qat_dev0 reset
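When adf_ctl is driven from automation, it helps to assemble the arguments in the order given by the usage synopsis. The sketch below is a Python illustration under stated assumptions: `adf_ctl_args` is a hypothetical helper, the `qat_dev<N>` name check is inferred from the examples, and refusing a configuration file together with status simply mirrors the synopsis, which shows no -c option for status.

```python
import re

ACTIONS = ("up", "down", "restart", "reset", "status")

def adf_ctl_args(action, device=None, config=None):
    """Build argv for: adf_ctl [-c config] [qat_dev] action."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if device is not None and not re.fullmatch(r"qat_dev\d+", device):
        raise ValueError(f"device must look like qat_dev<N>: {device!r}")
    if action == "status" and config is not None:
        # Assumption: the status synopsis takes no configuration file.
        raise ValueError("status does not take a configuration file")
    args = ["adf_ctl"]
    if config is not None:
        args += ["-c", config]
    if device is not None:
        args.append(device)   # omitted device means "all devices"
    args.append(action)
    return args

print(" ".join(adf_ctl_args("down", device="qat_dev0")))
print(" ".join(adf_ctl_args("up", device="qat_dev1", config="/etc/user_4xxx_dev1.conf")))
print(" ".join(adf_ctl_args("restart")))
```

Building a list rather than a single string avoids shell quoting problems if a configuration path contains spaces.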
Application Payload Memory Allocation
When performing offload operations through the Intel® QAT API, the payload data must be placed in a buffer that is resident, physically contiguous, and DMA-accessible from the acceleration hardware. It is the application’s responsibility to provide buffers that meet these constraints.
Buffers are passed to the API with virtual addresses. The API translates these addresses to the address information required by the hardware.
Services
| Service | API | Reference |
|---|---|---|
| Cryptographic service | | See the Intel® QuickAssist Technology Cryptographic API Reference Manual (refer to Table 2) for details. |
| Data Compression service | | See the Intel® QuickAssist Technology Data Compression API Reference Manual (refer to Table 2) for details. |
When the software requires the physical address, it calls the registered address translation function.
Note
This address translation function is called at least once per request. Consequently, for optimal performance, the implementation of this function should be optimized.
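Because the translation function runs on every request, a lookup that avoids scanning all allocations is worthwhile. The following Python sketch is a conceptual model only, not the Intel® QAT API: the class `AddressTranslator`, its methods, and all addresses are made up for illustration. It keeps the pinned regions sorted by virtual base address so each translation is a binary search.

```python
import bisect

class AddressTranslator:
    """Toy virtual-to-physical lookup over pinned, physically contiguous regions."""

    def __init__(self):
        self._virt_bases = []   # sorted virtual base addresses
        self._regions = []      # parallel list of (virt_base, phys_base, size)

    def register(self, virt_base, phys_base, size):
        # Keep both lists sorted by virtual base (regions assumed not to overlap).
        i = bisect.bisect_left(self._virt_bases, virt_base)
        self._virt_bases.insert(i, virt_base)
        self._regions.insert(i, (virt_base, phys_base, size))

    def virt_to_phys(self, virt_addr):
        # Find the last region whose base is <= virt_addr: O(log n) per request.
        i = bisect.bisect_right(self._virt_bases, virt_addr) - 1
        if i >= 0:
            virt_base, phys_base, size = self._regions[i]
            if virt_addr < virt_base + size:
                return phys_base + (virt_addr - virt_base)
        return None  # address is not inside any registered region

translator = AddressTranslator()
translator.register(0x7F0000000000, 0x40000000, 2 * 1024 * 1024)
print(hex(translator.virt_to_phys(0x7F0000001000)))
```

Within one physically contiguous region the translation reduces to adding a fixed offset, which is why contiguous slabs keep the per-request cost low.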
If using the Intel® QAT Data Plane API, buffers are passed to the Intel® QAT API as physical addresses. The library passes these addresses directly to the hardware, without the need for translation.
Thread Specific USDM
By default, memory allocation uses the USDM slab allocator, which provides 2 MB of contiguous memory. The allocator takes a lock in the library to prevent race conditions when obtaining memory from the slab.
This lock affects some multi-threaded applications and use cases, such as HAProxy, causing a drop in performance.
To mitigate this issue, thread specific USDM allocates and manages memory on a per-thread basis. For multi-threaded applications, the allocated memory information is maintained separately for each thread.
To enable this feature, pass the following flag to configure:
--enable-icp-thread-specific-usdm
In some use cases with thread specific USDM, using a 128K slab allocator instead of the default 2MB allocator could improve performance and reduce memory consumption for a large number of threads. To enable it, pass the following flag to configure:
--enable-128k-slab
Note
There is a limitation with thread specific USDM: memory allocated in one thread should be freed only by the thread which allocates it.
Incorrect cleanup can lead to a segmentation fault (segfault).
Also, memory allocated in a thread is freed automatically when the thread exits/terminates, even if the user does not explicitly free the memory.
See the ./configure flags section of the Getting Started Guide for more information on these flags.
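The idea behind thread specific USDM, and its same-thread free rule, can be modeled in a few lines. This Python sketch is purely conceptual: `ThreadLocalPool` is a hypothetical class, handles stand in for real buffers, and where the real library may crash on a cross-thread free, the model raises an exception instead.

```python
import threading

class ThreadLocalPool:
    """Toy per-thread allocator: bookkeeping lives in thread-local state, so no lock."""

    def __init__(self, slab_size=2 * 1024 * 1024):
        self._slab_size = slab_size
        self._local = threading.local()

    def _state(self):
        # Lazily create this thread's private bookkeeping. Thread-local data
        # is discarded when the thread ends, loosely mirroring automatic cleanup.
        if not hasattr(self._local, "owned"):
            self._local.owned = {}
            self._local.next_id = 0
        return self._local

    def alloc(self, size):
        if size > self._slab_size:
            raise ValueError("request exceeds slab size")
        state = self._state()
        handle = (threading.get_ident(), state.next_id)
        state.next_id += 1
        state.owned[handle] = size
        return handle

    def free(self, handle):
        state = self._state()
        if handle not in state.owned:
            raise RuntimeError("buffer must be freed by the thread that allocated it")
        del state.owned[handle]

pool = ThreadLocalPool()
handle = pool.alloc(4096)
errors = []

def worker():
    try:
        pool.free(handle)          # wrong thread: rejected by the model
    except RuntimeError as exc:
        errors.append(str(exc))

t = threading.Thread(target=worker)
t.start()
t.join()
pool.free(handle)                  # owning thread: succeeds
print(errors)
```

Because each thread touches only its own bookkeeping, the allocation path needs no shared lock, which is the contention the feature is meant to remove.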
Important
We have observed poor multithreaded performance with QAT_Engine using OpenSSL* at higher thread counts.
Unfortunately, these issues appear to stem from the way OpenSSL* implements its engine_table_select and locks. For details, see the following two issues on the OpenSSL* GitHub pages:
OpenSSL* 1.1.1.x: Performance bottleneck with locks in engine_table_select() function #18509, https://github.com/openssl/openssl/issues/18509
OpenSSL* 3.0: 3.0 performance degraded due to locking #20286, https://github.com/openssl/openssl/issues/20286
Return Codes
This table shows the return codes used by various components of the acceleration driver, defined in $ICP_ROOT/quickassist/include/cpa.h.
| Return Type | Return Code | Description |
|---|---|---|
| | 0 | Requested operation was successful. |
| | -1 | General or unspecified error occurred. Refer to the console log user space application or to |
| | -2 | Recoverable error occurred. Refer to relevant sections of the API for specifics on the suggested course of action. |
| | -3 | Required resource is unavailable. Refer to relevant sections of the API for specifics on the suggested course of action. |
| | -4 | Invalid parameter has been passed in. |
| | -5 | Fatal error has occurred. The recommended course of action is to shut down and restart the component. |
| | -6 | The function is not supported, at least not with the specific parameters supplied. This may be because a particular capability is not supported by the current implementation. |
| | -7 | The API implementation is restarting. This may be reported if, for example, a hardware implementation is undergoing a reset. |
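The numeric codes above lend themselves to a lookup table plus a simple retry policy. The Python sketch below is illustrative only: the dictionary text is abbreviated from the table, `call_with_retry` and its parameters are hypothetical, and resubmitting on -2 is an assumed policy for this example; the API documentation for each function defines the actual recommended action.

```python
import time

CPA_RETURN_CODES = {
    0: "Requested operation was successful.",
    -1: "General or unspecified error occurred.",
    -2: "Recoverable error occurred.",
    -3: "Required resource is unavailable.",
    -4: "Invalid parameter has been passed in.",
    -5: "Fatal error has occurred.",
    -6: "The function is not supported.",
    -7: "The API implementation is restarting.",
}
RETRY = -2  # assumed here to mean "try the same request again"

def call_with_retry(operation, max_attempts=5, delay_s=0.01):
    """Call `operation()` until it returns something other than RETRY."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = RETRY
    for attempt in range(max_attempts):
        status = operation()
        if status != RETRY:
            break
        if attempt + 1 < max_attempts:
            time.sleep(delay_s)   # brief pause before resubmitting
    return status

# Stand-in for a real submission: recoverable twice, then success.
results = iter([-2, -2, 0])
status = call_with_retry(lambda: next(results))
print(status, CPA_RETURN_CODES.get(status, "Unknown return code"))
```

Any code other than -2 is returned to the caller unchanged, so fatal (-5) and restarting (-7) conditions can be handled at a higher level.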
Linux* Device Driver Operations Return Codes
This table shows the return codes used by the driver to handle Linux* device driver operations.
| Return Type | Return Code | Description |
|---|---|---|
| | 0 | Operation was successful. |
| | 1 | General error occurred. Refer to the console log user space application or to |
| | -1 | Operation is not permitted. Used during ioctl operations. |
| | -2 | No such file or directory. |
| | -4 | Interrupted system call. |
| | -5 | Input/Output error occurred. Used when copying configuration data to and from user space. |
| | -9 | Bad File Number. Used when an invalid file descriptor is detected. |
| | -11 | Try Again. Used when a recoverable operation occurred. |
| | -12 | Out of Memory. Memory resource that has been requested is not available. |
| | -13 | Permission Denied. Used when the operation failed to connect to a process or open a device. |
| | -14 | Bad Address. Used when an operation detects invalid parameter data. |
| | -16 | Device or resource is busy. |
| | -17 | File exists. |
| | -19 | No Such Device. Used when an operation detects invalid device id. |
| | -22 | Invalid argument. |
| | -25 | Invalid Command Type. Used when an ioctl operation detects an invalid command type. |
| | -28 | No space left on device. |
| | -34 | Math result not representable. |
| | -38 | Function not implemented. |
| | -46 | Level 3 Halted. |
| | -62 | Timer expired. |
| | -74 | Not a data message. |
| | -75 | Value too large for defined data type. |
| | -95 | Operation not supported on transport endpoint. |
| | -115 | Operation now in progress. |
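The negative values in this table appear to be negated standard Linux errno numbers, so they can be decoded with the Python standard library. The sketch below assumes that correspondence; `describe_driver_errno` is a hypothetical helper, and the message text comes from the C library, so its wording may differ from the descriptions in the table.

```python
import errno
import os

def describe_driver_errno(code):
    """Map a driver return code to a symbolic errno name and message."""
    if code == 0:
        return "success"
    if code > 0:
        # The table lists 1 as a general error rather than an errno value.
        return f"general error ({code})"
    number = -code
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"

for code in (0, 1, -2, -12, -22):
    print(code, "->", describe_driver_errno(code))
```

Symbolic names such as EINVAL are often easier to search for in kernel logs than the raw numbers.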