Implementation Considerations

When integrating Intel® QAT into your application stack, consider the following:

  • Platform Compatibility: Ensure that your hardware platform supports Intel® QAT.

  • Software Integration: Follow the performance optimization guide provided by Intel to seamlessly integrate Intel® QAT software with your application.

  • Workload Profiling: Identify the workloads that benefit most from hardware acceleration. Not every task gains from Intel® QAT, so profile first and offload only the compression and crypto hotspots.

Platform Compatibility

To check that your hardware platform supports Intel® QAT, run the following command:

lspci -nn | grep -E '8086:37c8|8086:19e2|8086:0435|8086:6f54|8086:4940|8086:4942|8086:4944|8086:4946'

The output from a high-end 4th Gen Intel® Xeon® Scalable Processor is similar to the following:

6b:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
70:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
75:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
7a:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
e8:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
ed:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
f2:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
f7:00.0 Co-processor [0b40]: Intel Corporation Device [8086:4940] (rev 40)
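
For use in a provisioning or health-check script, the same check can be wrapped in a small helper. The following is a minimal sketch: the function name count_qat_devices is hypothetical, and the ID list is copied from the filter above. It reads lspci -nn output on standard input and prints the number of matching devices.

```shell
#!/bin/sh
# PCI vendor:device IDs copied from the lspci filter shown above.
QAT_IDS='8086:37c8|8086:19e2|8086:0435|8086:6f54|8086:4940|8086:4942|8086:4944|8086:4946'

# count_qat_devices (hypothetical helper): reads `lspci -nn` output on stdin
# and prints how many lines carry one of the IDs in QAT_IDS.
count_qat_devices() {
    # Match the ID only inside the [vendor:device] brackets.
    # grep -c exits 1 when the count is 0, so `|| true` keeps the helper's
    # own exit status at 0 in that case.
    grep -E -c "\[($QAT_IDS)\]" || true
}
```

A typical call would be `lspci -nn | count_qat_devices`. A result of 0 means no device matched the ID list.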

Software Integration

Intel® QAT can be accessed through various software frameworks, enabling acceleration for specific workloads. Let’s explore some of these frameworks:

Compression Applications

Here is the software stack for compression-based applications.

../../_images/compressionstack.png

From this stack we see that applications can utilize QAT by:

Coding directly to the Intel QuickAssist Compression APIs

Application developers can directly access QAT features through the Intel® QuickAssist API. This API provides an easy interface between customer applications and the QuickAssist acceleration driver. It allows seamless integration of Intel® QAT capabilities into custom software.

Useful links:

Code to QATzip APIs

Application developers can access compression capabilities directly by programming to the QATzip APIs. These APIs provide an easy interface between customer applications and the QuickAssist acceleration driver, allowing seamless integration of Intel® QAT capabilities into custom software. The QATzip APIs also provide software fallback, which allows your application to run on systems both with and without Intel® QAT.

Useful links:

Use Data Compression Applications that have been updated for Intel® QAT

A number of applications/libraries have already been updated to take advantage of Intel® QAT. These include:

Crypto Applications

Here is the software stack for encryption-based applications.

../../_images/cryptostack.png

From this stack we see that applications can utilize Intel® QAT by:

Coding directly to the Intel QuickAssist Cryptographic APIs

Application developers can directly access Intel® QAT features through the Intel® QuickAssist API. This API provides an easy interface between customer applications and the QuickAssist acceleration driver. It allows seamless integration of Intel® QAT capabilities into custom software.

Useful links:

OpenSSL Applications

Many applications use OpenSSL for their crypto needs. Examples include NGINX, HAProxy, and ssh. QAT_Engine was designed to fit into OpenSSL’s modular framework. It allows applications to offload their crypto operations to Intel® QAT hardware as well as to optimized software libraries that take advantage of CPU instructions. For performance reasons, ensure that the application can interact with OpenSSL in asynchronous mode.
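
Before pointing an application at the engine, it can help to confirm that OpenSSL is able to load it. The sketch below assumes the engine is registered under the id qatengine, as described in the QAT_Engine project documentation; adjust QAT_ENGINE_ID if your build differs. Both helper names are hypothetical.

```shell
#!/bin/sh
# Assumption: QAT_Engine registers itself with OpenSSL under the id "qatengine".
QAT_ENGINE_ID="${QAT_ENGINE_ID:-qatengine}"

# engine_reports_available (hypothetical helper): reads the output of
# `openssl engine -t <id>` on stdin and succeeds only if OpenSSL reported
# the engine as "[ available ]".
engine_reports_available() {
    grep -q '\[ available \]'
}

# qat_engine_ready (hypothetical helper): returns 0 when openssl is installed
# and can load and initialise the engine.
qat_engine_ready() {
    command -v openssl >/dev/null 2>&1 || return 1
    openssl engine -t "$QAT_ENGINE_ID" 2>/dev/null | engine_reports_available
}
```

If `qat_engine_ready` succeeds, a quick asynchronous benchmark such as `openssl speed -engine qatengine -elapsed -async_jobs 8 rsa2048` gives a first impression of the offload benefit. The job count of 8 is an arbitrary starting point, not a tuned value.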

Useful links:

Use Crypto Applications that have been updated for Intel® QAT

A number of applications/libraries have already been updated to take advantage of Intel® QAT. These include:

Workload Profiling

To determine whether Intel® QAT can bring value to your application, first identify the hotspots in the application. Application profiling can be used for this purpose.

perf top

perf top is a powerful tool for real-time system profiling, allowing you to analyze CPU usage at the function level. Unlike traditional tools like top, which focus on processes or threads, perf top provides insights into how much CPU time specific functions consume.

Let’s dive into how to use it effectively.

Prerequisites

Before using perf top, ensure you have the following:

  • Installed perf

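To install perf on Debian-based distros (the package names shown are for Ubuntu, where the perf binary ships in the kernel-specific tools package):

sudo apt update
sudo apt install linux-tools-common linux-tools-$(uname -r)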

To install perf on RPM-based distros:

sudo dnf update
sudo dnf install perf

Running perf top

  1. Open a terminal with root access.

  2. Start the perf top monitoring interface:

perf top

  3. The monitoring interface will display information similar to the following:

../../_images/perftop.png
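
On a busy system it is often more useful to profile a single application than the whole machine. The steps above can be narrowed to one process with a sketch like the following. The helper name perf_top_for is hypothetical; it relies on pgrep to find the process and on the -p and --sort options of perf top.

```shell
#!/bin/sh
# perf_top_for (hypothetical helper): attach perf top to the oldest running
# process whose name exactly matches the argument.
perf_top_for() {
    name="$1"
    if [ -z "$name" ]; then
        echo "usage: perf_top_for <process-name>" >&2
        return 2
    fi

    # -o picks the oldest matching process, -x requires an exact name match.
    pid=$(pgrep -o -x "$name" 2>/dev/null) || pid=""
    if [ -z "$pid" ]; then
        echo "no running process named $name" >&2
        return 1
    fi

    # -p limits sampling to that process; --sort groups the rows by
    # shared object first, then by symbol.
    perf top -p "$pid" --sort dso,symbol
}
```

For example, `perf_top_for nginx` run from the root terminal opened in step 1 shows only the functions executing inside the oldest NGINX process.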

Interpreting the Output

The perf top interface provides several columns:

  • Overhead: Displays the percentage of CPU time used by each function.

  • Shared Object: Shows the program or library associated with the function.

  • Symbol: Displays the function name or symbol.

In the example above, there is significant time spent in the gzip library. Given the benefit that Intel® QAT brings to compression, this application is a prime candidate to benefit from Intel® QAT.
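
The same judgement can be made from a script by summing the overhead attributed to compression and crypto libraries. Because perf top is interactive, this sketch assumes the profile was saved with perf record and printed with perf report --stdio --sort dso, and that each data line starts with a percentage followed by the shared object name. The helper name and the default library pattern are hypothetical and should be adapted to your application.

```shell
#!/bin/sh
# offload_candidates (hypothetical helper): reads `perf report --stdio --sort dso`
# output on stdin, prints the rows whose shared object matches the pattern,
# and ends with the summed overhead of those rows.
# $1: extended regex of library names (default is an illustrative guess).
offload_candidates() {
    awk -v pat="${1:-libz|libcrypto|libssl}" '
        /^#/ || NF < 2 { next }            # skip comment headers and blank lines
        $1 ~ /%$/ && $2 ~ pat {
            pct = $1
            sub(/%$/, "", pct)             # strip the trailing percent sign
            total += pct
            printf "%s %s\n", $1, $2
        }
        END { printf "total %.2f%%\n", total }
    '
}
```

A possible invocation is `perf report --stdio --sort dso | offload_candidates`. A large total suggests that much of the CPU time is going to work that Intel® QAT is designed to accelerate.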

Flamegraphs

Flame graphs are powerful visualizations for analyzing application performance, particularly when profiling CPU usage. They allow you to identify hotspots and bottlenecks in your code. These graphs are generated from stack samples captured with perf, typically with perf record, because perf top displays its samples live and does not save them to a file.
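
One common way to produce a flame graph is sketched below. It assumes that Brendan Gregg's FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl) have been cloned into the directory named by FLAMEGRAPH_DIR; that location and the helper name make_flamegraph are assumptions, not part of Intel® QAT or perf.

```shell
#!/bin/sh
# Assumption: the FlameGraph scripts live in $FLAMEGRAPH_DIR.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-$HOME/FlameGraph}"

# make_flamegraph (hypothetical helper)
# $1: sampling duration in seconds (default 30)
# $2: output SVG file (default flame.svg)
make_flamegraph() {
    duration="${1:-30}"
    out="${2:-flame.svg}"

    # Check the prerequisites before starting a recording.
    for tool in stackcollapse-perf.pl flamegraph.pl; do
        if [ ! -x "$FLAMEGRAPH_DIR/$tool" ]; then
            echo "missing $FLAMEGRAPH_DIR/$tool" >&2
            return 1
        fi
    done
    command -v perf >/dev/null 2>&1 || { echo "perf not found" >&2; return 1; }

    # Sample all CPUs at 99 Hz, with call graphs, for $duration seconds.
    perf record -F 99 -a -g -o perf.data -- sleep "$duration" || return 1

    # Fold the recorded stacks and render them as an SVG.
    perf script -i perf.data \
        | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
        | "$FLAMEGRAPH_DIR/flamegraph.pl" > "$out"
}
```

Running `make_flamegraph 30 app.svg` as root while the workload is active writes app.svg, which can be opened in a browser. Wide frames from compression or crypto libraries point to the same offload candidates that perf top highlights.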