Rate Limiting
Note
The instructions in this section apply to the Out-of-Tree QAT package. For details on Rate Limiting with In-tree solution, refer to sysfs-driver-qat_rl documentation.
Rate Limiting is implemented by monitoring the utilization of the device on a per-VF, per-service basis and comparing that to the SLA allocated to that VF and service.
Resources are shared across guests and the resource utilization of each guest is measured relative to the capacity of the physical function.
The feature is supported for SYM, ASYM, and DC services.
To enable the Rate Limiting feature:
Install the driver package on the host with Single-Root Input/Output Virtualization (SR-IOV) enabled.
Set ServicesEnabled to asym or sym or dc.
Perform qat_service shutdown and qat_service start.
Service Level Agreement (SLA)
Service Level Agreement enforcement allocates a specified amount of capacity for a specified service to a specified VF. The maximum number of SLAs that can be enforced is:
max SLA enforced = (number of VFs) × (number of services)
where:
Number of VFs varies based on device type
Number of services = 2 (asymmetric or symmetric or compression)
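The upper bound above is a simple product, which the following minimal sketch works through. The function name max_slas and the VF count of 16 are hypothetical values chosen for illustration; the real VF count depends on the device type.

```python
def max_slas(num_vfs: int, num_services: int = 2) -> int:
    """Upper bound on enforced SLAs: one per (VF, service) pair.

    num_services defaults to 2, the value stated in this section.
    """
    if num_vfs < 0 or num_services < 0:
        raise ValueError("counts must be non-negative")
    return num_vfs * num_services


# Hypothetical device exposing 16 VFs (assumption, not a documented figure).
print(max_slas(16))
```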
SLA Units
SLA units are measured as follows:
Symmetric Crypto - 1Mbps of throughput.
Asymmetric Crypto - 1 unit is equal to 0.1 percent of available utilization.
Compression - 1Mbps of throughput.
Note
In Gen4 devices, for Asymmetric Crypto services, it is more accurate to use metrics such as slice utilization and PCI bandwidth instead of Operations/second for SLA measurements.
Slice utilization and PCI bandwidth are fairer metrics because they show exactly how much hardware resource each user consumes, regardless of how fast a particular algorithm is processed.
Gen4 devices use a Hardware-assisted Rate limiting approach whereas legacy devices use a firmware-only Rate limiting approach.
For asymmetric service, SLAs shall be allocated at a granularity of 1 unit of device utilization percentage for RSA2K.
Below is a sample mapping table for the 5th Gen Intel® Xeon® Scalable Processor - MCC SKU platform that translates the SLA units to equivalent ops/sec.
Users can run tests with the required algorithm to determine the mapping for other SKUs with different performance. Gen4 asymmetric performance numbers can differ based on the SKU.
Unit | RSA2K decrypt with CRT (Ops/sec) | RSA4K decrypt with CRT (Ops/sec) |
---|---|---|
1 | 60 | – |
5 | 300 | – |
10 | 600 | 60 |
50 | 3000 | 300 |
100 | 6000 | 600 |
300 | 18000 | 1800 |
500 | 30000 | 3000 |
750 | 45000 | 4500 |
1000 | 60000 | 6000 |
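Every row of the sample table fits a constant rate of 60 ops/sec per unit for RSA2K and 6 ops/sec per unit for RSA4K. The sketch below encodes that observation so intermediate unit values can be estimated. It assumes the relationship stays linear between the listed rows, and it applies only to the SKU in the table; other SKUs need their own measurements. The helper name units_to_ops is hypothetical.

```python
from typing import Optional

# Per-unit rates read off the sample table (one specific SKU only).
OPS_PER_UNIT = {"rsa2k": 60, "rsa4k": 6}
# The table lists no RSA4K figure below 10 units, so none is estimated here.
MIN_UNITS = {"rsa2k": 1, "rsa4k": 10}
MAX_UNITS = 1000  # largest unit value listed in the table


def units_to_ops(units: int, algo: str = "rsa2k") -> Optional[int]:
    """Estimate ops/sec for an SLA unit value, assuming linear scaling."""
    if algo not in OPS_PER_UNIT:
        raise ValueError(f"no table data for algorithm {algo!r}")
    if not 1 <= units <= MAX_UNITS:
        raise ValueError(f"units must be in the range 1-{MAX_UNITS}")
    if units < MIN_UNITS[algo]:
        return None  # below the smallest row the table covers
    return units * OPS_PER_UNIT[algo]


for units in (1, 10, 750):
    print(units, units_to_ops(units, "rsa2k"), units_to_ops(units, "rsa4k"))
```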
SLA Manager Application
The sla_mgr tool is used to create, update, delete, and list SLAs, and to query SLA capabilities.
The SLA Manager executable is available in $ICP_ROOT/build/sla_mgr after the package is built and installed using the ./configure; make install commands.
SLA Commands
Operation | Command |
---|---|
Rate Limiting V1 (Legacy) | |
Create SLA | |
Update SLA | |
Rate Limiting V2 | |
Create SLA | |
Update SLA | |
For Legacy and Rate Limiting V2 | |
Delete SLA | |
Delete all SLAs | |
Query SLA capabilities | |
Query list of SLAs | |
Options:
pf_addr - Physical address in domain:bus:device.function (xxxx:xx:xx.x) format.
vf_addr - Virtual address in domain:bus:device.function (xxxx:xx:xx.x) format.
Service - Asym (=0), Sym (=1), or DC (=2).
rate_in_sla_units - [0-MAX]. MAX is found by querying the capabilities.
cir/pir - Committed/peak information rate [0-MAX]. MAX is found by querying the capabilities.
sla_id - Value returned by the create command.
In Legacy mode, SLAs are created and updated using rate_in_sla_units. With Rate Limiting V2, cir/pir are used instead. One unit is equal to:
1 operation per second for the asymmetric service in Legacy mode, or 0.1 percent of available utilization in Rate Limiting V2.
1 megabit per second (Mbps) for the symmetric and compression services.
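Because the meaning of a rate value depends on both the service and the Rate Limiting version, a small helper can make that explicit. The sketch below uses the service codes listed under Options (Asym=0, Sym=1, DC=2). The function name describe_rate is hypothetical and is not part of sla_mgr.

```python
# Service codes as listed under Options.
SERVICES = {0: "asym", 1: "sym", 2: "dc"}


def describe_rate(service: int, value: int, legacy: bool = False) -> str:
    """Render an SLA rate value in the unit that applies to the service."""
    if service not in SERVICES:
        raise ValueError(f"unknown service code {service}")
    if value < 0:
        raise ValueError("rate must be non-negative")
    if SERVICES[service] == "asym":
        if legacy:
            # Legacy: 1 unit = 1 operation per second.
            return f"{value} ops/sec"
        # Rate Limiting V2: 1 unit = 0.1 percent of available utilization.
        return f"{value / 10:.1f}% of available utilization"
    # Symmetric and compression: 1 unit = 1 Mbps.
    return f"{value} Mbps"


print(describe_rate(0, 750))
print(describe_rate(0, 750, legacy=True))
print(describe_rate(2, 2000))
```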
Note
To use the Legacy Rate Limiting sla_mgr application, configure the package with the --enable-legacy-sla-mgr option.
SLA Manager Application Demo
Here is a demonstration of using the sla_mgr tool to create, update, and delete SLAs.
The device utilization script from the previous section is used to visualize the SLAs in action.
The demo consists of the following steps:
Run Encryption workload without SLA in place to observe full utilization.
Create SLA enabling 75% of bandwidth and observe reduced utilization.
Update the SLA to enable 25% of bandwidth and observe utilization drop further.
Remove the SLA and verify full utilization.
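The 75% and 25% figures in the demo have to be turned into rate values relative to the MAX reported by the capabilities query. The sketch below shows that arithmetic only. The MAX of 40000 is a made-up placeholder, not a value for any real device, and rate_for_share is a hypothetical helper name.

```python
def rate_for_share(max_rate: int, percent: int) -> int:
    """Rate value corresponding to a percentage of the reported MAX."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in the range 0-100")
    if max_rate < 0:
        raise ValueError("max_rate must be non-negative")
    # Integer arithmetic keeps the result a whole number of SLA units.
    return max_rate * percent // 100


# Placeholder MAX; substitute the value returned by the capabilities query.
ASSUMED_MAX = 40000

for step, percent in (("create", 75), ("update", 25)):
    print(step, rate_for_share(ASSUMED_MAX, percent))
```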