Rate Limiting
Rate Limiting Overview
Rate Limiting is implemented by monitoring the utilization of the device on a per-VF, per-service basis and comparing that to the SLA allocated to that VF and service. This ensures that resources are allocated according to predefined agreements, preventing any single VF from monopolizing device capacity.
Rate Limiting is set up on the host system, allowing administrators to manage and allocate resources effectively. It can be configured for each Physical Function (PF) on the device, providing granular control over resource distribution across different services and virtual functions.
Resources are shared across guests, and the resource utilization of each guest is measured relative to the capacity of the physical function. The feature is supported for SYM, ASYM, and DC services.
This document provides instructions for enabling Rate Limiting for both Out-of-Tree (OOT) and In-tree stacks.
Out-of-Tree Rate Limiting
Enabling Rate Limiting
To enable the Rate Limiting feature for the Out-of-Tree stack:
Install the driver package on the host with Single-Root Input/Output Virtualization (SR-IOV) enabled.
Set ServicesEnabled to asym or sym or dc (or any combination of up to two of these services).
Perform qat_service shutdown and qat_service start.
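The service selection rule in the steps above (one or two of asym, sym, dc) can be checked before editing the configuration. The following is a minimal sketch; `valid_services` is a hypothetical helper name, not part of the driver package, and it only validates the choice, it does not write any configuration file.

```shell
# Hypothetical helper: check a proposed ServicesEnabled selection.
# Succeeds for one or two distinct services out of asym, sym, dc.
valid_services() {
    [ "$#" -ge 1 ] && [ "$#" -le 2 ] || return 1
    for svc in "$@"; do
        case "$svc" in
            asym|sym|dc) ;;
            *) return 1 ;;
        esac
    done
    # Reject the same service listed twice.
    [ "$#" -eq 1 ] || [ "$1" != "$2" ]
}

if valid_services sym dc; then
    echo "selection ok"
fi

# After updating ServicesEnabled, restart the service as described above:
#   qat_service shutdown
#   qat_service start
```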
Important
For Out-of-Tree (OOT) PKE, the total CIR for all SLAs should equal 1000 to ensure proper rate limiting. For symmetric crypto and data compression services, the total CIR should equal the total capacity as returned by the sla_mgr tool.
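A planned set of CIR values can be checked against this rule before any SLA is created. This is a small sketch under stated assumptions: `cir_total` and `cir_total_matches` are hypothetical helper names, and the expected total (1000 for PKE, or the capacity reported by sla_mgr for symmetric crypto and compression) is supplied by the caller.

```shell
# Hypothetical helper: sum the CIR values passed as arguments.
cir_total() {
    total=0
    for cir in "$@"; do
        total=$((total + cir))
    done
    echo "$total"
}

# Succeeds only when the CIR values add up to the expected total.
# $1 = expected total, remaining arguments = planned CIR per SLA.
cir_total_matches() {
    expected=$1
    shift
    [ "$(cir_total "$@")" -eq "$expected" ]
}

# Example: four PKE SLAs that together cover the full capacity.
if cir_total_matches 1000 250 250 250 250; then
    echo "CIR plan covers the full capacity"
fi
```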
Service Level Agreement (SLA)
Service Level Agreement enforcement allocates a specified amount of capacity for a specified service to a specified VF. The maximum number of SLAs that can be enforced is (number of VFs) x (number of services), where:
Number of VFs varies based on device type
Number of services = 2 (any two of asymmetric, symmetric, or compression)
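As a worked example of the formula, the sketch below computes the SLA limit for an assumed device exposing 16 VFs; the VF count is illustrative only, since it depends on the device type. `max_slas` is a hypothetical helper name.

```shell
# $1 = number of VFs, $2 = number of enabled services.
max_slas() {
    echo $(( $1 * $2 ))
}

max_slas 16 2   # prints 32
```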
SLA Units
SLA units are measured as follows:
Symmetric Crypto - 1 unit is equal to 1 Mbps of throughput.
Asymmetric Crypto - 1 unit is equal to 0.1 percent of available utilization.
Compression - 1 unit is equal to 1 Mbps of throughput.
Note
In Gen4 devices, for Asymmetric Crypto services, SLA units are measured in terms of percentage of slice utilization. This metric is more accurate than operations/second or throughput/second because it directly reflects the hardware resources consumed by each user, independent of algorithm processing speed, providing a fair representation of resource usage.
Gen4 devices use a hardware-assisted Rate Limiting approach, whereas legacy devices use a firmware-only Rate Limiting approach.
For asymmetric service, SLAs shall be allocated at a granularity of 1 unit of device utilization percentage for RSA2K.
Below is a sample mapping table for the 5th Gen Intel® Xeon® Scalable Processor - MCC SKU platform that translates SLA units to equivalent ops/sec.
Users can run tests with the required algorithm to determine the mapping for other SKUs with different performance. Gen4 asymmetric performance numbers can differ based on the SKU.
Unit | RSA2K decrypt with CRT (Ops/sec) | RSA4K decrypt with CRT (Ops/sec) |
---|---|---|
1 | 60 | – |
5 | 300 | – |
10 | 600 | 60 |
50 | 3000 | 300 |
100 | 6000 | 600 |
300 | 18000 | 1800 |
500 | 30000 | 3000 |
750 | 45000 | 4500 |
1000 | 60000 | 6000 |
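The rows of the table above are linear in the unit count: 60 ops/sec per unit for RSA2K and 6 ops/sec per unit for RSA4K. The sketch below encodes that mapping for this one SKU only; `units_to_ops` is a hypothetical helper, and the per-unit constants must be re-measured for other SKUs. The table lists no RSA4K value below 10 units, so results in that range are an extrapolation.

```shell
# Ops/sec per SLA unit, read off the sample table for this SKU only.
RSA2K_OPS_PER_UNIT=60
RSA4K_OPS_PER_UNIT=6

# $1 = algorithm (rsa2k or rsa4k), $2 = SLA units (1-1000).
units_to_ops() {
    case "$1" in
        rsa2k) echo $(( $2 * RSA2K_OPS_PER_UNIT )) ;;
        rsa4k) echo $(( $2 * RSA4K_OPS_PER_UNIT )) ;;
        *) return 1 ;;
    esac
}

units_to_ops rsa2k 100   # prints 6000
units_to_ops rsa4k 750   # prints 4500
```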
SLA Manager Application
The sla_mgr tool is used to create, update, delete, list and get SLA capabilities.
The SLA Manager executable is available in $ICP_ROOT/build/sla_mgr after the package is built and installed using the ./configure; make install commands.
SLA Commands
Operation | Command |
---|---|
Rate Limiting V1 (Legacy) | |
Create SLA | |
Update SLA | |
Rate Limiting V2 | |
Create SLA | |
Update SLA | |
For Legacy and Rate Limiting V2 | |
Delete SLA | |
Delete all SLAs | |
Query SLA capabilities | |
Query list of SLAs | |
Options:
pf_addr - Physical address in domain:bus:device.function (xxxx:xx:xx.x) format.
vf_addr - Virtual address in domain:bus:device.function (xxxx:xx:xx.x) format.
Service - Asym(=0) or Sym(=1) or DC(=2).
rate_in_sla_units - [0-MAX]. MAX is found by querying the capabilities.
cir/pir - committed/peak information rate [0-MAX]. MAX is found by querying the capabilities.
sla_id - Value returned by the create command.
In Legacy mode, SLAs are created and updated using rate_in_sla_units. With Rate Limiting V2, cir/pir are used instead. One unit is equal to:
1 operation per second (Legacy) or 0.1 percent of available utilization (Rate Limiting V2) - for asymmetric service.
1 megabit per second - for symmetric service/compression service.
Note
To use the Legacy Rate Limiting sla_mgr application, configure the package with the --enable-legacy-sla-mgr option.
Best Practices for SLA Management
Ensure all VFs are included in SLAs to prevent unregulated resource usage.
Regularly monitor performance and adjust CIR and PIR values as needed to maintain optimal throughput.
In-tree Rate Limiting
Note
For additional details on Rate Limiting with the In-tree solution, refer to sysfs-driver-qat_rl documentation.
Rate Limiting for the in-tree stack is configured per individual Physical Function (PF) using sysfs calls. Each PF has a directory structure that includes several files used to manage SLAs:
Directory Structure
The rate limiting attributes for each PF are located at:
/sys/bus/pci/devices/<BDF>/qat_rl/
The files included in this directory are:
cap_rem: Reports the remaining capability for a particular service/SLA. This is the largest value a new SLA can be set to, or the amount by which an existing SLA can be increased.
cir: Committed Information Rate (CIR). The guaranteed rate of throughput that a VF can achieve under its SLA. The value is expressed in permille scale, i.e., 1000 refers to the maximum device throughput for a selected service.
id: Used to retrieve a particular SLA and operate on it. Valid for update, rm, and get operations.
pir: Peak Information Rate (PIR). The maximum rate that can be achieved by that particular SLA. An SLA can reach a value between CIR and PIR when the device is not fully utilized by requests from other users.
rp: Configures the ring pairs associated with an SLA. The value is a 64-bit bit mask and is written/displayed in hex.
sla_op: Used to perform operations on an SLA, such as add, update, rm, rm_all, and get.
srv: Represents the service (sym, asym, dc) associated with an SLA.
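The attributes above work together: an SLA is selected through id, an operation is triggered through sla_op, and the result is read from the other files. The sketch below reads back one SLA using that flow. It is an illustration based on the file descriptions in this section rather than a verified procedure; `rl_get_sla` is a hypothetical helper, and the qat_rl directory is passed as an argument so the function can be tried against a scratch directory before it is pointed at a real device.

```shell
# Sketch: read back the configuration of one SLA.
# $1 = qat_rl directory of the PF, $2 = SLA id.
rl_get_sla() {
    rl_dir=$1
    sla_id=$2
    echo "$sla_id" > "$rl_dir/id"    # select the SLA to operate on
    echo "get" > "$rl_dir/sla_op"    # request its configuration
    printf 'srv=%s cir=%s pir=%s rp=%s\n' \
        "$(cat "$rl_dir/srv")" "$(cat "$rl_dir/cir")" \
        "$(cat "$rl_dir/pir")" "$(cat "$rl_dir/rp")"
}

# Usage on a host with a QAT PF (replace <BDF> with the PF address):
#   rl_get_sla "/sys/bus/pci/devices/<BDF>/qat_rl" 0
```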
Enabling Rate Limiting
To enable the Rate Limiting feature for the In-tree stack:
Identify the device by its Bus-Device-Function (BDF) address, e.g., 0000:6b:00.0. This address replaces <BDF> in the sysfs paths.
Configure the SLA using the sysfs attributes available for qat_4xxx devices.
Make sure the total CIR across all VFs equals 1000 so that rate limiting works properly.
SLA Units
For the In-tree stack, SLA units are measured as follows:
All services (sym, asym, dc) - 1 unit is equal to 0.1 percent of available utilization.
Example Setting of SLAs
This example demonstrates setting up SLAs for all VFs for a specified PF, focusing on symmetric and asymmetric crypto services. The RP value is shifted for each VF to allocate resources appropriately.
Remove Existing SLAs: Clear any existing SLAs for the device.
echo "rm_all" > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/sla_op
Set SLAs for Symmetric Service: For each VF, set the SLA parameters and add the SLA:
for vf in {0..15}; do
    cir_pir_value=62
    # Ring pair mask for this VF: pattern 0xa shifted into the VF's 4-bit slot
    rp_value=$(printf "0x%x" $((0xa << (vf * 4))))
    echo $cir_pir_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/cir
    echo $cir_pir_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/pir
    echo "sym" > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/srv
    echo $rp_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/rp
    echo "add" > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/sla_op
    echo "SLA added for BDF: 0000:6b:00.0, VF: $vf, Service: sym, RP: $rp_value, CIR/PIR: $cir_pir_value"
done
Set SLAs for Asymmetric Service: Similarly, set the SLA parameters for the asymmetric service:
for vf in {0..15}; do
    cir_pir_value=62
    # Ring pair mask for this VF: pattern 0x5 shifted into the VF's 4-bit slot
    rp_value=$(printf "0x%x" $((0x5 << (vf * 4))))
    echo $cir_pir_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/cir
    echo $cir_pir_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/pir
    echo "asym" > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/srv
    echo $rp_value > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/rp
    echo "add" > /sys/bus/pci/devices/0000:6b:00.0/qat_rl/sla_op
    echo "SLA added for BDF: 0000:6b:00.0, VF: $vf, Service: asym, RP: $rp_value, CIR/PIR: $cir_pir_value"
done
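The RP values used in the two loops follow one pattern: a 4-bit selection is shifted left by four bits per VF, which implies four ring pairs per VF in this layout. In binary, 0xa is 1010 (bits 1 and 3 of the VF's slot) and 0x5 is 0101 (bits 0 and 2), so the symmetric and asymmetric SLAs of a VF do not overlap. The sketch below isolates that calculation; `rp_mask` is a hypothetical helper, and 64-bit shell arithmetic is assumed.

```shell
# $1 = VF index (0-15), $2 = 4-bit ring pair pattern for that VF (e.g. 0xa).
# Prints the 64-bit ring pair mask in hex, as expected by the rp attribute.
rp_mask() {
    printf '0x%x\n' $(( $2 << ($1 * 4) ))
}

rp_mask 0 0xa   # prints 0xa
rp_mask 1 0xa   # prints 0xa0
rp_mask 2 0x5   # prints 0x500
```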
This example illustrates setting up rate limiting for one QAT endpoint, evenly distributing the PF capacity among 16 VFs for both symmetric and asymmetric services.
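The value 62 used for CIR and PIR comes from dividing the permille scale (1000) evenly among 16 VFs with integer arithmetic. The sketch below shows that calculation and the capacity left over by rounding down; `even_cir` and `unallocated` are hypothetical helper names. With 16 VFs, 8 units remain unassigned, so reaching a total CIR of exactly 1000 would require giving some SLAs a slightly larger value.

```shell
# $1 = total capacity in permille (1000), $2 = number of VFs.
# Prints the per-VF share, rounded down.
even_cir() {
    echo $(( $1 / $2 ))
}

# Prints the capacity left unassigned after the even split.
unallocated() {
    echo $(( $1 - ($1 / $2) * $2 ))
}

even_cir 1000 16      # prints 62
unallocated 1000 16   # prints 8
```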