Running in a Virtual Machine (VM)

This document provides instructions for attaching QAT Virtual Functions (VFs) to a guest VM, allowing applications that use qatlib in the guest to utilize the QAT hardware on the host.

Important

These instructions assume the QAT in-tree driver is currently installed on the host system.
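
A quick way to confirm this on the host, assuming a QAT Gen4 (4xxx) device, is to check that the in-tree module is loaded:

lsmod | grep qat_4xxx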

Virtual Machine Configuration

This section describes the configuration options for the virtual machine.

CPU Pinning (Optional)

This step is optional and is only needed when you wish to pin the vCPUs used by a VM to specific host cores. In the example below, the VM's 8 vCPUs are pinned to host cores 17-24.

<vcpu placement='static' cpuset='17-24'>8</vcpu>
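
Once the VM is defined, the resulting pinning can be confirmed from the host; <Guest_VM_NAME> is a placeholder for your VM name:

sudo virsh vcpupin <Guest_VM_NAME>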

Machine Type

The machine type needs to be set to q35. In the example below, the machine type for RHEL 9.2 is pc-q35-rhel9.2.0.

<os>
    <type arch='x86_64' machine='pc-q35-rhel9.2.0'>hvm</type>
    <boot dev='hd'/>
</os>
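
The machine types supported on a given host can be listed with QEMU; the binary name below may differ by distribution (for example /usr/libexec/qemu-kvm on RHEL):

qemu-system-x86_64 -machine help | grep q35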

I/O APIC Driver

The I/O APIC driver needs to be set to qemu.

<features>
    <acpi/>
    <apic/>
    <ioapic driver='qemu'/>
</features>

IOMMU Settings

IOMMU should be configured as defined below.

<iommu model='intel'>
    <driver intremap='on' caching_mode='on'/>
</iommu>

QAT Virtual Function

When passing VFs to a guest, the BDFs (domain:bus:device.function addresses) assigned on the guest should allow qatlib to recognize whether or not VFs come from the same PF.

The libvirt XML file should therefore assign VFs that share a (domain + bus) on the host to a common (domain + bus) on the guest, distinct from the (domain + bus) used for VFs from other PFs.

Important

Sufficient VFs should be passed from the host to the guest to satisfy the types of services and the number of processes needed on the guest. See the Configuration and Tuning section of the QATlib User's Guide for more information on host configuration.

Important

If using the default kernel configuration, at least 2 VFs are needed per process so that the process has both CY and DC instances. Set either POLICY=0 or POLICY=2 (or 4, 6, …) in /etc/sysconfig/qat on the guest and restart qatmgr.
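
For example, a minimal /etc/sysconfig/qat on the guest might look like the following; the service name used to restart qatmgr can vary by distribution, and qat.service is shown here as an assumption:

# /etc/sysconfig/qat
POLICY=0

sudo systemctl restart qat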

To list the BDF info from the host, the following command can be used.

lspci -d 8086:4941  && lspci -d 8086:4943  && lspci -d 8086:4945 && lspci -d 8086:4947

Example output:

6b:00.1 Co-processor: Intel Corporation Device 4943 (rev 40)
6b:00.2 Co-processor: Intel Corporation Device 4943 (rev 40)
6b:00.3 Co-processor: Intel Corporation Device 4943 (rev 40)
. . .
70:00.1 Co-processor: Intel Corporation Device 4943 (rev 40)
70:00.2 Co-processor: Intel Corporation Device 4943 (rev 40)
70:00.3 Co-processor: Intel Corporation Device 4943 (rev 40)
. . .
f3:00.1 Co-processor: Intel Corporation Device 4943 (rev 40)
f3:00.2 Co-processor: Intel Corporation Device 4943 (rev 40)
f3:00.3 Co-processor: Intel Corporation Device 4943 (rev 40)
. . .
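
To confirm which VFs belong to a given PF, the virtfn links in sysfs can be listed; the PF address 0000:6b:00.0 below is only an example taken from the output above:

ls -l /sys/bus/pci/devices/0000:6b:00.0/virtfn*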

In the following example, two VFs from PF 0000:6b:00.0 and two VFs from PF 0000:70:00.0 are passed to the VM.

<hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
        <address domain='0x0000' bus='0x6b' slot='0x00' function='0x1'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
        <address domain='0x0000' bus='0x6b' slot='0x00' function='0x2'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x1'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
        <address domain='0x0000' bus='0x70' slot='0x00' function='0x1'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
        <address domain='0x0000' bus='0x70' slot='0x00' function='0x2'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x1'/>
</hostdev>
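
After adding these entries (for example with sudo virsh edit <Guest_VM_NAME>), the hostdev configuration can be reviewed from the host:

sudo virsh dumpxml <Guest_VM_NAME> | grep -A 5 '<hostdev'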

Within the VM, the VFs would appear like the following:

lspci | grep -E '4941|4943|4945|4947'
07:00.0 Co-processor: Intel Corporation Device 4941 (rev 40)
07:00.1 Co-processor: Intel Corporation Device 4941 (rev 40)
08:00.0 Co-processor: Intel Corporation Device 4941 (rev 40)
08:00.1 Co-processor: Intel Corporation Device 4941 (rev 40)

Guest OS Linux Boot Parameters

The following kernel boot parameters must be included in the Guest OS.

  • intel_iommu=on

  • aw-bits=48

Confirm these parameters are set with the following commands:

cat /proc/cmdline | grep -E 'intel_iommu=on|aw-bits=48'

If these parameters are not present, the following steps can be used to add them.

Instructions for Debian Based Distros

If using a Debian-based distribution, run the following commands:

  1. Open the grub file.

    sudo vi /etc/default/grub
    
  2. Update the GRUB_CMDLINE_LINUX line by adding intel_iommu=on and aw-bits=48.

  3. Regenerate the GRUB configuration to apply the change.

    sudo update-grub
    

    On distributions without the update-grub helper, sudo grub-mkconfig -o /boot/grub/grub.cfg can be used instead.

  4. Reboot the system.

    sudo shutdown -r now
    

Instructions for RHEL/CentOS/Fedora

If using RHEL, CentOS or Fedora, run the following commands:

  1. Update the kernel boot parameters.

    sudo grubby --update-kernel=ALL --args="intel_iommu=on"
    sudo grubby --update-kernel=ALL --args="aw-bits=48"
    
  2. Reboot the system.

    sudo shutdown -r now
    

Guest System Kernel Drivers

On the guest system, use the following command to verify that the vfio_pci kernel module is loaded:

lsmod | grep vfio_pci

If the kernel module is not loaded, load it using the following command:

sudo modprobe vfio_pci

In some situations, you may also need to modprobe the qat_4xxx module via sudo modprobe qat_4xxx.
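
To avoid loading the module manually after every reboot, one option is a modules-load.d entry; the file name below is an assumption:

echo vfio-pci | sudo tee /etc/modules-load.d/vfio-pci.conf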

Lastly, verify the vfio_pci kernel driver is bound to each Intel® QAT Virtual Function (VF) by running the following command:

(lspci -vvv -d 8086:4941; lspci -vvv -d 8086:4943; lspci -vvv -d 8086:4945; lspci -vvv -d 8086:4947) | grep "Kernel driver"

You should see a line similar to the following for each VF:

Kernel driver in use: vfio-pci
Kernel driver in use: vfio-pci
Kernel driver in use: vfio-pci
. . .

Common Issues

Error observed when VM is started

If an error similar to the following is reported when the VM is started:

sudo virsh start <Guest_VM_NAME>

error: Failed to start domain '<Guest_VM_NAME>'
error: internal error: qemu unexpectedly closed the monitor: 2022-02-10T17:00:01.178436Z qemu-system-x86_64:
-device vfio-pci,host=0000:75:00.1,id=hostdev4,bus=pci.15,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-10T17:00:01.210062Z qemu-system-x86_64: -device vfio-pci,host=0000:75:00.1,id=hostdev4,bus=pci.15,addr=0x0:
vfio 0000:75:00.1: failed to setup container for group 488: memory listener initialization failed: Region pc.ram:
vfio_dma_map(0x562127eb3bc0, 0x100000000, 0x80000000, 0x7ff60be00000) = -12 (Cannot allocate memory)

Use dmesg to display additional error details:

dmesg | tail -50

If the output looks like the following:

[1210160.116507] vfio_pin_pages_remote: RLIMIT_MEMLOCK (20819607552) exceeded

Likely cause:

The memory the VM needs to lock exceeds the configured hard and soft memory limits (20819607552 bytes, i.e. 20331648 KB, in this example).

Solution: Increase the memory hard_limit and soft_limit in the guest XML to a higher value.

# shutdown the VM
sudo virsh shutdown <Guest_VM_Name>
# edit the guest VM xml file
sudo virsh edit <Guest_VM_Name>
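
The limits are defined by the memtune element in the guest XML; the values below are placeholders and should be sized to cover the VM's memory plus device-passthrough overhead:

<memtune>
    <hard_limit unit='KiB'>33554432</hard_limit>
    <soft_limit unit='KiB'>33554432</soft_limit>
</memtune>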

Error with cpa_sample_code

If errors like the following are observed when running the cpa_sample_code application on the guest VM:

Terminal output

[error] [error] [error] LacSymCb_ProcessCallbackInternal() - : Response status value not as expected
[error] [error] LacSymQat_SymRespHandler() - : The PCIe End Point Push/Pull or TI/RI Parity error detected.
LacSymQat_SymRespHandler() - : The PCIe End Point Push/Pull or TI/RI Parity error detected.
LacSymCb_ProcessCallbackInternal() - : Response status value not as expected

dmesg output

[56400.152502] DMAR: DRHD: handling fault status reg 3
[56400.152513] DMAR: [DMA Read NO_PASID] Request device [70:00.1] fault addr 0x2995c000 [fault reason 0x79] SM: Read/Write permission error in second-level paging entry
[56400.153297] DMAR: DRHD: handling fault status reg 3
[56400.153308] DMAR: [DMA Read NO_PASID] Request device [70:00.1] fault addr 0x28987000 [fault reason 0x79] SM: Read/Write permission error in second-level paging entry
[56400.212172] DMAR: DRHD: handling fault status reg 3
[56400.212176] DMAR: [DMA Write NO_PASID] Request device [70:00.2] fault addr 0x28465000 [fault reason 0x79] SM: Read/Write permission error in second-level paging entry
[56400.212228] DMAR: DRHD: handling fault status reg 2

Likely cause: The limit on the maximum number of concurrent DMA mappings that a user is allowed to create has been reached.

Solution: Increase the dma_entry_limit module parameter for vfio_iommu_type1 in /etc/modprobe.d/vfio-iommu-type1.conf.

sudo vi /etc/modprobe.d/vfio-iommu-type1.conf

Add or update the dma_entry_limit value, then reboot (or reload the vfio_iommu_type1 module) for the change to take effect.
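
A sketch of the resulting file, assuming a limit of 131072 mappings (tune the value to the workload):

# /etc/modprobe.d/vfio-iommu-type1.conf
options vfio_iommu_type1 dma_entry_limit=131072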