Intel® QAT Data Compression API

This section describes the sample code for the Intel® QuickAssist Technology Data Compression API, beginning with an API overview, and followed by descriptions of various scenarios to illustrate the usage of the API.

Note

This document does not cover data integrity concepts. Refer to Related Documents and References (Programmer’s Guide, Compress and Verify (CnV) related APIs) for your product for important information on data integrity concepts, including the Compress-and-Verify feature.

Overview

The Intel® QuickAssist Technology Data Compression API can be categorized into three broad areas as follows:

  • Common: This includes functionality for the initialization and shutdown of the service.

  • Instance Management: A given implementation of the API can present multiple instances of the compression service, each representing a logical or virtual device. Request order is guaranteed within a given instance of the service.

  • Transformation:

    • Compression functionality

    • Decompression functionality

These areas of functionality are defined in cpa_dc.h and cpa_dc_dp.h.

The Intel® QAT Data Compression API uses the base API, which defines base data types used across all services of the Intel® QAT API.
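
As a minimal illustration of the Common and Instance Management areas, the sketch below discovers a compression instance and retrieves its handle. It follows the conventions used by the samples in this section (a single instance is sufficient) and abbreviates error handling.

Cpa16U numInstances = 0;
CpaInstanceHandle dcInstHandle = NULL;

/* Query how many data compression instances are available */
status = cpaDcGetNumInstances(&numInstances);

if ((CPA_STATUS_SUCCESS == status) && (numInstances > 0)) {
    /* For simplicity, retrieve only the first instance */
    status = cpaDcGetInstances(1, &dcInstHandle);
}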

Sessions

Similar to the symmetric cryptography API, the data compression API has the concept of a session. In the case of the compression API, a session is an object that describes the compression parameters to be applied across several requests. These requests might submit buffers within a single file, or buffers associated with a particular data stream or flow. A session object is described by the following:

  • The compression level: Lower levels provide faster compression at the cost of compression ratio, whereas higher levels provide a better compression ratio at the cost of performance.

  • The compression algorithm: Compression algorithm to use (e.g. deflate) and what type of Huffman trees to use (static or dynamic).

  • The session direction: If all requests on this session are compression requests, then the direction can be set to compress (and similarly, for decompress). A combined direction is also available if both compression and decompression requests are called using the same session.

  • The session state: A session can be described as stateful or stateless. Stateful sessions maintain history and state between calls to the API, and stateless sessions do not.

    • Stateless compression does not require history data from a previous compression/decompression request to be restored before submitting the request. Stateless sessions are used when the output data is known to be constrained in size. An overflow condition (when the output data is about to exceed the output buffer) is treated as an error condition in the decompression direction. In the compression direction, the application can continue submitting data from the point in the input stream where the overflow was registered. The Data Plane API treats overflow as an error rather than an exception; the client application is required to resubmit the job in its entirety with a larger output buffer. Requests are treated independently; state and history are not saved and restored between calls.

When using a stateless session, it is possible to feed a seed checksum to the cpaDcCompressData() or the cpaDcDecompressData() API when the CPA_DC_FLUSH_FULL flush flag is used. The user application is responsible for maintaining the checksum across requests. This feature is also known as Stateful Lite.

Stateful sessions are required when the data to be decompressed is larger than the buffers being used. This is a standard mode of operation for applications such as GZIP, where the size of the uncompressed data is not known before execution, and therefore the destination buffer may not be large enough to hold the resultant output. Requests to stateful sessions are not treated independently, and state and history can be saved and restored between calls. The amount of history and state carried between calls depends on the compression level. For stateful decompression, only one outstanding request may be in-flight at any one time for that session.

Stateful Data Compression

This example demonstrates the usage of the synchronous API, specifically using it to perform a compression operation. It compresses a file via a stateful session using the deflate compression algorithm with static Huffman trees, producing GZIP style headers and footers.

These samples are located in /dc/stateful_sample.

Note

Stateful data compression is not available in Intel® QAT v1.8 and later releases. However, stateful decompression is available in Intel® QAT v1.8 and later releases.

Session Establishment

This is the main entry point for the sample compression code. It demonstrates the sequence of calls to be made to the API to create a session, perform one or more compress operations, and then tear down the session. At this point, the instance has been discovered and started, and the capabilities of the instance have been queried and found to be suitable.

A session is established by describing the session, determining how much session memory is required, and then invoking the session initialization function cpaDcInitSession. See the example below.

sd.compLevel = CPA_DC_L4;
sd.compType = CPA_DC_DEFLATE;
sd.huffType = CPA_DC_HT_STATIC;
sd.sessDirection = CPA_DC_DIR_COMBINED;
sd.sessState = CPA_DC_STATEFUL;
#if (CPA_DC_API_VERSION_NUM_MAJOR == 1 && CPA_DC_API_VERSION_NUM_MINOR < 6)
sd.deflateWindowSize = 7;
#endif
sd.checksum = CPA_DC_CRC32;

/* Determine size of session context to allocate */
PRINT_DBG("cpaDcGetSessionSize\n");
status = cpaDcGetSessionSize(dcInstHandle, &sd, &sess_size, &ctx_size);

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate session memory */
    status = PHYS_CONTIG_ALLOC(&sessionHdl, sess_size);
}

if ((CPA_STATUS_SUCCESS == status) && (ctx_size != 0)) {
    /* Allocate context bufferlist */
    status = cpaDcBufferListGetMetaSize(dcInstHandle, 1, &buffMetaSize);

    if (CPA_STATUS_SUCCESS == status) {
        status = PHYS_CONTIG_ALLOC(&pBufferMeta, buffMetaSize);
    }

    if (CPA_STATUS_SUCCESS == status) {
        status = OS_MALLOC(&pBufferCtx, bufferListMemSize);
    }

    if (CPA_STATUS_SUCCESS == status) {
        status = PHYS_CONTIG_ALLOC(&pCtxBuf, ctx_size);
    }

    if (CPA_STATUS_SUCCESS == status) {
        pFlatBuffer = (CpaFlatBuffer *)(pBufferCtx + 1);

        pBufferCtx->pBuffers = pFlatBuffer;
        pBufferCtx->numBuffers = 1;
        pBufferCtx->pPrivateMetaData = pBufferMeta;
        pFlatBuffer->dataLenInBytes = ctx_size;
        pFlatBuffer->pData = pCtxBuf;
    }
}

/* Initialize the Stateful session */
if (CPA_STATUS_SUCCESS == status) {
    PRINT_DBG("cpaDcInitSession\n");
    status = cpaDcInitSession(
            dcInstHandle,
            sessionHdl, /* Session memory */
            &sd, /* Session setup data */
            pBufferCtx, /* Context buffer */
            NULL); /* Callback function NULL for sync mode */
}

Note

Source and destination buffers must be established.

numBuffers = 1; /* Only using 1 buffer in this case */

/* Allocate memory for bufferlist and array of flat buffers in a contiguous
 * area and carve it up to reduce number of memory allocations required. */
bufferListMemSize = sizeof(CpaBufferList) + (numBuffers * sizeof(CpaFlatBuffer));

status = cpaDcBufferListGetMetaSize(dcInstHandle, numBuffers, &bufferMetaSize);

/* Allocate source buffer */
if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pBufferMetaSrc, bufferMetaSize);
}
if (CPA_STATUS_SUCCESS == status) {
    status = OS_MALLOC(&pBufferListSrc, bufferListMemSize);
}
if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pSrcBuffer, SAMPLE_BUFF_SIZE);
}

/* Allocate destination buffer the same size as source buffer */
if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pBufferMetaDst, bufferMetaSize);
}
if (CPA_STATUS_SUCCESS == status) {
    status = OS_MALLOC(&pBufferListDst, bufferListMemSize);
}
if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pDstBuffer, SAMPLE_BUFF_SIZE);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Build source bufferList */
    pFlatBuffer = (CpaFlatBuffer *)(pBufferListSrc + 1);

    pBufferListSrc->pBuffers = pFlatBuffer;
    pBufferListSrc->numBuffers = 1;
    pBufferListSrc->pPrivateMetaData = pBufferMetaSrc;

    pFlatBuffer->dataLenInBytes = SAMPLE_BUFF_SIZE;
    pFlatBuffer->pData = pSrcBuffer;

    /* Build destination bufferList */
    pFlatBuffer = (CpaFlatBuffer *)(pBufferListDst + 1);

    pBufferListDst->pBuffers = pFlatBuffer;
    pBufferListDst->numBuffers = 1;
    pBufferListDst->pPrivateMetaData = pBufferMetaDst;

    pFlatBuffer->dataLenInBytes = SAMPLE_BUFF_SIZE;
    pFlatBuffer->pData = pDstBuffer;
}

At this point, the application has opened an instance, established a session, and allocated buffers. It is time to start some compress operations. To produce GZIP style compressed files, the first thing that needs to be performed is header generation. Create a header using the following code:

/* Write RFC1952 gzip header to destination buffer */
status = cpaDcGenerateHeader(sessionHdl, pFlatBuffer, &hdr_sz);

if (CPA_STATUS_SUCCESS == status) {
    /* Write out header */
    fwrite(pFlatBuffer->pData, 1, hdr_sz, dstFile);
}

cpaDcGenerateHeader produces a GZIP style header (compliant with GZIP file format specification v4.3, RFC 1952, refer to Related Documents and References) when the session setup data is set such that compType is CPA_DC_DEFLATE and checksum is CPA_DC_CRC32.

Alternatively, a zlib style header (compliant with ZLIB Compressed Data Format Specification, v3.3, RFC 1950, refer to Related Documents and References) can be produced if the session setup data is set such that compType is CPA_DC_DEFLATE and checksum is CPA_DC_ADLER32.

The compress operation below demonstrates looping through a file, reading the data, invoking the data compress operation, and writing the results to the output file.

pBufferListSrc->pBuffers->dataLenInBytes = 0;

while ((!feof(srcFile)) && (CPA_STATUS_SUCCESS == status)) {
    /* Read from file into src buffer */
    pBufferListSrc->pBuffers->pData = pSrcBuffer;
    pBufferListSrc->pBuffers->dataLenInBytes +=
        fread(pSrcBuffer + pBufferListSrc->pBuffers->dataLenInBytes,
              1,
              SAMPLE_BUFF_SIZE - pBufferListSrc->pBuffers->dataLenInBytes,
              srcFile);

    if (pBufferListSrc->pBuffers->dataLenInBytes < SAMPLE_BUFF_SIZE) {
        flush = CPA_DC_FLUSH_FINAL;
    } else {
        flush = CPA_DC_FLUSH_SYNC;
    }

    do {
        PRINT_DBG("cpaDcCompressData\n");
        status = cpaDcCompressData(
                dcInstHandle,
                sessionHdl,
                pBufferListSrc, /* Source buffer list */
                pBufferListDst, /* Destination buffer list */
                &dcResults, /* Results structure */
                flush, /* Flush flag */
                NULL);

        if (CPA_STATUS_SUCCESS != status) {
            PRINT_ERR("cpaDcCompressData failed. (status = %d)\n", status);
            break;
        }

        /* We now check the results */
        if ((dcResults.status != CPA_DC_OK) && (dcResults.status != CPA_DC_OVERFLOW)) {
            PRINT_ERR("Results status not as expected (status = %d)\n", dcResults.status);
            status = CPA_STATUS_FAIL;
            break;
        }

        fwrite(pDstBuffer, 1, dcResults.produced, dstFile);
        if (dcResults.consumed <= pBufferListSrc->pBuffers->dataLenInBytes) {
            pBufferListSrc->pBuffers->dataLenInBytes -= dcResults.consumed;
            pBufferListSrc->pBuffers->pData += dcResults.consumed;
        } else {
            pBufferListSrc->pBuffers->dataLenInBytes = 0;
        }

        if (dcResults.consumed == 0 && pBufferListSrc->pBuffers->dataLenInBytes > 0) {
            memcpy(pSrcBuffer, pBufferListSrc->pBuffers->pData, pBufferListSrc->pBuffers->dataLenInBytes);
            break;
        }
    } while (pBufferListSrc->pBuffers->dataLenInBytes != 0 || dcResults.status == CPA_DC_OVERFLOW);
}

Finally, a GZIP footer is generated. Similar to the call to cpaDcGenerateHeader, a GZIP footer (compliant with GZIP file format specification v4.3, RFC 1952, refer to Related Documents and References) is produced because the session setup data is set such that compType is CPA_DC_DEFLATE and checksum is CPA_DC_CRC32.

The call to cpaDcGenerateFooter increments the produced field of the CpaDcRqResults structure by the size of the footer added. In this example, the data produced so far has already been written out to the file. As such, the produced field of the CpaDcRqResults structure is cleared before calling the cpaDcGenerateFooter function.

In the event the destination buffer would be too small to accept the footer, the cpaDcGenerateFooter() API will return an invalid parameter error. The cpaDcGenerateFooter() API cannot return an overflow exception. It is the application’s responsibility to ensure that there is enough allocated buffer memory to append the algorithm specific footer.

dcResults.produced = 0;

/* Write RFC1952 gzip footer to destination buffer */
status = cpaDcGenerateFooter(sessionHdl, pFlatBuffer, &dcResults);

if (CPA_STATUS_SUCCESS == status) {
    /* Write out footer */
    fwrite(pFlatBuffer->pData, 1, dcResults.produced, dstFile);
}

Because this session was created with CPA_DC_DIR_COMBINED, it can also be used to decompress data.

The stateful decompression operation demonstrates looping through a file, reading the compressed data, invoking the data decompress operation, and writing the results to the output file. In this case, the overflow condition has to be considered. See below example.

pBufferListSrc->pBuffers->dataLenInBytes = 0;

while ((!feof(srcFile)) && (CPA_STATUS_SUCCESS == status))
{
    /* Read from file into src buffer */
    pBufferListSrc->pBuffers->pData = pSrcBuffer;
    pBufferListSrc->pBuffers->dataLenInBytes +=
        fread(pSrcBuffer + pBufferListSrc->pBuffers->dataLenInBytes,
              1,
              SAMPLE_BUFF_SIZE - pBufferListSrc->pBuffers->dataLenInBytes,
              srcFile);

    if (pBufferListSrc->pBuffers->dataLenInBytes < SAMPLE_BUFF_SIZE) {
        /* FLUSH FINAL flag must be set for last request */
        opData.flushFlag = CPA_DC_FLUSH_FINAL;
    } else {
        /* FLUSH SYNC flag must be set for intermediate requests */
        opData.flushFlag = CPA_DC_FLUSH_SYNC;
    }

    do {
        status = cpaDcDecompressData2(
            dcInstHandle,
            sessionHdl,
            pBufferListSrc, /* Source buffer list */
            pBufferListDst, /* Destination buffer list */
            &opData,
            &dcResults, /* Results structure */
            NULL);

        if (CPA_STATUS_SUCCESS != status) {
            PRINT_ERR("cpaDcDecompressData2 failed. (status = %d)\n", status);
            break;
        }

        /* We now check the results - in decompress direction the
         * output buffer may overflow */
        if ((dcResults.status != CPA_DC_OK) && (dcResults.status != CPA_DC_OVERFLOW)) {
            PRINT_ERR("Results status not as expected (status = %d)\n", dcResults.status);
            status = CPA_STATUS_FAIL;
            break;
        }

        /* The gzip file generated by Deflate algorithm has an 8-byte
         * footer, containing a CRC-32 checksum and the length of the
         * original uncompressed data. The 'endOfLastBlock' flag tells
         * if we have processed the last data block. Break the loop
         * here, otherwise it will keep on reading gzip file */
        if (CPA_TRUE == dcResults.endOfLastBlock) {
            break;
        }

        if (dcResults.consumed <= pBufferListSrc->pBuffers->dataLenInBytes) {
            pBufferListSrc->pBuffers->dataLenInBytes -= dcResults.consumed;
            pBufferListSrc->pBuffers->pData += dcResults.consumed;
        } else {
            pBufferListSrc->pBuffers->dataLenInBytes = 0;
        }

        if (dcResults.consumed == 0 && pBufferListSrc->pBuffers->dataLenInBytes > 0) {
            memcpy(pSrcBuffer, pBufferListSrc->pBuffers->pData, pBufferListSrc->pBuffers->dataLenInBytes);
            break;
        }
    } while (pBufferListSrc->pBuffers->dataLenInBytes != 0 || dcResults.status == CPA_DC_OVERFLOW);
}

Once all operations on this session have been completed, the session is torn down as shown below.

sessionStatus = cpaDcRemoveSession(dcInstHandle, sessionHdl);

Query statistics at this point, which can be useful for debugging.
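
A minimal sketch of such a statistics query is shown below; it assumes the CpaDcStats counters defined in cpa_dc.h and the PRINT_DBG macro used elsewhere in these samples.

CpaDcStats dcStats = {0};

status = cpaDcGetStats(dcInstHandle, &dcStats);

if (CPA_STATUS_SUCCESS == status) {
    PRINT_DBG("Compression requests completed: %llu\n",
              (unsigned long long)dcStats.numCompCompleted);
    PRINT_DBG("Decompression requests completed: %llu\n",
              (unsigned long long)dcStats.numDecompCompleted);
}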

Finally, clean up by freeing up memory, stopping the instance, etc.
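
The sketch below shows the general shape of this cleanup, assuming the allocation macros used earlier have matching free macros (PHYS_CONTIG_FREE, OS_FREE) in the sample utilities; the exact set of buffers to free depends on what was allocated.

/* Free session memory and the buffers allocated above */
PHYS_CONTIG_FREE(pSrcBuffer);
OS_FREE(pBufferListSrc);
PHYS_CONTIG_FREE(pBufferMetaSrc);
PHYS_CONTIG_FREE(pDstBuffer);
OS_FREE(pBufferListDst);
PHYS_CONTIG_FREE(pBufferMetaDst);
PHYS_CONTIG_FREE(pCtxBuf);
OS_FREE(pBufferCtx);
PHYS_CONTIG_FREE(pBufferMeta);
PHYS_CONTIG_FREE(sessionHdl);

/* Stop the data compression instance */
PRINT_DBG("cpaDcStopInstance\n");
status = cpaDcStopInstance(dcInstHandle);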

Stateless Data Compression

This example demonstrates the usage of the asynchronous API, specifically using this API to perform a compression operation. It compresses a data buffer through a stateless session using the deflate compress algorithm with dynamic Huffman trees.

The example below compresses a block of data into a compressed block.

These samples are located in /dc/stateless_sample.

In the below example, dynamic Huffman trees are used. The instance can be queried to ensure dynamic Huffman trees are supported, and whether an instance-specific buffer is required to perform a dynamic Huffman tree deflate request.

status = cpaDcQueryCapabilities(dcInstHandle, &cap);

if (status != CPA_STATUS_SUCCESS) {
    return status;
}

if (!cap.statelessDeflateCompression || !cap.statelessDeflateDecompression || !cap.checksumAdler32 || !cap.dynamicHuffman) {
    PRINT_DBG("Error: Unsupported functionality\n");
    return CPA_STATUS_FAIL;
}

if (cap.dynamicHuffmanBufferReq) {
    status = cpaDcBufferListGetMetaSize(dcInstHandle, 1, &buffMetaSize);

    if (CPA_STATUS_SUCCESS == status) {
        status = cpaDcGetNumIntermediateBuffers(dcInstHandle, &numInterBuffLists);
    }
    if (CPA_STATUS_SUCCESS == status && 0 != numInterBuffLists) {
        status = PHYS_CONTIG_ALLOC(&bufferInterArray, numInterBuffLists * sizeof(CpaBufferList *));
    }
    for (bufferNum = 0; bufferNum < numInterBuffLists; bufferNum++) {
        if (CPA_STATUS_SUCCESS == status) {
            status = PHYS_CONTIG_ALLOC(&bufferInterArray[bufferNum], sizeof(CpaBufferList));
        }
        if (CPA_STATUS_SUCCESS == status) {
            status = PHYS_CONTIG_ALLOC(&bufferInterArray[bufferNum]->pPrivateMetaData, buffMetaSize);
        }
        if (CPA_STATUS_SUCCESS == status) {
            status = PHYS_CONTIG_ALLOC(&bufferInterArray[bufferNum]->pBuffers, sizeof(CpaFlatBuffer));
        }
        if (CPA_STATUS_SUCCESS == status) {
            /* Implementation requires an intermediate buffer approximately twice the size of the output buffer */
            status = PHYS_CONTIG_ALLOC(&bufferInterArray[bufferNum]->pBuffers->pData, 2 * SAMPLE_MAX_BUFF);
            bufferInterArray[bufferNum]->numBuffers = 1;
            bufferInterArray[bufferNum]->pBuffers->dataLenInBytes = 2 * SAMPLE_MAX_BUFF;
        }
    } /* End numInterBuffLists */
}

if (CPA_STATUS_SUCCESS == status) {
    /* Set the address translation function for the instance */
    status = cpaDcSetAddressTranslation(dcInstHandle, sampleVirtToPhys);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Start DataCompression component */
    PRINT_DBG("cpaDcStartInstance\n");
    status = cpaDcStartInstance(dcInstHandle, numInterBuffLists, bufferInterArray);
}

Creating and initializing a stateless session demonstrates the sequence of calls to be made to the API to create a session. A session is established by describing the session, determining how much session memory is required, and then invoking the session initialization function cpaDcInitSession. See the example below.

sd.compLevel = CPA_DC_L4;
sd.compType = CPA_DC_DEFLATE;
sd.huffType = CPA_DC_HT_FULL_DYNAMIC;

/* If the implementation supports it, the session will be configured
 * to select static Huffman encoding over dynamic Huffman as
 * the static encoding will provide better compressibility */
if (cap.autoSelectBestHuffmanTree) {
    sd.autoSelectBestHuffmanTree = CPA_TRUE;
} else {
    sd.autoSelectBestHuffmanTree = CPA_FALSE;
}

sd.sessDirection = CPA_DC_DIR_COMBINED;
sd.sessState = CPA_DC_STATELESS;
#if (CPA_DC_API_VERSION_NUM_MAJOR == 1 && CPA_DC_API_VERSION_NUM_MINOR < 6)
sd.deflateWindowSize = 7;
#endif
sd.checksum = CPA_DC_ADLER32;

/* Determine size of session context to allocate */
PRINT_DBG("cpaDcGetSessionSize\n");
status = cpaDcGetSessionSize(dcInstHandle, &sd, &sess_size, &ctx_size);

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate session memory */
    status = PHYS_CONTIG_ALLOC(&sessionHdl, sess_size);
}

/* Initialize the Stateless session */
if (CPA_STATUS_SUCCESS == status) {
    PRINT_DBG("cpaDcInitSession\n");
    status = cpaDcInitSession(
            dcInstHandle,
            sessionHdl, /* Session memory */
            &sd, /* Session setup data */
            NULL, /* pContextBuffer not required for stateless operations */
            dcCallback); /* Callback function */
}

Source and destination buffers are allocated in a similar way to the stateful example above.

Perform Operation: The example below demonstrates invoking the data compress operation in the stateless case.
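
A sketch of the call is shown below. Because the session was initialized with a callback (dcCallback), the request completes asynchronously; how completion is signaled (for example, a semaphore posted by the callback and waited on here) is left to the application and only hinted at in the comments. The CpaDcOpData settings shown are assumptions consistent with a single stateless request.

CpaDcOpData opData = {0};

opData.flushFlag = CPA_DC_FLUSH_FINAL; /* Single stateless request */
opData.compressAndVerify = CPA_TRUE;

PRINT_DBG("cpaDcCompressData2\n");
status = cpaDcCompressData2(
        dcInstHandle,
        sessionHdl,
        pBufferListSrc, /* Source buffer list */
        pBufferListDst, /* Destination buffer list */
        &opData,
        &dcResults, /* Results structure */
        (void *)&complete); /* Callback tag ('complete' is an application-defined completion variable) */

if (CPA_STATUS_SUCCESS == status) {
    /* Wait here until dcCallback signals completion, then check
     * dcResults.status, dcResults.consumed and dcResults.produced */
}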

Once all operations on this session have been completed, the session is torn down as shown below.

sessionStatus = cpaDcRemoveSession(dcInstHandle, sessionHdl);

Stateless Data Compression Using Multiple Compress Operations

This example demonstrates the use of the asynchronous API, specifically, using this API to perform a compression operation. It compresses a data buffer using multiple stateless compression API requests and maintains length and checksum information across the multiple requests without the overhead of maintaining full history information as used in a stateful operation.

The samples are located in /dc/stateless_multi_op_checksum_sample.

In this sample, session creation is the same as for regular stateless operation. Refer to the previous sample for details.

Perform Operation: This example demonstrates the invoking of the data compress operation in the stateless case while maintaining checksum information across multiple compress operations. The key points to note are:

  • The initial value of dcResults.checksum is set to 0 for CRC32 or set to 1 for Adler32 when invoking the first compress or decompress operation for a data set.

    if (sd.checksum == CPA_DC_ADLER32) {
        /* Initialize checksum to 1 for Adler32 */
        dcResults.checksum = 1;
    } else {
        /* Initialize checksum to 0 for CRC32 */
        dcResults.checksum = 0;
    }
    
  • The value of dcResults.checksum when invoking a subsequent compress operation for a data set is set to the dcResults.checksum value returned from the previous compress operation on that data set, as sketched below.
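
The sketch below illustrates this carry-over across multiple stateless requests. It assumes hypothetical arrays pSrcChunk[] and pDstChunk[] of per-chunk source and destination buffer lists, and that each request fits in its destination buffer (overflow handling is omitted). Intermediate requests use CPA_DC_FLUSH_FULL; the final request uses CPA_DC_FLUSH_FINAL. Because the same dcResults structure is reused, the checksum returned by one request seeds the next.

for (i = 0; i < numChunks; i++) {
    flush = (i == numChunks - 1) ? CPA_DC_FLUSH_FINAL : CPA_DC_FLUSH_FULL;

    /* dcResults.checksum already holds the value returned by the
     * previous request (or the initial seed for the first request) */
    status = cpaDcCompressData(
            dcInstHandle,
            sessionHdl,
            pSrcChunk[i], /* Source buffer list for this chunk */
            pDstChunk[i], /* Destination buffer list for this chunk */
            &dcResults,
            flush,
            NULL);

    if (CPA_STATUS_SUCCESS != status) {
        break;
    }

    /* In asynchronous mode, wait for the completion callback here
     * before issuing the next request, since dcResults is reused */
}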

Data Compression Data Plane API

This example demonstrates the usage of the data plane data compression API to perform a compression operation. It compresses a data buffer via a stateless session using the deflate compress algorithm with dynamic Huffman trees. This example is simplified to demonstrate the basics of how to use the API and how to build the structures required. This example does not demonstrate the optimal way to use the API to get maximum performance for a particular implementation. Refer to Related Documents and References for implementation specific documentation and performance sample code for a guide on how to use the API for best performance.

These samples are located in /dc/dc_dp_sample.

The data plane data compression API is used in a similar way to the data plane symmetric cryptographic API:

  • Data compression service instances are queried and started in the same way and using the same functions as before (see example in Instance Discovery).

The below example registers a callback function for the data compression instance.

status = cpaDcDpRegCbFunc(dcInstHandle, dcDpCallback);

Next, create and initialize a session as shown below.

if (CPA_STATUS_SUCCESS == status) {
    sd.compLevel = CPA_DC_L4;
    sd.compType = CPA_DC_DEFLATE;
    sd.huffType = CPA_DC_HT_FULL_DYNAMIC;

    /* If the implementation supports it, the session will be configured
     * to select static Huffman encoding over dynamic Huffman as
     * the static encoding will provide better compressibility */
    if (cap.autoSelectBestHuffmanTree) {
        sd.autoSelectBestHuffmanTree = CPA_TRUE;
    } else {
        sd.autoSelectBestHuffmanTree = CPA_FALSE;
    }

    sd.sessDirection = CPA_DC_DIR_COMBINED;
    sd.sessState = CPA_DC_STATELESS;
    #if (CPA_DC_API_VERSION_NUM_MAJOR == 1 && CPA_DC_API_VERSION_NUM_MINOR < 6)
    sd.deflateWindowSize = 7;
    #endif
    sd.checksum = CPA_DC_CRC32;

    /* Determine size of session context to allocate */
    PRINT_DBG("cpaDcGetSessionSize\n");
    status = cpaDcGetSessionSize(dcInstHandle, &sd, &sess_size, &ctx_size);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate session memory */
    status = PHYS_CONTIG_ALLOC(&sessionHdl, sess_size);
}

/* Initialize the Stateless session */
if (CPA_STATUS_SUCCESS == status) {
    PRINT_DBG("cpaDcDpInitSession\n");
    status = cpaDcDpInitSession(
            dcInstHandle,
            sessionHdl, /* Session memory */
            &sd); /* Session setup data */
}

In the following example, input and output data is stored in a scatter-gather list. The source and destination buffers are described using the CpaPhysBufferList structure. In this example, the allocation (which needs to be 8-byte aligned) and setup of the source buffer are shown. The destination buffers can be allocated and set up in a similar way; a sketch follows the source example below.

numBuffers = 2;

/* Size of CpaPhysBufferList and array of CpaPhysFlatBuffers */
bufferListMemSize = sizeof(CpaPhysBufferList) + (numBuffers * sizeof(CpaPhysFlatBuffer));

/* Allocate 8-byte aligned source buffer List */
status = PHYS_CONTIG_ALLOC_ALIGNED(&pBufferListSrc, bufferListMemSize, 8);

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate first data buffer to hold half the data */
    status = PHYS_CONTIG_ALLOC(&pSrcBuffer, (sizeof(sampleData)) / 2);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate second data buffer to hold half the data */
    status = PHYS_CONTIG_ALLOC(&pSrcBuffer2, (sizeof(sampleData)) / 2);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Copy source into buffer */
    memcpy(pSrcBuffer, sampleData, sizeof(sampleData) / 2);
    memcpy(pSrcBuffer2, &(sampleData[sizeof(sampleData) / 2]), sizeof(sampleData) / 2);
    /* Build source bufferList */
    pBufferListSrc->numBuffers = 2;
    pBufferListSrc->flatBuffers[0].dataLenInBytes = sizeof(sampleData) / 2;
    pBufferListSrc->flatBuffers[0].bufferPhysAddr = sampleVirtToPhys(pSrcBuffer);
    pBufferListSrc->flatBuffers[1].dataLenInBytes = sizeof(sampleData) / 2;
    pBufferListSrc->flatBuffers[1].bufferPhysAddr = sampleVirtToPhys(pSrcBuffer2);
}
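
The destination side can be built the same way. The sketch below assumes a single destination flat buffer sized SAMPLE_MAX_BUFF (the output buffer size used elsewhere in these samples); the exact sizing is application-dependent.

/* Allocate 8-byte aligned destination buffer list with one flat buffer */
bufferListMemSize = sizeof(CpaPhysBufferList) + sizeof(CpaPhysFlatBuffer);

status = PHYS_CONTIG_ALLOC_ALIGNED(&pBufferListDst, bufferListMemSize, 8);

if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pDstBuffer, SAMPLE_MAX_BUFF);
}

if (CPA_STATUS_SUCCESS == status) {
    pBufferListDst->numBuffers = 1;
    pBufferListDst->flatBuffers[0].dataLenInBytes = SAMPLE_MAX_BUFF;
    pBufferListDst->flatBuffers[0].bufferPhysAddr = sampleVirtToPhys(pDstBuffer);
}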

The operational data in this case is shown below.

/* Allocate memory for operational data. Note this needs to be
 * 8-byte aligned, contiguous, resident in DMA-accessible
 * memory */
status = PHYS_CONTIG_ALLOC_ALIGNED(&pOpData, sizeof(CpaDcDpOpData), 8);

if (CPA_STATUS_SUCCESS == status) {
    pOpData->bufferLenToCompress = sizeof(sampleData);
    pOpData->bufferLenForData = sizeof(sampleData);
    pOpData->dcInstance = dcInstHandle;
    pOpData->pSessionHandle = sessionHdl;
    pOpData->srcBuffer = sampleVirtToPhys(pBufferListSrc);
    pOpData->srcBufferLen = CPA_DP_BUFLIST;
    pOpData->destBuffer = sampleVirtToPhys(pBufferListDst);
    pOpData->destBufferLen = CPA_DP_BUFLIST;
    pOpData->sessDirection = CPA_DC_DIR_COMPRESS;
    pOpData->thisPhys = sampleVirtToPhys(pOpData);
    pOpData->pCallbackTag = (void *)0;
}

This request is then enqueued and submitted on the instance as shown below.

status = cpaDcDpEnqueueOp(pOpData, CPA_TRUE);
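
Passing CPA_TRUE asks the API to submit the request to the hardware immediately. Alternatively, several requests can be enqueued with CPA_FALSE and submitted together, which can reduce per-request overhead. A sketch is shown below; pOpData1 and pOpData2 are hypothetical operational data structures prepared as above.

/* Queue two requests locally without submitting them */
status = cpaDcDpEnqueueOp(pOpData1, CPA_FALSE);

if (CPA_STATUS_SUCCESS == status) {
    status = cpaDcDpEnqueueOp(pOpData2, CPA_FALSE);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Submit all requests queued on this instance in one operation */
    status = cpaDcDpPerformOpNow(dcInstHandle);
}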

After possibly doing other work (e.g., enqueuing and submitting more requests), the application can poll for responses that invoke the callback function registered with the instance. Refer to Related Documents and References for implementation-specific documentation on the implementation's polling functions.
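
As an illustration only, a polling loop might look like the sketch below. The polling function name is implementation-specific; icp_sal_DcPollDpInstance() is the one exposed by the Intel QAT software package, and gPollingDc is a hypothetical flag cleared when the application wants the polling thread to exit.

/* Poll the data plane instance until told to stop; a response
 * quota of 0 means process all available responses */
while (gPollingDc) {
    icp_sal_DcPollDpInstance(dcInstHandle, 0);
}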

Once all requests associated with a session have been completed, the session can be removed as shown below.

sessionStatus = cpaDcDpRemoveSession(dcInstHandle, sessionHdl);

Finally, clean up by freeing memory, stopping the instance, etc.

Chained Hash and Stateless Compression

This example demonstrates the use of the asynchronous API, specifically, using the data compression chain API to perform chained hash and stateless compression operations. It performs a SHA-256 hash on the sample text and then compresses the sample text through a stateless session using the deflate compress algorithm with static Huffman trees.

These samples are located in /dc/chaining_sample.

The below example shows how to query and start a compression instance.

/* In this simplified version of instance discovery, we discover
 * exactly one instance of a data compression service */
sampleDcGetInstance(&dcInstHandle);

if (dcInstHandle == NULL) {
    PRINT_ERR("Get instance failed\n");
    return CPA_STATUS_FAIL;
}

/* Query Capabilities */
PRINT_DBG("cpaDcQueryCapabilities\n");
status = cpaDcQueryCapabilities(dcInstHandle, &cap);

if (status != CPA_STATUS_SUCCESS) {
    PRINT_ERR("Query capabilities failed\n");
    return status;
}

if (CPA_FALSE == CPA_BITMAP_BIT_TEST(cap.dcChainCapInfo, CPA_DC_CHAIN_HASH_THEN_COMPRESS)) {
    PRINT_ERR("Hash + compress chained operation is not supported on logical instance.\n");
    PRINT_ERR("Please ensure Chaining related settings are enabled in the device configuration file.\n");
    return CPA_STATUS_FAIL;
}

if (!cap.statelessDeflateCompression || !cap.checksumCRC32 || !cap.checksumAdler32) {
    PRINT_ERR("Error: Unsupported functionality\n");
    return CPA_STATUS_FAIL;
}

if (CPA_STATUS_SUCCESS == status) {
    /* Set the address translation function for the instance */
    status = cpaDcSetAddressTranslation(dcInstHandle, sampleVirtToPhys);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Start static data compression component */
    PRINT_DBG("cpaDcStartInstance\n");
    status = cpaDcStartInstance(dcInstHandle, 0, NULL);
}

This example shows how to create and initialize the session.

if (CPA_STATUS_SUCCESS == status) {
    /* If the instance is polled start the polling thread. Note that
     * how the polling is done is implementation-dependent */
    sampleDcStartPolling(dcInstHandle);

    /* We now populate the fields of the session operational data and
     * create the session. Note that the size required to store a session is
     * implementation-dependent, so we query the API first to determine
     * how much memory to allocate, and then allocate that memory */

    /* Initialize compression session data */
    dcSessionData.compLevel = CPA_DC_L1;
    dcSessionData.compType = CPA_DC_DEFLATE;
    dcSessionData.huffType = CPA_DC_HT_STATIC;
    dcSessionData.autoSelectBestHuffmanTree = CPA_FALSE;
    dcSessionData.sessDirection = CPA_DC_DIR_COMPRESS;
    dcSessionData.sessState = CPA_DC_STATELESS;
    dcSessionData.checksum = CPA_DC_CRC32;

    /* Initialize crypto session data */
    cySessionData.sessionPriority = CPA_CY_PRIORITY_NORMAL;

    /* Hash operation on the source data */
    cySessionData.symOperation = CPA_CY_SYM_OP_HASH;
    cySessionData.hashSetupData.hashAlgorithm = CPA_CY_SYM_HASH_SHA256;
    cySessionData.hashSetupData.hashMode = CPA_CY_SYM_HASH_MODE_PLAIN;
    cySessionData.hashSetupData.digestResultLenInBytes = GET_HASH_DIGEST_LENGTH(cySessionData.hashSetupData.hashAlgorithm);

    /* Place the digest result in a buffer unrelated to srcBuffer */
    cySessionData.digestIsAppended = CPA_FALSE;

    /* Generate the digest */
    cySessionData.verifyDigest = CPA_FALSE;

    /* Initialize chaining session data - hash + compression chain operation */
    chainSessionData[0].sessType = CPA_DC_CHAIN_SYMMETRIC_CRYPTO;
    chainSessionData[0].pCySetupData = &cySessionData;
    chainSessionData[1].sessType = CPA_DC_CHAIN_COMPRESS_DECOMPRESS;
    chainSessionData[1].pDcSetupData = &dcSessionData;

    /* Determine size of session context to allocate */
    PRINT_DBG("cpaDcChainGetSessionSize\n");
    status = cpaDcChainGetSessionSize(dcInstHandle, CPA_DC_CHAIN_HASH_THEN_COMPRESS, NUM_SESSIONS_TWO, chainSessionData, &sess_size);
}

if (CPA_STATUS_SUCCESS == status) {
    /* Allocate session memory */
    status = PHYS_CONTIG_ALLOC(&sessionHdl, sess_size);
}

/* Initialize the chaining session */
if (CPA_STATUS_SUCCESS == status) {
    PRINT_DBG("cpaDcChainInitSession\n");
    status = cpaDcChainInitSession(dcInstHandle, sessionHdl, CPA_DC_CHAIN_HASH_THEN_COMPRESS, NUM_SESSIONS_TWO, chainSessionData, dcCallback);
}

Note

cySessionData.digestIsAppended should always be set to CPA_FALSE, as the digest must not be appended at the end of the output.

The next example shows the memory allocation for the chained hash and stateless compression.

status = cpaDcBufferListGetMetaSize(dcInstHandle, numBuffers, &bufferMetaSize);

if (CPA_STATUS_SUCCESS != status) {
    PRINT_ERR("Error get meta size\n");
    return CPA_STATUS_FAIL;
}

bufferSize = sampleDataSize;

if (CPA_STATUS_SUCCESS == status) {
    status = dcChainBuildBufferList(&pBufferListSrc, numBuffers, bufferSize, bufferMetaSize);
}

/* copy source data into buffer */
if (CPA_STATUS_SUCCESS == status) {
    pFlatBuffer = (CpaFlatBuffer *)(pBufferListSrc + 1);
    memcpy(pFlatBuffer->pData, sampleData, bufferSize);
}

/* Allocate destination buffer four times the size of the source buffer */
if (CPA_STATUS_SUCCESS == status) {
    status = dcChainBuildBufferList(&pBufferListDst, numBuffers, 4 * bufferSize, bufferMetaSize);
}

/* Allocate digest result buffer to store hash value */
if (CPA_STATUS_SUCCESS == status) {
    status = PHYS_CONTIG_ALLOC(&pDigestBuffer, GET_HASH_DIGEST_LENGTH(hashAlg));
}

The following example sets up the operational data.

dcOpData.flushFlag = CPA_DC_FLUSH_FINAL;
dcOpData.compressAndVerify = CPA_TRUE;
dcOpData.compressAndVerifyAndRecover = CPA_TRUE;
cySymOpData.packetType = CPA_CY_SYM_PACKET_TYPE_FULL;
cySymOpData.hashStartSrcOffsetInBytes = 0;
cySymOpData.messageLenToHashInBytes = bufferSize;
cySymOpData.pDigestResult = pDigestBuffer;

/* Set chaining operation data */
chainOpData[0].opType = CPA_DC_CHAIN_SYMMETRIC_CRYPTO;
chainOpData[0].pCySymOp = &cySymOpData;
chainOpData[1].opType = CPA_DC_CHAIN_COMPRESS_DECOMPRESS;
chainOpData[1].pDcOp = &dcOpData;
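
With the session and operational data in place, the chained operation itself can be submitted. The sketch below assumes chainResults is a CpaDcChainRqResults structure and that completion of the asynchronous request is signaled through the callback registered at session initialization (the callback tag shown is a hypothetical completion variable).

PRINT_DBG("cpaDcChainPerformOp\n");
status = cpaDcChainPerformOp(
        dcInstHandle,
        sessionHdl,
        pBufferListSrc, /* Source buffer list */
        pBufferListDst, /* Destination buffer list */
        CPA_DC_CHAIN_HASH_THEN_COMPRESS,
        NUM_SESSIONS_TWO, /* Number of operations in the chain */
        chainOpData, /* Chain operation data */
        &chainResults, /* Chain results structure */
        (void *)&complete); /* Callback tag */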

Hash and stateless dynamic compression are also supported. Refer to the previous examples to add the dynamic compression related buffers and session data; a brief sketch of the differences is shown below.
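
As a sketch, the main differences from the static example are the Huffman type in the session data and, if the capability query reports dynamicHuffmanBufferReq, the intermediate buffers passed when starting the instance (allocated as in the stateless sample earlier in this section).

/* Use dynamic Huffman trees in the compression session data */
dcSessionData.huffType = CPA_DC_HT_FULL_DYNAMIC;

/* If required by the implementation, start the instance with
 * intermediate buffers for dynamic deflate */
status = cpaDcStartInstance(dcInstHandle, numInterBuffLists, bufferInterArray);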

Note

Hash algorithms are not limited to SHA-1 and SHA-256. Refer to Related Documents and References (Intel® QuickAssist Technology Software for Linux Release Notes*) for any limitations on using other hash algorithms in the current release.

The following example shows how to verify the output of the chained hash and stateless compression.

/* Use software to calculate digest and verify digest */
if (CPA_STATUS_SUCCESS == status) {
    PHYS_CONTIG_ALLOC(&pSWDigestBuffer, GET_HASH_DIGEST_LENGTH(hashAlg));
    status = calSWDigest(sampleData, bufferSize, pSWDigestBuffer, GET_HASH_DIGEST_LENGTH(hashAlg), hashAlg);

    if (CPA_STATUS_SUCCESS == status) {
        if (memcmp(pDigestBuffer, pSWDigestBuffer, GET_HASH_DIGEST_LENGTH(hashAlg))) {
            status = CPA_STATUS_FAIL;
            PRINT_ERR("Digest buffer does not match expected output\n");
        } else {
            PRINT_DBG("Digest buffer matches expected output\n");
        }
    }

    PHYS_CONTIG_FREE(pSWDigestBuffer);
}

/* Use zlib to decompress and verify integrity */

if (CPA_STATUS_SUCCESS == status) {
    struct z_stream_s stream = {0};
    Cpa8U *pDecompBuffer = NULL;
    Cpa8U *pHWCompBuffer = NULL;
    Cpa8U *pSWCompBuffer = NULL;
    Cpa32U bufferLength = 0;
    status = inflate_init(&stream);

    if (CPA_STATUS_SUCCESS != status) {
        PRINT("zlib stream initialize failed");
    }

    bufferLength = pBufferListSrc->numBuffers * pBufferListSrc->pBuffers->dataLenInBytes;

    if (CPA_STATUS_SUCCESS == status) {
        status = PHYS_CONTIG_ALLOC(&pDecompBuffer, bufferLength);
    }

    if (CPA_STATUS_SUCCESS == status) {
        status = PHYS_CONTIG_ALLOC(&pHWCompBuffer, bufferLength);
    }

    if (CPA_STATUS_SUCCESS == status) {
        status = PHYS_CONTIG_ALLOC(&pSWCompBuffer, bufferLength);
    }

    if (CPA_STATUS_SUCCESS == status) {
        copyMultiFlatBufferToBuffer(pBufferListDst, pHWCompBuffer);
    }

    if (CPA_STATUS_SUCCESS == status) {
        status = inflate_decompress(&stream, pHWCompBuffer, bufferLength, pDecompBuffer, bufferLength);

        if (CPA_STATUS_SUCCESS != status) {
            PRINT_ERR("Decompress data on zlib stream failed\n");
        }
    }

    if (CPA_STATUS_SUCCESS == status) {
        /* Compare with original Src buffer */
        if (memcmp(pDecompBuffer, sampleData, bufferSize)) {
            status = CPA_STATUS_FAIL;
            PRINT_ERR("Decompression does not match source buffer\n");
        } else {
            PRINT_DBG("Decompression matches source buffer\n");
        }
    }

    inflate_destroy(&stream);
    PHYS_CONTIG_FREE(pSWCompBuffer);
    PHYS_CONTIG_FREE(pHWCompBuffer);
    PHYS_CONTIG_FREE(pDecompBuffer);
}

Once all operations on this session have been completed, the session is torn down as shown below.

status = cpaDcChainRemoveSession(dcInstHandle, sessionHdl);