API Documentation

Introduction

Intel® Neural Compressor is an open-source Python library designed to help users quickly deploy low-precision inference solutions on popular deep learning (DL) frameworks such as TensorFlow*, PyTorch*, MXNet, and ONNX Runtime. It automatically tunes low-precision recipes for deep learning models to meet product objectives, such as inference performance and memory usage, while satisfying the expected accuracy criteria.

User-facing APIs

These APIs are intended to unify low-precision quantization interfaces across multiple DL frameworks for the best out-of-the-box experience.

Note

Neural Compressor is continuously improving user-facing APIs to create a better user experience.

Two sets of user-facing APIs exist. The first is the default set, supported since Neural Compressor v1.0 and kept for backwards compatibility. The second consists of new APIs in the neural_compressor.experimental package.

We recommend that you use the APIs located in neural_compressor.experimental. All examples have been updated to use the experimental APIs.

The major differences between the default user-facing APIs and the experimental APIs are:

  1. The experimental APIs introduce the neural_compressor.experimental.common.Model abstraction to cover models whose weight and graph files are stored separately.

  2. The experimental APIs unify the calling style of the Quantization, Pruning, and Benchmark classes by setting model, calibration dataloader, evaluation dataloader, and metric through class attributes rather than passing them as function inputs.

  3. The experimental APIs refine Neural Compressor built-in transforms/datasets/metrics by unifying the APIs across different framework backends.
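
The difference in calling style described in item 2 can be sketched with a toy example. The classes ArgumentStyleQuantizer and AttributeStyleQuantizer, along with the helpers toy_quantize and toy_metric, are hypothetical stand-ins written for this illustration. They are not Neural Compressor classes, and the "quantization" here is only a rounding step on a list of floats. The sketch shows the shape of the two conventions, not the library's actual signatures.

```python
from typing import Callable, List, Tuple

Weights = List[float]


def toy_quantize(weights: Weights, calib_data: List[float]) -> Weights:
    """Hypothetical stand-in: snap weights to a grid whose step is derived
    from the largest calibration value (assumed nonzero)."""
    scale = max(abs(x) for x in calib_data) / 127.0
    return [round(w / scale) * scale for w in weights]


def toy_metric(weights: Weights, eval_data: List[float]) -> float:
    """Hypothetical metric: mean of sum(w * x) over the evaluation inputs."""
    outputs = [sum(w * x for w in weights) for x in eval_data]
    return sum(outputs) / len(outputs)


class ArgumentStyleQuantizer:
    """Hypothetical: every input is passed as a function argument."""

    def __call__(
        self,
        model: Weights,
        calib_dataloader: List[float],
        eval_dataloader: List[float],
        metric: Callable[[Weights, List[float]], float],
    ) -> Tuple[Weights, float]:
        q_model = toy_quantize(model, calib_dataloader)
        return q_model, metric(q_model, eval_dataloader)


class AttributeStyleQuantizer:
    """Hypothetical: inputs are set as attributes, then fit() takes no arguments."""

    _REQUIRED = ("model", "calib_dataloader", "eval_dataloader", "metric")

    def __init__(self) -> None:
        self.model = None
        self.calib_dataloader = None
        self.eval_dataloader = None
        self.metric = None

    def fit(self) -> Tuple[Weights, float]:
        # Fail early with a clear message if any attribute was not set.
        missing = [name for name in self._REQUIRED if getattr(self, name) is None]
        if missing:
            raise ValueError("unset attributes: " + ", ".join(missing))
        q_model = toy_quantize(self.model, self.calib_dataloader)
        return q_model, self.metric(q_model, self.eval_dataloader)


# Toy inputs (made up for the illustration).
model = [0.5, -1.27, 0.25]
calib = [1.27, -0.3]
evald = [1.0, 2.0]

# Argument style: everything goes through the call.
args_result = ArgumentStyleQuantizer()(model, calib, evald, toy_metric)

# Attribute style: configure the object, then run it.
quantizer = AttributeStyleQuantizer()
quantizer.model = model
quantizer.calib_dataloader = calib
quantizer.eval_dataloader = evald
quantizer.metric = toy_metric
attr_result = quantizer.fit()
```

Both styles compute the same result in this sketch. The attribute style lets several classes share one configuration pattern, since each exposes the same attribute names and a run method that takes no inputs.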

Experimental user-facing APIs

Experimental user-facing APIs consist of the following components:

Default user-facing APIs

The default user-facing APIs are kept for backwards compatibility with the v1.0 release. Refer to v1.1 API to understand how the default user-facing APIs work.

For reference, see the HelloWorld example, which uses the default user-facing APIs.

Full examples using default user-facing APIs can be found here.