neural_compressor.experimental.quantization

Neural Compressor Quantization API.

Module Contents

Classes

Quantization

This class provides an easy-to-use API for quantization.

class neural_compressor.experimental.quantization.Quantization(conf_fname_or_obj=None)

Bases: neural_compressor.experimental.component.Component

This class provides an easy-to-use API for quantization.

It automatically searches for optimal quantization recipes for low-precision model inference, achieving tuning objectives such as inference performance within accuracy loss constraints. The tuner abstracts away the differences between quantization APIs across DL frameworks and provides a unified API for automatic quantization that works on frameworks including TensorFlow, PyTorch, and MXNet. Because DL use cases vary in accuracy metrics (Top-1, mAP, ROC, etc.), loss criteria (<1%, <0.1%, etc.), and tuning objectives (performance, memory footprint, etc.), the Tuner class provides a flexible configuration interface via YAML for users to specify these parameters.

Parameters:

conf_fname_or_obj (string or obj) – The path to a YAML configuration file, or a QuantConf object, containing the accuracy goal, tuning objective, preferred calibration and quantization tuning space, etc.
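A minimal YAML configuration passed as conf_fname_or_obj might look like the following sketch. The field values are illustrative; consult the framework-specific configuration template for the full schema:

```yaml
model:                              # model metadata
  name: resnet50_v1
  framework: tensorflow             # e.g. tensorflow, pytorch, mxnet

quantization:                       # calibration & quantization tuning space
  calibration:
    sampling_size: 100

tuning:
  accuracy_criterion:
    relative: 0.01                  # allow at most 1% relative accuracy loss
  exit_policy:
    timeout: 0                      # 0: stop as soon as the criterion is met
```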

property calib_dataloader

Get calib_dataloader attribute.

property metric

Get metric attribute.

property objective

Get objective attribute.

property postprocess

Get postprocess attribute.

property q_func

Get q_func attribute.

property model

Override the model getter to handle the quantization-aware-training case.

pre_process()

Prepare dataloaders and q_funcs for the Component.

execute()

Execute the quantization routine based on the strategy design.

dataset(dataset_type, *args, **kwargs)

Get a dataset according to dataset_type.
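Putting the pieces above together, a typical flow is: construct Quantization from a configuration, attach a model and a calibration dataloader, then run the tuning loop. Below is a minimal sketch, assuming neural_compressor is installed; the function name, the conf.yaml path, and the model/dataloader arguments are placeholders, and the common.Model wrapper follows the experimental API:

```python
def quantize(fp32_model, conf_path="./conf.yaml", calib_dataloader=None):
    """Sketch of the tuning flow (assumption: neural_compressor is
    installed and conf_path points to a valid YAML configuration)."""
    # Imported lazily so the sketch stays self-contained.
    from neural_compressor.experimental import Quantization, common

    quantizer = Quantization(conf_path)
    quantizer.model = common.Model(fp32_model)   # wrap the framework model
    if calib_dataloader is not None:
        quantizer.calib_dataloader = calib_dataloader
    # fit() runs pre_process() and execute() under the hood and returns
    # the best quantized model found within the accuracy constraint.
    return quantizer.fit()
```

Passing a QuantConf object instead of a path works the same way, since conf_fname_or_obj accepts either form.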