neural_compressor.onnxrt.quantization

Package Contents

Classes

CalibrationDataReader

Get data for calibration.

RTNConfig

Config class for round-to-nearest (RTN) weight-only quantization.

GPTQConfig

Config class for GPTQ weight-only quantization.

AWQConfig

Config class for AWQ weight-only quantization.

SmoohQuantConfig

Config class for SmoothQuant quantization.
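
To make the round-to-nearest idea behind RTNConfig concrete, here is a minimal pure-Python sketch of symmetric RTN quantization for one group of weights. It is a conceptual illustration only, not the library's implementation: the helper names `rtn_quantize` and `dequantize` and the `bits` parameter are hypothetical and do not correspond to any option of RTNConfig.

```python
def rtn_quantize(weights, bits=4):
    """Symmetric round-to-nearest quantization of one group of float weights.

    Hypothetical helper for illustration; not part of neural_compressor.
    """
    qmax = 2 ** (bits - 1) - 1                      # largest signed level, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # one scale shared by the group
    # Round each weight to the nearest integer level and clamp to the signed range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate float weights."""
    return [v * scale for v in q]


weights = [0.7, -0.32, 0.11, 0.0]
q, scale = rtn_quantize(weights, bits=4)
restored = dequantize(q, scale)
# Each restored weight differs from the original by at most half a step (scale / 2).
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

RTN needs no calibration data because the scale comes from the weights alone. GPTQ and AWQ, by contrast, are calibration-based methods that use sample inputs to reduce the quantization error.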

Functions

smooth_quant_entry(→ onnx.ModelProto)

Apply SmoothQuant quantization.

rtn_quantize_entry(→ onnx.ModelProto)

The main entry point for applying RTN quantization.

gptq_quantize_entry(→ onnx.ModelProto)

The main entry point for applying GPTQ quantization.

awq_quantize_entry(→ onnx.ModelProto)

The main entry point for applying AWQ quantization.

get_default_rtn_config(→ RTNConfig)

Generate the default RTN config.

get_default_gptq_config(→ GPTQConfig)

Generate the default GPTQ config.

get_default_awq_config(→ AWQConfig)

Generate the default AWQ config.

get_default_sq_config(→ SmoohQuantConfig)

Generate the default SmoothQuant config.

autotune

get_all_config_set(...)
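
As background for the smooth quant entries listed above, the sketch below shows the per-channel scaling at the heart of the SmoothQuant technique: activations are divided by a per-input-channel factor and the matching weight rows are multiplied by it, so the matrix product is unchanged while activation outliers shrink. This is a pure-Python illustration under stated assumptions, not the code behind smooth_quant_entry. The names `smooth` and `matmul` and the `alpha` default are hypothetical, and every channel is assumed to have a nonzero maximum.

```python
def smooth(X, W, alpha=0.5):
    """Migrate quantization difficulty from activations X to weights W.

    X: activations as [tokens][in_channels]; W: weights as [in_channels][out_channels].
    Hypothetical helper; assumes every channel has a nonzero max magnitude.
    """
    n_in = len(W)
    scales = []
    for j in range(n_in):
        act_max = max(abs(row[j]) for row in X)   # per-channel activation range
        w_max = max(abs(v) for v in W[j])         # range of the matching weight row
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    X_s = [[row[j] / scales[j] for j in range(n_in)] for row in X]
    W_s = [[v * scales[j] for v in W[j]] for j in range(n_in)]
    return X_s, W_s, scales


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


X = [[1.0, 100.0], [-2.0, 50.0]]   # channel 1 carries an activation outlier
W = [[0.5, -0.25], [0.1, 0.2]]
X_s, W_s, scales = smooth(X, W)
before = matmul(X, W)
after = matmul(X_s, W_s)           # numerically equal to `before`
```

With alpha = 0.5 the outlier channel's activation range drops from 100 to about 4.5, while the corresponding weight row grows by the same factor, which makes both tensors easier to quantize.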