neural_compressor.adaptor.onnxrt

Module Contents

Classes

ONNXRUNTIMEAdaptor

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

ONNXRT_WeightOnlyAdaptor

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

ONNXRT_QLinearOpsAdaptor

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

ONNXRT_IntegerOpsAdaptor

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

ONNXRT_QDQAdaptor

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

class neural_compressor.adaptor.onnxrt.ONNXRUNTIMEAdaptor(framework_specific_info)

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

Parameters:

framework_specific_info (dict) – Framework-specific configuration for quantization.
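
The page does not list the keys expected in `framework_specific_info`, so the following is only a minimal sketch of how such a dict might be assembled. The helper `make_framework_specific_info` and every key name (`device`, `approach`, `workspace_path`) are illustrative assumptions, not a documented schema; check the adaptor source for the keys your installed version actually reads.

```python
from typing import Any, Dict


def make_framework_specific_info(
    device: str = "cpu",
    approach: str = "post_training_static_quant",
    workspace_path: str = "./nc_workspace",
) -> Dict[str, Any]:
    """Assemble a plain dict to pass as `framework_specific_info`.

    Hypothetical helper: the key names below are assumptions made for
    illustration and are not taken from this reference page.
    """
    return {
        "device": device,                  # assumed key: target device
        "approach": approach,              # assumed key: quantization approach
        "workspace_path": workspace_path,  # assumed key: scratch directory
    }


info = make_framework_specific_info()
print(sorted(info))
```

With Neural Compressor installed, a dict like this would be passed as the single constructor argument, for example `ONNXRUNTIMEAdaptor(info)`. Whether these particular keys are sufficient depends on the library version.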

class neural_compressor.adaptor.onnxrt.ONNXRT_WeightOnlyAdaptor(framework_specific_info)

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

Parameters:

framework_specific_info (dict) – Framework-specific configuration for quantization.

class neural_compressor.adaptor.onnxrt.ONNXRT_QLinearOpsAdaptor(framework_specific_info)

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

Parameters:

framework_specific_info (dict) – Framework-specific configuration for quantization.

class neural_compressor.adaptor.onnxrt.ONNXRT_IntegerOpsAdaptor(framework_specific_info)

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

Parameters:

framework_specific_info (dict) – Framework-specific configuration for quantization.

class neural_compressor.adaptor.onnxrt.ONNXRT_QDQAdaptor(framework_specific_info)

The ONNXRT adaptor layer. Performs ONNX Runtime quantization and calibration, and inspects layer tensors.

Parameters:

framework_specific_info (dict) – Framework-specific configuration for quantization.
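
All five classes take the same single constructor argument, so one way to choose between them at run time is to look the class up by name. The sketch below assumes nothing beyond the class names listed on this page. The helper `find_adaptor_class` is hypothetical, and it returns `None` when Neural Compressor cannot be imported, so the snippet runs even without the library installed.

```python
import importlib
from typing import Optional

# Class names exactly as listed in this module's documentation.
ADAPTOR_CLASS_NAMES = (
    "ONNXRUNTIMEAdaptor",
    "ONNXRT_WeightOnlyAdaptor",
    "ONNXRT_QLinearOpsAdaptor",
    "ONNXRT_IntegerOpsAdaptor",
    "ONNXRT_QDQAdaptor",
)


def find_adaptor_class(name: str) -> Optional[type]:
    """Return the adaptor class called `name`, or None if it is unavailable.

    Hypothetical helper for illustration; it is not part of Neural Compressor.
    """
    if name not in ADAPTOR_CLASS_NAMES:
        raise ValueError(f"unknown adaptor class: {name!r}")
    try:
        module = importlib.import_module("neural_compressor.adaptor.onnxrt")
    except ImportError:
        # Neural Compressor (or one of its dependencies) is not installed.
        return None
    return getattr(module, name, None)


adaptor_cls = find_adaptor_class("ONNXRT_QDQAdaptor")
print(adaptor_cls)
```

If the lookup succeeds, `adaptor_cls(framework_specific_info)` would construct the adaptor, given a dict that contains whatever keys the installed version requires.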