neural_compressor.onnxrt.quantization.algorithm_entry
Module Contents
Functions
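| smooth_quant_entry(model, quant_config, calibration_data_reader, *args, **kwargs) | Apply SmoothQuant quantization. |
| rtn_quantize_entry(model, quant_config, *args, **kwargs) | The main entry point for applying RTN quantization. |
| gptq_quantize_entry(model, quant_config, calibration_data_reader, *args, **kwargs) | The main entry point for applying GPTQ quantization. |
| awq_quantize_entry(model, quant_config, calibration_data_reader, *args, **kwargs) | The main entry point for applying AWQ quantization. |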
- neural_compressor.onnxrt.quantization.algorithm_entry.smooth_quant_entry(model: pathlib.Path | str, quant_config: neural_compressor.onnxrt.quantization.config.SmoohQuantConfig, calibration_data_reader: neural_compressor.onnxrt.quantization.calibrate.CalibrationDataReader, *args, **kwargs) -> onnx.ModelProto
Apply SmoothQuant quantization to the model, using calibration_data_reader to supply calibration data.
- neural_compressor.onnxrt.quantization.algorithm_entry.rtn_quantize_entry(model: pathlib.Path | str, quant_config: neural_compressor.onnxrt.quantization.config.RTNConfig, *args, **kwargs) -> onnx.ModelProto
The main entry point for applying RTN (round-to-nearest) quantization. Unlike the other entries in this module, it takes no calibration_data_reader.
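To make the term concrete, the idea behind round-to-nearest quantization can be sketched in a few lines of pure Python. This is an illustration of the general technique only, not the library's implementation: the helper names `rtn_quantize` and `dequantize`, the symmetric per-group scale, and the 4-bit width are all assumptions chosen for the example.

```python
from typing import List, Tuple


def rtn_quantize(weights: List[float], num_bits: int = 4) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one group of weights.

    Hypothetical helper for illustration; not part of neural_compressor.
    """
    qmax = 2 ** (num_bits - 1) - 1  # largest positive level, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    # One scale per group; fall back to 1.0 when every weight is zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each scaled weight to the nearest integer and clamp to the signed range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Map integer levels back to approximate floating-point weights."""
    return [v * scale for v in q]


weights = [0.7, -0.3, 0.12, 0.0]
q, scale = rtn_quantize(weights, num_bits=4)
restored = dequantize(q, scale)
print(q, scale, restored)
```

Because each weight is rounded independently, no calibration data is needed, which is consistent with `rtn_quantize_entry` being the only entry here without a `calibration_data_reader` parameter. The rounding error per weight is bounded by half the scale.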
- neural_compressor.onnxrt.quantization.algorithm_entry.gptq_quantize_entry(model: pathlib.Path | str, quant_config: neural_compressor.onnxrt.quantization.config.GPTQConfig, calibration_data_reader: neural_compressor.onnxrt.quantization.calibrate.CalibrationDataReader, *args, **kwargs) -> onnx.ModelProto
The main entry point for applying GPTQ quantization, using calibration_data_reader to supply calibration data.
- neural_compressor.onnxrt.quantization.algorithm_entry.awq_quantize_entry(model: pathlib.Path | str, quant_config: neural_compressor.onnxrt.quantization.config.AWQConfig, calibration_data_reader: neural_compressor.onnxrt.quantization.calibrate.CalibrationDataReader, *args, **kwargs) -> onnx.ModelProto
The main entry point for applying AWQ (activation-aware weight quantization), using calibration_data_reader to supply calibration data.