neural_compressor.tensorflow.quantization.algorithm_entry
The entry interface for algorithms.
Functions
static_quant_entry | The main entry to apply static quantization.
smooth_quant_entry | The main entry to apply smooth quantization.
Module Contents
- neural_compressor.tensorflow.quantization.algorithm_entry.static_quant_entry(model: neural_compressor.tensorflow.utils.BaseModel, quant_config: neural_compressor.common.base_config.BaseConfig, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)
The main entry to apply static quantization.
- Parameters:
model – an fp32 model to be quantized.
quant_config – a quantization configuration.
calib_dataloader – a data loader for calibration.
calib_iteration – the number of calibration iterations.
calib_func – the function used for calibration; it should be a substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.
- Returns:
the quantized model.
- Return type:
q_model
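The relationship between calib_dataloader and calib_iteration can be sketched without the library itself. The example below is a minimal illustration, not INC's implementation: ToyCalibLoader and collect_ranges are hypothetical names, and the assumptions that a calibration loader is an iterable of (inputs, labels) batches exposing batch_size, and that calib_iteration caps the number of batches consumed, are inferred from the parameter descriptions above rather than confirmed by them.

```python
import math


class ToyCalibLoader:
    """Hypothetical calibration loader: an iterable of (inputs, labels) batches."""

    def __init__(self, samples, batch_size=2):
        self.samples = samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch.
        return math.ceil(len(self.samples) / self.batch_size)

    def __iter__(self):
        for start in range(0, len(self.samples), self.batch_size):
            batch = self.samples[start:start + self.batch_size]
            yield batch, None  # labels are not needed in this sketch


def collect_ranges(dataloader, calib_iteration=100):
    """Track the min/max of the inputs over at most calib_iteration batches."""
    lo, hi = float("inf"), float("-inf")
    seen = 0
    for inputs, _ in dataloader:
        if seen >= calib_iteration:
            break  # assumed meaning of calib_iteration: stop after this many batches
        lo = min(lo, min(inputs))
        hi = max(hi, max(inputs))
        seen += 1
    return lo, hi, seen


loader = ToyCalibLoader([0.5, -1.0, 2.0, 3.5, -4.0, 9.0], batch_size=2)
print(collect_ranges(loader, calib_iteration=2))  # (-1.0, 3.5, 2)
```

With calib_iteration=2 only the first two batches contribute, so the observed range misses the later values -4.0 and 9.0. This is why the iteration count affects the ranges a static calibration pass would see.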
- neural_compressor.tensorflow.quantization.algorithm_entry.smooth_quant_entry(model: neural_compressor.tensorflow.utils.BaseModel, smooth_quant_config: neural_compressor.tensorflow.quantization.config.SmoothQuantConfig, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)
The main entry to apply smooth quantization.
- Parameters:
model – an fp32 model to be quantized.
smooth_quant_config – a smooth quantization configuration.
calib_dataloader – a data loader for calibration.
calib_iteration – the number of calibration iterations.
calib_func – the function used for calibration; it should be a substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.
- Returns:
the quantized model.
- Return type:
q_model
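Both entries accept either calib_dataloader or calib_func. The sketch below shows one way such a choice could be dispatched. It is an illustration under stated assumptions, not the library's code: run_calibration is a hypothetical helper, the convention that calib_func receives the model as its only argument is assumed, and giving calib_func precedence when both are supplied is also an assumption.

```python
def run_calibration(model, calib_dataloader=None, calib_iteration=100, calib_func=None):
    """Hypothetical dispatcher: returns the name of the calibration path taken."""
    if calib_func is not None:
        # Assumed convention: the user function drives inference itself.
        calib_func(model)
        return "calib_func"
    if calib_dataloader is None:
        raise ValueError("provide either calib_dataloader or calib_func")
    for step, (inputs, _labels) in enumerate(calib_dataloader):
        if step >= calib_iteration:
            break  # stop once the assumed batch cap is reached
        model(inputs)
    return "calib_dataloader"


# Toy stand-ins: a "model" that records what it was called with.
seen_inputs = []


def toy_model(x):
    seen_inputs.append(x)
    return x


def custom_calib(model):
    # A user-defined loop, e.g. for inputs the built-in path cannot feed.
    for x in (1, 2, 3):
        model(x)


print(run_calibration(toy_model, calib_func=custom_calib))  # calib_func
print(seen_inputs)  # [1, 2, 3]
```

The custom function keeps full control over how the model is invoked, which matches the role the parameter description gives it: a fallback for models that the built-in calibration loop cannot run.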