neural_compressor.tensorflow.quantization.algorithm_entry

The entry interface for algorithms.

Functions

static_quant_entry(model, quant_config[, ...])

The main entry to apply static quantization.

smooth_quant_entry(model, smooth_quant_config[, ...])

The main entry to apply smooth quantization.

Module Contents

neural_compressor.tensorflow.quantization.algorithm_entry.static_quant_entry(model: neural_compressor.tensorflow.utils.BaseModel, quant_config: neural_compressor.common.base_config.BaseConfig, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)

The main entry to apply static quantization.

Parameters:
  • model – an FP32 model to be quantized.

  • quant_config – a quantization configuration.

  • calib_dataloader – a data loader for calibration.

  • calib_iteration – the number of calibration iterations.

  • calib_func – the function used for calibration; it should substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.

Returns:

the quantized model.

Return type:

q_model
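The calibration parameters above can be illustrated with a small, library-free sketch. It assumes (this page does not state it) that a calibration data loader is an iterable yielding `(inputs, labels)` batches and exposing a `batch_size` attribute, and that `calib_iteration` caps the number of batches consumed. `ListCalibLoader` and `take_calibration_batches` are hypothetical names used only for illustration; they are not part of INC.

```python
from itertools import islice


class ListCalibLoader:
    """Hypothetical calibration loader: yields (inputs, labels) batches.

    Assumption: INC accepts an iterable with a `batch_size` attribute.
    """

    def __init__(self, samples, labels, batch_size=2):
        self.samples = samples
        self.labels = labels
        self.batch_size = batch_size

    def __iter__(self):
        # Slice the sample list into consecutive batches.
        for start in range(0, len(self.samples), self.batch_size):
            end = start + self.batch_size
            yield self.samples[start:end], self.labels[start:end]


def take_calibration_batches(loader, calib_iteration=100):
    """Return at most `calib_iteration` batches from the loader."""
    return list(islice(loader, calib_iteration))


loader = ListCalibLoader(
    samples=[[0.1], [0.2], [0.3], [0.4], [0.5]],
    labels=[0, 1, 0, 1, 0],
    batch_size=2,
)
batches = take_calibration_batches(loader, calib_iteration=2)
```

With the documented signature, such a loader would be passed as `calib_dataloader=loader` together with `calib_iteration=2`; if the loader holds fewer batches than `calib_iteration`, this sketch simply stops at the end of the data.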

neural_compressor.tensorflow.quantization.algorithm_entry.smooth_quant_entry(model: neural_compressor.tensorflow.utils.BaseModel, smooth_quant_config: neural_compressor.tensorflow.quantization.config.SmoothQuantConfig, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)

The main entry to apply smooth quantization.

Parameters:
  • model – an FP32 model to be quantized.

  • smooth_quant_config – a smooth quantization configuration.

  • calib_dataloader – a data loader for calibration.

  • calib_iteration – the number of calibration iterations.

  • calib_func – the function used for calibration; it should substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.

Returns:

the quantized model.

Return type:

q_model
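When the built-in calibration does not fit a model, `calib_func` replaces `calib_dataloader`. The sketch below shows one possible shape for such a function. It assumes (this page does not specify it) that the callable receives the model as its only argument and is responsible for running inference itself. `make_calib_func` and `RecordingModel` are hypothetical helpers, and the stub model stands in for a real TensorFlow model.

```python
def make_calib_func(batches, max_iterations=100):
    """Build a calibration function that feeds fixed batches to a model.

    Assumption: the calibration function is called as calib_func(model).
    """

    def calib_func(model):
        for step, batch in enumerate(batches):
            if step >= max_iterations:
                break
            # Run inference only; outputs are discarded during calibration.
            model(batch)

    return calib_func


class RecordingModel:
    """Stub model that records every batch it is asked to run."""

    def __init__(self):
        self.seen = []

    def __call__(self, batch):
        self.seen.append(batch)
        return [sum(row) for row in batch]


stub = RecordingModel()
calib_func = make_calib_func(
    [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]],
    max_iterations=2,
)
calib_func(stub)
```

Following the signature above, the resulting callable would be supplied as `calib_func=calib_func` with `calib_dataloader` left at its default of `None`.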