neural_compressor.tensorflow.quantization.quantize

Intel Neural Compressor TensorFlow quantization base API.

Functions

`need_apply`(configs_mapping, algo_name)	Whether to apply the algorithm.
`quantize_model`(model, quant_config[, ...])	The main entry point for quantizing a model.
`quantize_model_with_single_config`(q_model, quant_config)	Quantize a model using a single config.

Module Contents

neural_compressor.tensorflow.quantization.quantize.need_apply(configs_mapping: Dict[Tuple[str, callable], neural_compressor.common.base_config.BaseConfig], algo_name)[source]

Whether to apply the algorithm.
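As a rough illustration of what such a check could look like, the sketch below assumes that each config object exposes a `name` attribute and that the algorithm applies when any config in the mapping carries the requested name. `FakeConfig`, `need_apply_sketch`, and the mapping keys are hypothetical stand-ins for this example and are not part of INC.

```python
from typing import Callable, Dict, Tuple


class FakeConfig:
    """Hypothetical stand-in for a BaseConfig subclass; only `name` is modelled."""

    def __init__(self, name: str):
        self.name = name


def need_apply_sketch(
    configs_mapping: Dict[Tuple[str, Callable], FakeConfig], algo_name: str
) -> bool:
    # Assumption: the algorithm is needed if at least one mapped config
    # is named after it.
    return any(config.name == algo_name for config in configs_mapping.values())


# Keys follow the documented Tuple[str, callable] shape; the values used
# here are placeholders only.
mapping = {
    ("conv1", str): FakeConfig("static_quant"),
    ("dense1", str): FakeConfig("static_quant"),
}
```

Under these assumptions, `need_apply_sketch(mapping, "static_quant")` is true, while a name that no config carries, or an empty mapping, yields false.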

neural_compressor.tensorflow.quantization.quantize.quantize_model(model: str | tensorflow.keras.Model | neural_compressor.tensorflow.utils.BaseModel, quant_config: neural_compressor.common.base_config.BaseConfig | list, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)[source]

The main entry point for quantizing a model.

Parameters:

model – an FP32 model to be quantized.
quant_config – a single quantization configuration or a list of them.
calib_dataloader – a data loader for calibration.
calib_iteration – the number of calibration iterations.
calib_func – a function used for calibration; it should substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.

Returns:

q_model – the quantized model.
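A minimal usage sketch follows, assuming `StaticQuantConfig` and `quantize_model` can be imported from `neural_compressor.tensorflow`, and that the calibration loader is an iterable of `(inputs, labels)` batches exposing a `batch_size` attribute. `CalibDataLoader` and `quantize_keras_model` are hypothetical helpers written for this illustration; the INC import is deferred into the function so the loader can be exercised on its own.

```python
class CalibDataLoader:
    """Hypothetical calibration loader over pre-batched (inputs, labels) pairs."""

    def __init__(self, batches, batch_size):
        self.batches = list(batches)
        # Assumption: INC reads a `batch_size` attribute from the loader.
        self.batch_size = batch_size

    def __len__(self):
        return len(self.batches)

    def __iter__(self):
        for inputs, labels in self.batches:
            yield inputs, labels


def quantize_keras_model(model, batches, batch_size):
    """Sketch: static post-training quantization with default settings."""
    # Deferred import so this module loads even where INC is not installed.
    from neural_compressor.tensorflow import StaticQuantConfig, quantize_model

    loader = CalibDataLoader(batches, batch_size)
    quant_config = StaticQuantConfig()  # default configuration, assumed usable as-is
    return quantize_model(
        model,
        quant_config,
        calib_dataloader=loader,
        calib_iteration=len(loader),  # calibrate on every available batch
    )
```

In real use the batches would typically hold NumPy arrays or tensors shaped for the model input; `model` may be a saved-model path, a `tensorflow.keras.Model`, or an INC `BaseModel`, as the signature above allows.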

neural_compressor.tensorflow.quantization.quantize.quantize_model_with_single_config(q_model: neural_compressor.tensorflow.utils.BaseModel, quant_config: neural_compressor.common.base_config.BaseConfig, calib_dataloader: Callable = None, calib_iteration: int = 100, calib_func: Callable = None)[source]

Quantize a model using a single config.

Parameters:

q_model – a model wrapped by the INC TF model class.
quant_config – a quantization configuration.
calib_dataloader – a data loader for calibration.
calib_iteration – the number of calibration iterations.
calib_func – a function used for calibration; it should substitute for calib_dataloader when the built-in calibration function of INC does not work for model inference.

Returns:

q_model – the quantized model.