neural_compressor.torch.quantization.quantize
Intel Neural Compressor PyTorch quantization base API.
Functions
- need_apply: Check whether to apply this algorithm according to configs_mapping.
- quantize: The main entry to quantize a model in static mode.
- prepare: Prepare the model for calibration.
- convert: Convert the prepared model to a quantized model.
- finalize_calibration: Generate and save calibration info.
Module Contents
- neural_compressor.torch.quantization.quantize.need_apply(configs_mapping: Dict[Tuple[str, callable], neural_compressor.common.base_config.BaseConfig], algo_name)[source]
Check whether to apply this algorithm according to configs_mapping.
- Parameters:
configs_mapping (Dict[Tuple[str, callable], BaseConfig]) – the mapping from operators to their quantization configs.
algo_name (str) – the name of the algorithm to check for.
- Returns:
True if the algorithm should be applied, False otherwise.
- Return type:
bool
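A minimal, illustrative sketch of how need_apply might be called; the RTNConfig import and the assumption that need_apply matches each config against the algorithm's registered name are mine, not guaranteed by this page:

```python
import torch
from neural_compressor.torch.quantization import RTNConfig  # assumed public config class
from neural_compressor.torch.quantization.quantize import need_apply

# Keys follow the documented Dict[Tuple[str, callable], BaseConfig] shape:
# (operator name, operator type) -> per-op config.
configs_mapping = {("fc1", torch.nn.Linear): RTNConfig()}

print(need_apply(configs_mapping, "rtn"))   # expected True: an RTN config is present
print(need_apply(configs_mapping, "gptq"))  # expected False: no GPTQ config in the mapping
```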
- neural_compressor.torch.quantization.quantize.quantize(model: torch.nn.Module, quant_config: neural_compressor.common.base_config.BaseConfig, run_fn: Callable = None, run_args: Any = None, inplace: bool = True, example_inputs: Any = None) -> torch.nn.Module [source]
The main entry to quantize a model in static mode.
- Parameters:
model – a float model to be quantized.
quant_config – a quantization configuration.
run_fn – a calibration function for calibrating the model. Defaults to None.
run_args – positional arguments for run_fn. Defaults to None.
inplace – whether to modify the given model in place. Defaults to True.
example_inputs – example inputs used to trace the torch model.
- Returns:
The quantized model.
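A minimal sketch of this one-shot path, assuming a StaticQuantConfig class exported from neural_compressor.torch.quantization and a toy model; both are illustrative assumptions, not prescribed by this page:

```python
import torch
from neural_compressor.torch.quantization import quantize, StaticQuantConfig  # StaticQuantConfig assumed

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
example_inputs = torch.randn(1, 8)

def run_fn(model):
    # Calibration: feed a few representative batches so the observers
    # inserted during quantization can collect tensor statistics.
    for _ in range(10):
        model(torch.randn(1, 8))

q_model = quantize(model, quant_config=StaticQuantConfig(),
                   run_fn=run_fn, example_inputs=example_inputs)
q_model(example_inputs)  # the returned module runs like the original
```

The same result can be reached step by step with prepare and convert, documented below, which is useful when calibration needs to happen outside a single callable.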
- neural_compressor.torch.quantization.quantize.prepare(model: torch.nn.Module, quant_config: neural_compressor.common.base_config.BaseConfig, inplace: bool = True, example_inputs: Any = None)[source]
Prepare the model for calibration.
Insert observers into the model so that they can monitor the input and output tensors during calibration.
- Parameters:
model (torch.nn.Module) – the original float model.
quant_config (BaseConfig) – a quantization configuration.
inplace (bool, optional) – whether to modify the given model in place. Defaults to True.
example_inputs (tensor/tuple/dict, optional) – example inputs used to trace the torch model.
- Returns:
The prepared model, ready for calibration.
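A sketch of the first half of the two-step flow, reusing the assumed StaticQuantConfig and toy model from above; the calibration loop is illustrative:

```python
import torch
from neural_compressor.torch.quantization import prepare, StaticQuantConfig  # StaticQuantConfig assumed

model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
prepared = prepare(model, quant_config=StaticQuantConfig(),
                   example_inputs=torch.randn(1, 8))

# Calibrate: run representative data through the prepared model so the
# inserted observers record tensor ranges.
for _ in range(10):
    prepared(torch.randn(1, 8))
```

convert, documented next, turns this calibrated model into the final quantized model.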
- neural_compressor.torch.quantization.quantize.convert(model: torch.nn.Module, quant_config: neural_compressor.common.base_config.BaseConfig = None, inplace: bool = True)[source]
Convert the prepared model to a quantized model.
- Parameters:
model (torch.nn.Module) – the prepared model.
quant_config (BaseConfig, optional) – a quantization configuration; only required when the model has not been prepared.
inplace (bool, optional) – whether to modify the given model in place. Defaults to True.
- Returns:
The quantized model.
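Continuing the prepare sketch above, a minimal finish to the two-step flow under the same assumptions:

```python
import torch
from neural_compressor.torch.quantization import convert

# `prepared` is the calibrated model from the prepare sketch above.
q_model = convert(prepared)      # quant_config omitted: the model was already prepared
q_model(torch.randn(1, 8))       # the quantized model is a drop-in replacement
```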