neural_compressor.utils.pytorch

PyTorch utilities.

Module Contents

Functions

is_int8_model(model)

Check whether the input model is an int8 model.

load_weight_only(checkpoint_dir, model)

Load a model in weight_only mode.

load([checkpoint_dir, model, history_cfg])

Execute the quantization process on the specified model.

neural_compressor.utils.pytorch.is_int8_model(model)[source]

Check whether the input model is an int8 model.

Parameters:

model (torch.nn.Module) – input model

Returns:

Return True if the input model is an int8 model.

Return type:

result(bool)
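
A minimal usage sketch; the module below is an illustrative placeholder, and a plain FP32 module is expected to return False:

    import torch
    from neural_compressor.utils.pytorch import is_int8_model

    fp32_model = torch.nn.Linear(4, 2)   # plain FP32 module
    print(is_int8_model(fp32_model))     # expected: False for an FP32 model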

neural_compressor.utils.pytorch.load_weight_only(checkpoint_dir, model)[source]

Load a model in weight_only mode.

Parameters:
  • checkpoint_dir (dir/file/dict) – The checkpoint directory. 'qconfig.json' and 'best_model.pt' must exist in this directory. The 'checkpoint' directory is located under the workspace folder, which is defined in the configuration YAML file.

  • model (object) – The FP32 model to be quantized.

Returns:

quantized model

Return type:

(object)
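
A hedged usage sketch, assuming a checkpoint directory that already contains 'qconfig.json' and 'best_model.pt' from a prior weight-only quantization run; the path and the model definition are illustrative placeholders:

    import torch
    from neural_compressor.utils.pytorch import load_weight_only

    fp32_model = torch.nn.Sequential(torch.nn.Linear(8, 4))  # stand-in FP32 model
    # './saved_results' is a hypothetical checkpoint folder for illustration
    q_model = load_weight_only('./saved_results', fp32_model)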

neural_compressor.utils.pytorch.load(checkpoint_dir=None, model=None, history_cfg=None, **kwargs)[source]

Execute the quantization process on the specified model.

Parameters:
  • checkpoint_dir (dir/file/dict) – The checkpoint directory. 'best_configure.yaml' and 'best_model_weights.pt' must exist in this directory. The 'checkpoint' directory is located under the workspace folder, which is defined in the configuration YAML file.

  • model (object) – The FP32 model to be quantized.

  • history_cfg (object) – Configurations loaded from the history.snapshot file.

  • **kwargs (dict) – Additional options, such as a custom configuration dict.

Returns:

quantized model

Return type:

(object)
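
A hedged usage sketch, assuming a checkpoint directory that already contains 'best_configure.yaml' and 'best_model_weights.pt' from a prior quantization run; the path and the model definition are illustrative placeholders:

    import torch
    from neural_compressor.utils.pytorch import load

    fp32_model = torch.nn.Sequential(torch.nn.Linear(8, 4))  # stand-in FP32 model
    # './saved_results' is a hypothetical checkpoint folder for illustration
    q_model = load(checkpoint_dir='./saved_results', model=fp32_model)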