neural_compressor.torch.utils.utility

Intel Neural Compressor PyTorch utilities.

Functions

register_algo(name)

Decorator function to register algorithms in the algos_mapping dictionary.

fetch_module(model, op_name)

Get the module with a given op name.

set_module(model, op_name, new_module)

Set the module with a given op name.

get_model_info(→ List[Tuple[str, str]])

Get model info according to white_module_list.

get_double_quant_config_dict([double_quant_type])

Query config dict of double_quant according to double_quant_type.

get_quantizer(model, quantizer_cls[, quant_config])

Get the quantizer.

postprocess_model(model, mode, quantizer)

Process the quantizer attribute of the model according to the current phase.

dump_model_op_stats(mode, tune_cfg)

Dump the statistics of the model's quantizable ops for the user.

get_model_device(model)

Get the device of the model.

get_processor_type_from_user_config([user_processor_type])

Get the processor type.

dowload_hf_model(repo_id[, cache_dir, repo_type, revision])

Download a model from the Hugging Face Hub.

load_empty_model(pretrained_model_name_or_path[, cls])

Load an empty model.

Module Contents

neural_compressor.torch.utils.utility.register_algo(name)[source]

Decorator function to register algorithms in the algos_mapping dictionary.

Usage example:

    @register_algo(name="example_algo")
    def example_algo(model: torch.nn.Module, quant_config: RTNConfig) -> torch.nn.Module:
        ...

Parameters:

name (str) – The name under which the algorithm function will be registered.

Returns:

The decorator function to be used with algorithm functions.

Return type:

decorator
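
A runnable sketch of registering an algorithm (the name and the no-op body below are illustrative only, not a real algorithm):

    import torch

    from neural_compressor.torch.utils.utility import algos_mapping, register_algo

    @register_algo(name="noop_algo")  # "noop_algo" is a hypothetical name
    def noop_algo(model: torch.nn.Module, quant_config=None) -> torch.nn.Module:
        # A real entry would apply its quantization algorithm here.
        return model

    assert algos_mapping["noop_algo"] is noop_algo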

neural_compressor.torch.utils.utility.fetch_module(model, op_name)[source]

Get the module with a given op name.

Parameters:
  • model (object) – the input model.

  • op_name (str) – name of op.

Returns:

module (object).
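
For example (the toy model is illustrative; op names are the standard dotted PyTorch module paths):

    import torch

    from neural_compressor.torch.utils.utility import fetch_module

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
    # nn.Sequential names its children "0", "1", ...; nested submodules
    # use dotted paths such as "encoder.layer.0.attention".
    fc = fetch_module(model, "0")
    print(type(fc))  # <class 'torch.nn.modules.linear.Linear'>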

neural_compressor.torch.utils.utility.set_module(model, op_name, new_module)[source]

Set the module with a given op name.

Parameters:
  • model (object) – the input model.

  • op_name (str) – name of op.

  • new_module (object) – the new module to set at the given op name.

Returns:

module (object).
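
A minimal sketch, swapping one op in a toy model:

    import torch

    from neural_compressor.torch.utils.utility import fetch_module, set_module

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
    # In practice new_module would be a quantized replacement for the op.
    set_module(model, "0", torch.nn.Identity())
    assert isinstance(fetch_module(model, "0"), torch.nn.Identity)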

neural_compressor.torch.utils.utility.get_model_info(model: torch.nn.Module, white_module_list: List[Callable]) → List[Tuple[str, str]][source]

Get model info according to white_module_list, returned as a list of (op_name, op_type) pairs.
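
A hedged sketch, assuming each pair is (op_name, op_type) for modules whose class appears in white_module_list:

    import torch

    from neural_compressor.torch.utils.utility import get_model_info

    model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
    info = get_model_info(model, white_module_list=[torch.nn.Linear])
    print(info)  # e.g. [("0", "Linear")]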

neural_compressor.torch.utils.utility.get_double_quant_config_dict(double_quant_type='BNB_NF4')[source]

Query config dict of double_quant according to double_quant_type.

Parameters:

double_quant_type (str, optional) – double_quant type. Defaults to “BNB_NF4”.
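
For example (the exact keys of the returned dict depend on the library version):

    from neural_compressor.torch.utils.utility import get_double_quant_config_dict

    cfg = get_double_quant_config_dict()  # defaults to "BNB_NF4"
    # The dict carries double-quant settings (dtype, bits, group size, ...)
    # that can seed a weight-only quantization config.
    print(cfg)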

neural_compressor.torch.utils.utility.get_quantizer(model, quantizer_cls, quant_config=None, *args, **kwargs)[source]

Get the quantizer.

Initialize a new quantizer, or return the quantizer attribute already attached to the model.

Parameters:
  • model (torch.nn.Module) – pytorch model.

  • quantizer_cls (Quantizer) – quantizer class of a specific algorithm.

  • quant_config (dict, optional) – Specifies how to apply the algorithm on the given model. Defaults to None.

Returns:

quantizer object.

neural_compressor.torch.utils.utility.postprocess_model(model, mode, quantizer)[source]

Process the quantizer attribute of the model according to the current phase.

In the ‘prepare’ phase, the quantizer is set as an attribute of the model to avoid redundant initialization during the ‘convert’ phase.

In the ‘convert’ or ‘quantize’ phase, the no-longer-needed quantizer attribute is removed.

Parameters:
  • model (torch.nn.Module) – pytorch model.

  • mode (Mode) – The mode of the current phase: one of ‘prepare’, ‘convert’, and ‘quantize’.

  • quantizer (Quantizer) – quantizer object.
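
A hedged sketch of how get_quantizer and postprocess_model cooperate across the prepare/convert flow; DummyQuantizer stands in for a real algorithm’s Quantizer class, and Mode (with PREPARE/CONVERT members) is assumed importable from neural_compressor.torch.utils:

    import torch

    from neural_compressor.torch.utils import Mode
    from neural_compressor.torch.utils.utility import get_quantizer, postprocess_model

    class DummyQuantizer:  # stand-in for a real Quantizer subclass
        def __init__(self, quant_config=None):
            self.quant_config = quant_config

    model = torch.nn.Linear(8, 8)

    # 'prepare' phase: a new quantizer is created and cached on the model.
    quantizer = get_quantizer(model, DummyQuantizer, quant_config={"bits": 4})
    postprocess_model(model, Mode.PREPARE, quantizer)

    # 'convert' phase: the cached quantizer is reused, then detached.
    quantizer = get_quantizer(model, DummyQuantizer)
    postprocess_model(model, Mode.CONVERT, quantizer)
    assert not hasattr(model, "quantizer")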

neural_compressor.torch.utils.utility.dump_model_op_stats(mode, tune_cfg)[source]

Dump the statistics of the model's quantizable ops for the user.

Parameters:
  • mode (object) – quantization mode.

  • tune_cfg (dict) – the quantization config.

neural_compressor.torch.utils.utility.get_model_device(model: torch.nn.Module)[source]

Get the device of the model.

Parameters:

model (torch.nn.Module) – the input model.

Returns:

the device of the model as a string, e.g. ‘cpu’ or ‘cuda:0’.

Return type:

device (str)
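
For example:

    import torch

    from neural_compressor.torch.utils.utility import get_model_device

    model = torch.nn.Linear(4, 4)
    print(get_model_device(model))  # "cpu" on a CPU-only machine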

neural_compressor.torch.utils.utility.get_processor_type_from_user_config(user_processor_type: str | neural_compressor.common.utils.ProcessorType | None = None)[source]

Get the processor type.

Get the processor type based on the user configuration or automatically detect it based on the hardware.

Parameters:

user_processor_type (Optional[Union[str, ProcessorType]]) – The user-specified processor type. Defaults to None.

Returns:

The detected or user-specified processor type.

Return type:

ProcessorType

Raises:
  • AssertionError – If the user-specified processor type is not supported.

  • NotImplementedError – If the processor type is not recognized.
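
A minimal sketch; passing nothing triggers hardware-based auto-detection (the Client member below is an assumption about the ProcessorType enum):

    from neural_compressor.common.utils import ProcessorType
    from neural_compressor.torch.utils.utility import get_processor_type_from_user_config

    ptype = get_processor_type_from_user_config()  # auto-detect from hardware
    ptype = get_processor_type_from_user_config(ProcessorType.Client)  # explicit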

neural_compressor.torch.utils.utility.dowload_hf_model(repo_id, cache_dir=None, repo_type=None, revision=None)[source]

Download a model from the Hugging Face Hub.
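
A minimal sketch (the repo id is illustrative; note that the function name is spelled exactly as defined in the library):

    from neural_compressor.torch.utils.utility import dowload_hf_model

    # Fetches the model snapshot from the Hugging Face Hub; assumed to
    # return the local directory containing the downloaded files.
    local_path = dowload_hf_model("facebook/opt-125m")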

neural_compressor.torch.utils.utility.load_empty_model(pretrained_model_name_or_path, cls=None, **kwargs)[source]

Load an empty model.
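
A minimal sketch (the repo id is illustrative). The helper is assumed to instantiate the architecture with empty (meta) weights instead of materializing the full checkpoint, which is useful when quantizing models too large to load whole:

    from neural_compressor.torch.utils.utility import load_empty_model

    # Builds the model structure without allocating real weight tensors.
    model = load_empty_model("facebook/opt-125m")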