neural_compressor.model.torch_model

Classes for PyTorch models.

Module Contents

Classes

PyTorchBaseModel

Build PyTorch base model.

PyTorchModel

Build PyTorchModel object.

PyTorchFXModel

Build PyTorchFXModel object.

IPEXModel

Build IPEXModel object.

class neural_compressor.model.torch_model.PyTorchBaseModel(model, **kwargs)

Bases: torch.nn.Module, neural_compressor.model.base_model.BaseModel

Build PyTorch base model.

property model

Getter of the wrapped model.

property fp32_model

Getter of the FP32 model.

forward(*args, **kwargs)

PyTorch model forward function.

register_forward_pre_hook()

Register forward pre hook.

remove_hooks()

Remove hooks.

generate_forward_pre_hook()

Generate forward pre hook.
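A minimal sketch of how the hook helpers above pair up, assuming register_forward_pre_hook() installs the hook produced by generate_forward_pre_hook() and remove_hooks() clears it; the toy torch.nn.Linear module is illustrative only.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

net = torch.nn.Linear(4, 2)              # toy FP32 module, for illustration only
model = PyTorchModel(net)

model.register_forward_pre_hook()        # install the generated forward pre hook
_ = model(torch.randn(1, 4))             # forward() dispatches to the wrapped module
model.remove_hooks()                     # remove the hooks registered above
```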

framework()

Return framework.

get_all_weight_names()

Get weight names.

get_weight(tensor_name)

Get weight value.
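For example, iterating over the weight names and fetching one tensor might look like the sketch below; it assumes the names mirror the wrapped module's named_parameters(), so a bare torch.nn.Linear exposes 'weight' and 'bias'.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))   # toy module, for illustration only

for name in model.get_all_weight_names():
    print(name)                               # expected to include 'weight' and 'bias'

w = model.get_weight('weight')                # look up a single weight by name
print(w.shape)
```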

update_weights(tensor_name, new_tensor)

Update weight value.

Parameters:
  • tensor_name (str) – weight name.

  • new_tensor (ndarray) – weight value.
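A small usage sketch, assuming the named weight is simply replaced by the new value and that an ndarray of matching shape is accepted, as the parameter description above suggests.

```python
import numpy as np
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))        # toy module, for illustration only
new_value = np.zeros((2, 4), dtype=np.float32)     # must match the existing weight shape
model.update_weights('weight', new_value)          # overwrite the named weight
print(model.get_weight('weight'))                  # now all zeros
```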

update_gradient(grad_name, new_grad)

Update grad value.

Parameters:
  • grad_name (str) – grad name.

  • new_grad (ndarray) – grad value.

prune_weights_(tensor_name, mask)

Prune the weight named tensor_name in place using the given mask.

Parameters:
  • tensor_name (str) – weight name.

  • mask (tensor) – pruning mask.
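A hedged sketch of the pruning helper, assuming the mask is applied element-wise to the named weight so that masked-off entries are zeroed; the magnitude-based threshold rule here is only an example.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))        # toy module, for illustration only
w = model.get_weight('weight')
mask = (w.abs() >= w.abs().median()).float()       # keep larger-magnitude entries (example rule)
model.prune_weights_('weight', mask)               # zero out the masked-off entries in place
```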

get_inputs(input_name=None)

Get inputs of model.

Parameters:

input_name (str, optional) – name of input tensor. Defaults to None.

Returns:

input tensor

Return type:

tensor

get_gradient(input_tensor)

Get gradients of specific tensor.

Parameters:

input_tensor (string or tensor) – weight name or a tensor.

Returns:

gradient tensor array

Return type:

ndarray
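The gradient helpers are easiest to see after a backward pass; the sketch below assumes get_gradient() accepts the same weight names as get_weight() and that update_gradient() writes a value back under the same name.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))        # toy module, for illustration only
loss = model(torch.randn(8, 4)).sum()
loss.backward()                                    # populate gradients on the parameters

grad = model.get_gradient('weight')                # gradient looked up by weight name
print(grad.shape)
model.update_gradient('weight', grad * 0.0)        # e.g. zero the stored gradient
```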

report_sparsity()

Get sparsity of the model.

Returns:

df (DataFrame): sparsity of each weight. total_sparsity (float): total sparsity of the model.

Return type:

tuple of (DataFrame, float)
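A minimal sketch, assuming the method returns the per-weight DataFrame together with the total sparsity, as the Returns entry above indicates.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(64, 64))      # toy module, for illustration only
df, total_sparsity = model.report_sparsity()
print(df)                                          # per-weight sparsity breakdown
print('total sparsity:', total_sparsity)
```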

class neural_compressor.model.torch_model.PyTorchModel(model, **kwargs)

Bases: PyTorchBaseModel

Build PyTorchModel object.

property workspace_path

Return workspace path.

property graph_info

Return graph info.

save(root=None)

Save the configuration file and weights.
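Constructing and saving the wrapper is straightforward; the target directory below is only an example, and save() is expected to write the configuration file and weights under it.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

net = torch.nn.Linear(4, 2)                        # any torch.nn.Module; toy example here
model = PyTorchModel(net)
model.save(root='./saved_results')                 # writes configuration and weights under root
```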

quantized_state_dict()

Get the quantized state dict of the model.

load_quantized_state_dict(stat_dict)

Load quantized state from the given state dict.

export_to_jit(example_inputs=None)

Export JIT model.
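A one-line sketch: the model is traced or scripted with the given example inputs. Whether the JIT module is returned (as assumed below) or written to disk should be verified against the implementation.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))        # toy module, for illustration only
jit_model = model.export_to_jit(example_inputs=torch.randn(1, 4))  # return value assumed to be the JIT module
```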

export_to_fp32_onnx(save_path='fp32-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, verbose=True, fp32_model=None)

Export PyTorch FP32 model to ONNX FP32 model.

Parameters:
  • save_path (str, optional) – ONNX model path to save. Defaults to 'fp32-model.onnx'.

  • example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).

  • opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.

  • dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}.

  • input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.

  • output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.

  • do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.

  • verbose (bool, optional) – if True, prints a description of the model being exported to stdout. Defaults to True.

  • fp32_model (torch.nn.Module, optional) – FP32 PyTorch model. Defaults to None.
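A usage sketch under the defaults above; the toy convolution and the 28x28 input are illustrative, and input_names/output_names are passed so they line up with the keys in the default dynamic_axes.

```python
import torch
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Conv2d(1, 8, kernel_size=3))   # toy FP32 module, for illustration only
model.export_to_fp32_onnx(
    save_path='fp32-model.onnx',
    example_inputs=torch.randn(1, 1, 28, 28),                # must match the module's expected input
    input_names=['input'],
    output_names=['output'],                                 # names referenced by the default dynamic_axes
)
```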

export_to_bf16_onnx(save_path='bf16-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, verbose=True)

Export PyTorch bf16 model to ONNX bf16 model.

Parameters:
  • save_path (str, optional) – ONNX model path to save. Defaults to 'bf16-model.onnx'.

  • example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).

  • opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.

  • dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}.

  • input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.

  • output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.

  • do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.

  • verbose (bool, optional) – if True, prints a description of the model being exported to stdout. Defaults to True.

export_to_int8_onnx(save_path='int8-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, quant_format='QDQ', dtype='S8S8', fp32_model=None, calib_dataloader=None)

Export PyTorch int8 model to ONNX int8 model.

Parameters:
  • save_path (str, optional) – ONNX model path to save. Defaults to 'int8-model.onnx'.

  • example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).

  • opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.

  • dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}.

  • input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.

  • output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.

  • do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.

  • quant_format (str, optional) – format of quantized ONNX model. Defaults to 'QDQ'.

  • dtype (str, optional) – type for quantized activation and weight. Defaults to 'S8S8'.

  • fp32_model (torch.nn.Module, optional) – FP32 PyTorch model. Defaults to None.

  • calib_dataloader (object, optional) – calibration dataloader. Defaults to None.

export(save_path: str, conf)

Export PyTorch model to ONNX model.
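The conf argument is an export configuration; the sketch below assumes neural_compressor.config.Torch2ONNXConfig, as used in the project's export examples, and exports the FP32 path.

```python
import torch
from neural_compressor.config import Torch2ONNXConfig       # assumed conf object for ONNX export
from neural_compressor.model.torch_model import PyTorchModel

model = PyTorchModel(torch.nn.Linear(4, 2))                  # toy module, for illustration only
fp32_config = Torch2ONNXConfig(
    dtype='fp32',
    example_inputs=torch.randn(1, 4),
    input_names=['input'],
    output_names=['output'],
)
model.export('fp32-model.onnx', fp32_config)
```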

class neural_compressor.model.torch_model.PyTorchFXModel(model, **kwargs)

Bases: PyTorchModel

Build PyTorchFXModel object.

class neural_compressor.model.torch_model.IPEXModel(model, **kwargs)

Bases: PyTorchBaseModel

Build IPEXModel object.

property workspace_path

Return workspace path.

save(root=None)

Save PyTorch IPEX model.