neural_compressor.model.torch_model¶
Class for PyTorch model.
Module Contents¶
Classes¶
PyTorchBaseModel | Build PyTorch base model.
PyTorchModel | Build PyTorchModel object.
PyTorchFXModel | Build PyTorchFXModel object.
IPEXModel | Build IPEXModel object.
- class neural_compressor.model.torch_model.PyTorchBaseModel(model, **kwargs)¶
Bases: torch.nn.Module, neural_compressor.model.base_model.BaseModel
Build PyTorch base model.
- property model¶
Getter to model.
- property fp32_model¶
Getter to the fp32 model.
- forward(*args, **kwargs)¶
PyTorch model forward function.
- register_forward_pre_hook()¶
Register forward pre hook.
- remove_hooks()¶
Remove hooks.
- generate_forward_pre_hook()¶
Generate forward pre hook.
- framework()¶
Return framework.
- get_all_weight_names()¶
Get weight names.
- get_weight(tensor_name)¶
Get weight value.
- update_weights(tensor_name, new_tensor)¶
Update weight value.
- Parameters:
tensor_name (string) – weight name.
new_tensor (ndarray) – weight value.
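A minimal sketch of these weight helpers, assuming a toy module whose only layer is named fc (so its parameter is fc.weight); the shapes and values are illustrative only:
```python
import numpy as np
import torch

from neural_compressor.model.torch_model import PyTorchModel

# Toy network; the submodule name "fc" (and thus the parameter name "fc.weight")
# is only an illustration.
net = torch.nn.Sequential()
net.add_module("fc", torch.nn.Linear(4, 2))
inc_model = PyTorchModel(net)

print(inc_model.get_all_weight_names())        # parameter names, e.g. ['fc.weight', 'fc.bias']
weight = inc_model.get_weight("fc.weight")     # current weight value
inc_model.update_weights("fc.weight",          # overwrite the weight with an ndarray
                         np.zeros((2, 4), dtype=np.float32))
```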
- update_gradient(grad_name, new_grad)¶
Update grad value.
- Parameters:
grad_name (str) – grad name.
new_grad (ndarray) – grad value.
- prune_weights_(tensor_name, mask)¶
Prune the weight named tensor_name in place using the given mask.
- Parameters:
tensor_name (str) – weight name.
mask (tensor) – pruning mask.
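A short in-place pruning sketch, reusing the same toy fc layer as above; the mask is a tensor with the weight's shape, and the exact keep/zero convention follows the library's pruning code:
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

net = torch.nn.Sequential()
net.add_module("fc", torch.nn.Linear(4, 2))
inc_model = PyTorchModel(net)

# Pruning mask with the same shape as fc.weight; prunes roughly half the entries in place.
mask = torch.rand(2, 4) > 0.5
inc_model.prune_weights_("fc.weight", mask)
```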
- get_inputs(input_name=None)¶
Get inputs of model.
- Parameters:
input_name (str, optional) – name of input tensor. Defaults to None.
- Returns:
input tensor
- Return type:
tensor
- get_gradient(input_tensor)¶
Get the gradient of a specific tensor.
- Parameters:
input_tensor (string or tensor) – weight name or a tensor.
- Returns:
gradient tensor array
- Return type:
ndarray
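Gradients only exist after a backward pass, so a sketch first runs the wrapped model forward (the wrapper exposes forward, as shown above) and calls backward, then queries the gradient by weight name; fc.weight is again an illustrative name:
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

net = torch.nn.Sequential()
net.add_module("fc", torch.nn.Linear(4, 2))
inc_model = PyTorchModel(net)

# Forward/backward so that gradients are populated, then read one by weight name.
loss = inc_model(torch.rand(1, 4)).sum()
loss.backward()
grad = inc_model.get_gradient("fc.weight")     # gradient of the loss w.r.t. fc.weight
```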
- report_sparsity()¶
Get sparsity of the model.
- Returns:
df (DataFrame): sparsity of each weight. total_sparsity (float): total sparsity of the model.
- Return type:
tuple of (DataFrame, float)
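A quick sketch, assuming the return order implied by the docstring (the per-weight DataFrame followed by the overall sparsity):
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

inc_model = PyTorchModel(torch.nn.Linear(4, 2))
df, total_sparsity = inc_model.report_sparsity()
print(total_sparsity)   # fraction of zero-valued weights over the whole model
print(df)               # per-weight sparsity breakdown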
- class neural_compressor.model.torch_model.PyTorchModel(model, **kwargs)¶
Bases:
PyTorchBaseModel
Build PyTorchModel object.
- property workspace_path¶
Return workspace path.
- property graph_info¶
Return graph info.
- save(root=None)¶
Save configure file and weights.
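save is most commonly called on the quantized PyTorchModel returned by a quantization run; a minimal sketch below assumes it also accepts a plain wrapped module, with an illustrative output directory:
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

inc_model = PyTorchModel(torch.nn.Linear(4, 2))
inc_model.save("./saved_results")   # writes the configure file and weights under ./saved_results
```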
- quantized_state_dict()¶
Get the quantized state dict.
- load_quantized_state_dict(stat_dict)¶
Load quantized state from the given state dict.
- export_to_jit(example_inputs=None)¶
Export JIT model.
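A minimal JIT export sketch, with example inputs matching the toy model's expected input shape:
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

inc_model = PyTorchModel(torch.nn.Linear(4, 2))
# Trace the wrapped model with the given example inputs.
inc_model.export_to_jit(example_inputs=torch.rand(1, 4))
```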
- export_to_fp32_onnx(save_path='fp32-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, verbose=True, fp32_model=None)¶
Export PyTorch FP32 model to ONNX FP32 model.
- Parameters:
save_path (str, optional) – ONNX model path to save. Defaults to ‘fp32-model.onnx’.
example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).
opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.
dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {“input”: {0: “batch_size”}, “output”: {0: “batch_size”}}.
input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.
output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.
do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.
verbose (bool, optional) – if True, prints a description of the model being exported to stdout. Defaults to True.
fp32_model (torch.nn.Module, optional) – FP32 PyTorch model. Defaults to None.
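A minimal FP32 export sketch, assuming the onnx package is installed and that the example inputs match the wrapped model's input shape; the file name and tensor names are illustrative:
```python
import torch

from neural_compressor.model.torch_model import PyTorchModel

inc_model = PyTorchModel(torch.nn.Linear(4, 2))
inc_model.export_to_fp32_onnx(
    save_path="fp32-model.onnx",
    example_inputs=torch.rand(1, 4),   # must match the wrapped model's expected input
    opset_version=14,
    input_names=["input"],
    output_names=["output"],
)
```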
- export_to_bf16_onnx(save_path='bf16-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, verbose=True)¶
Export PyTorch bf16 model to ONNX bf16 model.
- Parameters:
save_path (str, optional) – ONNX model path to save. Defaults to ‘bf16-model.onnx’.
example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).
opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.
dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {“input”: {0: “batch_size”}, “output”: {0: “batch_size”}}.
input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.
output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.
do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.
verbose (bool, optional) – if True, prints a description of the model being exported to stdout. Defaults to True.
- export_to_int8_onnx(save_path='int8-model.onnx', example_inputs=torch.rand([1, 1, 1, 1]), opset_version=14, dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}}, input_names=None, output_names=None, do_constant_folding=True, quant_format='QDQ', dtype='S8S8', fp32_model=None, calib_dataloader=None)¶
Export PyTorch int8 model to ONNX int8 model.
- Parameters:
save_path (str, optional) – ONNX model path to save. Defaults to ‘int8-model.onnx’.
example_inputs (torch.Tensor, optional) – example inputs for export. Defaults to torch.rand([1, 1, 1, 1]).
opset_version (int, optional) – opset version for exported ONNX model. Defaults to 14.
dynamic_axes (dict, optional) – specify axes of tensors as dynamic. Defaults to {“input”: {0: “batch_size”}, “output”: {0: “batch_size”}}.
input_names (list or str, optional) – names to assign to the input nodes of the graph, in order. Defaults to None.
output_names (list or str, optional) – names to assign to the output nodes of the graph, in order. Defaults to None.
do_constant_folding (bool, optional) – Apply the constant-folding optimization. Defaults to True.
quant_format (str, optional) – format of quantized ONNX model. Defaults to ‘QDQ’.
dtype (str, optional) – type for quantized activation and weight. Defaults to ‘S8S8’.
fp32_model (torch.nn.Module, optional) – FP32 PyTorch model. Defaults to None.
calib_dataloader (object, optional) – calibration dataloader. Defaults to None.
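INT8 export needs an already-quantized model plus its calibration data, so the sketch below only outlines the call; q_model (an int8 PyTorchModel from a quantization run), fp32_net (the original FP32 module), and calib_dataloader are assumed placeholders, not names defined by this module:
```python
import torch

# q_model, fp32_net and calib_dataloader are assumed to exist in the surrounding
# script (see the note above); this only illustrates the documented arguments.
q_model.export_to_int8_onnx(
    save_path="int8-model.onnx",
    example_inputs=torch.rand(1, 3, 224, 224),
    quant_format="QDQ",                # format of the quantized ONNX graph
    dtype="S8S8",                      # signed int8 activations and weights
    fp32_model=fp32_net,
    calib_dataloader=calib_dataloader,
)
```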
- export(save_path: str, conf)¶
Export PyTorch model to ONNX model.
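In recent releases this export path is driven by an export config object; the sketch below assumes the Torch2ONNXConfig helper from neural_compressor.config and a quantized q_model as above (both names are assumptions, not defined in this module):
```python
import torch

from neural_compressor.config import Torch2ONNXConfig

# q_model is assumed to be a quantized PyTorchModel from a quantization run.
int8_onnx_config = Torch2ONNXConfig(
    dtype="int8",
    opset_version=14,
    quant_format="QDQ",
    example_inputs=torch.rand(1, 3, 224, 224),
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
)
q_model.export("int8-model.onnx", int8_onnx_config)
```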
- class neural_compressor.model.torch_model.PyTorchFXModel(model, **kwargs)¶
Bases:
PyTorchModel
Build PyTorchFXModel object.
- class neural_compressor.model.torch_model.IPEXModel(model, **kwargs)¶
Bases:
PyTorchBaseModel
Build IPEXModel object.
- property workspace_path¶
Return workspace path.
- save(root=None)¶
Save PyTorch IPEX model.