neural_compressor.adaptor.ox_utils.calibration
Calibration for onnx models.
Module Contents
Classes
ONNXRTAugment – Augment the input model to dump tensors or for calibration.
- class neural_compressor.adaptor.ox_utils.calibration.ONNXRTAugment(model_wrapper, dataloader, dump_op_types, black_nodes=[], white_nodes=[], iterations=[], backend=['CPUExecutionProvider'], reduce_range=False)
Augment the input model to dump tensors or for calibration.
- augment_graph(activation_only=False, weight_only=False)
Augment the graph.
Adds nodes for all quantization-candidate op type nodes in the model and ensures their outputs are stored as part of the graph outputs.
- Parameters:
activation_only (bool, optional) – whether to dump activation tensors only. Defaults to False.
weight_only (bool, optional) – whether to dump weight tensors only. Defaults to False.
- get_intermediate_outputs(calib_mode=None)
Gather intermediate model outputs after running inference.
- dump_minmax(calib_mode='naive')
Get the min/max values of intermediate tensors.
- dump_calibration(q_config, calib_mode='naive')
Gather calibration params for quantization.
- Parameters:
q_config (dict) – op-wise quantization config
calib_mode (str, optional) – type ‘naive’ gives a (Min, Max) pair for each intermediate model output across the calibration data sets, where the first element is the minimum of all observed values and the second element is the maximum. Defaults to ‘naive’.
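The ‘naive’ reduction can be sketched in plain Python (a hypothetical helper for illustration, not the library's internal code): each intermediate output's per-batch (min, max) observations collapse into a single (Min, Max) pair.

```python
def naive_minmax(observations):
    """Collapse per-batch (min, max) pairs into one global pair per output.

    observations: dict mapping output name -> list of (min, max) per batch.
    """
    return {
        name: (min(lo for lo, _ in pairs), max(hi for _, hi in pairs))
        for name, pairs in observations.items()
    }

# Three calibration batches observed for one intermediate output:
batches = {"conv1_output": [(-0.5, 2.0), (-1.2, 1.7), (-0.8, 2.4)]}
print(naive_minmax(batches))  # {'conv1_output': (-1.2, 2.4)}
```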
- calculate_quantization_params(q_config, quantization_thresholds)
Given quantization thresholds, calculate the quantization params.
- Parameters:
q_config (dict) – op-wise quantization config
quantization_thresholds (dict) – dictionary specifying the min and max values of the outputs of conv and matmul nodes; should be specified in the following format: {“param_name”: [min, max]}
- dump_tensor(activation=True, weight=False)
Dump activations, weights, or both from the model.
- calculate_scale_zeropoint(last_node, next_node, rmin, rmax, scheme, qType, quantize_range)
Given the source and destination nodes of a tensor, return the calculated zero point and scale.
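For intuition, the standard asymmetric-quantization arithmetic behind such a computation can be sketched as follows. This is a simplified illustration using the common uint8 formula, not the library's implementation: the actual method also consults the surrounding nodes, the scheme, qType, and quantize_range, all omitted here.

```python
def scale_zeropoint(rmin, rmax, qmin=0, qmax=255):
    """Compute (scale, zero_point) mapping [rmin, rmax] onto [qmin, qmax]."""
    # Extend the real range to include zero so it is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale) if scale else qmin
    return scale, zero_point

s, zp = scale_zeropoint(-1.0, 3.0)
print(s, zp)  # scale = 4/255, zero_point = 64
```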