neural_compressor.adaptor.ox_utils.calibration

Calibration for ONNX models.

Module Contents

Classes

ONNXRTAugment

Augments the input model to dump tensors or to run calibration.

class neural_compressor.adaptor.ox_utils.calibration.ONNXRTAugment(model_wrapper, dataloader, dump_op_types, black_nodes=[], white_nodes=[], iterations=[], backend=['CPUExecutionProvider'], reduce_range=False)

Augments the input model to dump tensors or to run calibration.

augment_graph(activation_only=False, weight_only=False)

Augment the graph.

Adds nodes to every node in the model whose op type is in quantization_candidates, and ensures their outputs are stored as part of the graph output.

Parameters:
  • activation_only (bool, optional) – whether to dump activation tensors only. Defaults to False.

  • weight_only (bool, optional) – whether to dump weight tensors only. Defaults to False.
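
The idea behind graph augmentation can be sketched without ONNX at all. The example below is a toy illustration, not the library's implementation: the node dictionaries and the helper select_tensors_to_dump are hypothetical stand-ins for an ONNX graph, chosen only to show how outputs of candidate op types end up in the list of graph outputs.

```python
# Toy stand-in for an ONNX graph: each node is a dict with an op type and
# its output tensor names. These structures are hypothetical and are NOT
# the neural_compressor or onnx API.


def select_tensors_to_dump(nodes, candidate_op_types, graph_outputs):
    """Return graph_outputs extended with the outputs of candidate nodes."""
    augmented = list(graph_outputs)
    seen = set(graph_outputs)
    for node in nodes:
        if node["op_type"] not in candidate_op_types:
            continue  # only op types selected for quantization are dumped
        for name in node["outputs"]:
            if name not in seen:  # avoid listing an output twice
                seen.add(name)
                augmented.append(name)
    return augmented


nodes = [
    {"op_type": "Conv", "outputs": ["conv_out"]},
    {"op_type": "Relu", "outputs": ["relu_out"]},
    {"op_type": "MatMul", "outputs": ["logits"]},
]
augmented_outputs = select_tensors_to_dump(nodes, {"Conv", "MatMul"}, ["logits"])
print(augmented_outputs)
```

Once such tensors are exposed as graph outputs, a normal inference run returns them alongside the model's original outputs, which is what makes calibration possible.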

get_intermediate_outputs(calib_mode=None)

Gather intermediate model outputs after running inference.

dump_minmax(calib_mode='naive')

Get min/max values of tensors.

dump_calibration(q_config, calib_mode='naive')

Gather calibration params for quantization.

Parameters:
  • q_config (dict) – op-wise quantization config

  • calib_mode (str, optional) – calibration mode. 'naive' gives a (min, max) pair for each intermediate model output across the test data sets, where the first element is the minimum of all values and the second element is the maximum of all values. Defaults to 'naive'.
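
A minimal sketch of what 'naive' min/max collection means, assuming intermediate outputs arrive as one dictionary of flat value lists per batch. The helper update_minmax and the tensor names are hypothetical and only illustrate the running reduction; the library's own data structures may differ.

```python
def update_minmax(ranges, batch_outputs):
    """Fold one batch of outputs into the running (min, max) per tensor."""
    for name, values in batch_outputs.items():
        lo, hi = min(values), max(values)
        if name in ranges:
            prev_lo, prev_hi = ranges[name]
            ranges[name] = (min(prev_lo, lo), max(prev_hi, hi))
        else:
            ranges[name] = (lo, hi)
    return ranges


# Two hypothetical calibration batches with made-up tensor names and values.
batches = [
    {"conv_out": [-1.0, 0.5, 2.0], "logits": [0.1, 0.9]},
    {"conv_out": [-3.0, 1.5], "logits": [0.2, 1.4]},
]

ranges = {}
for batch in batches:
    update_minmax(ranges, batch)

print(ranges)
```

Each tensor ends up with a single pair covering every batch seen so far, so the result does not depend on the order in which batches are processed.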

calculate_quantization_params(q_config, quantization_thresholds)

Given quantization thresholds, calculate the quantization params.

Parameters:
  • q_config (dict) – op-wise quantization config

  • quantization_thresholds (dict) – dictionary specifying the min and max values for outputs of conv and matmul nodes. It should be specified in the following format: {"param_name": [min, max]}

dump_tensor(activation=True, weight=False)

Dump activation or weight or both from the model.

calculate_scale_zeropoint(last_node, next_node, rmin, rmax, scheme, qType, quantize_range)

Given the source and destination nodes of a tensor, return the calculated zero point and scale.
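
For orientation, the standard asymmetric (affine) mapping from a real range [rmin, rmax] to an integer range [qmin, qmax] can be sketched as follows. This is the textbook formula under assumed uint8 defaults, not necessarily what this method computes: the function name affine_scale_zero_point is hypothetical, and the sketch ignores the last_node, next_node, scheme and qType arguments.

```python
def affine_scale_zero_point(rmin, rmax, qmin=0, qmax=255):
    """Textbook asymmetric quantization parameters for [rmin, rmax]."""
    # Widen the range so that real 0.0 is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    if rmax == rmin:
        # Degenerate all-zero range: any positive scale works.
        return 1.0, qmin
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    # Clamp in case rounding pushes the zero point out of range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


scale, zero_point = affine_scale_zero_point(-1.0, 3.0)
print(scale, zero_point)
```

With these parameters a real value x maps to round(x / scale) + zero_point, clipped to [qmin, qmax]. Passing a narrower integer range, for example qmax=127, is one way a reduced quantization range could be modelled in this sketch.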