neural_compressor.adaptor.ox_utils.quantizer

Quantizer for ONNX models.

Module Contents

Classes

Quantizer

Quantizer class.

class neural_compressor.adaptor.ox_utils.quantizer.Quantizer(model, q_config, mode, static, quantization_params, op_types_to_quantize, fallback_list=['fp32'], reduce_range=None)

Quantizer class.
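
A minimal usage sketch, assuming a prebuilt ONNX model and precomputed configs; the model path, q_config contents, mode string, and quantization_params below are illustrative placeholders, not values prescribed by the API:

    import onnx
    from neural_compressor.adaptor.ox_utils.quantizer import Quantizer

    model = onnx.load("model.onnx")           # placeholder model path
    quantizer = Quantizer(
        model,
        q_config={},                          # per-op quantization config (placeholder)
        mode="qdq",                           # assumed format string (e.g. QDQ vs. QOperator)
        static=True,                          # static quantization with precomputed params
        quantization_params={},               # tensor name -> quantization params (placeholder)
        op_types_to_quantize=["Conv", "MatMul"],
    )
    qmodel = quantizer.quantize_model()       # assumed to return the quantized model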

check_opset_version()

Check opset version.

should_quantize(node)

Check if node should be quantized.

quantize_model()

Quantize the ONNX model.

merge_dedicated_qdq_pair()

Merge dedicated Q/DQ pairs.

should_cast(node)

Check if node should be cast.

insert_qdq()

Insert Q/DQ pairs.
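
Inserted pairs are the standard ONNX QuantizeLinear/DequantizeLinear duo. A minimal sketch of one such pair built with onnx.helper; the tensor names and the scale/zero-point values are illustrative:

    import numpy as np
    from onnx import helper, numpy_helper

    # Scale and zero point stored as initializers (illustrative values).
    scale = numpy_helper.from_array(np.array(0.02, dtype=np.float32), "x_scale")
    zero_point = numpy_helper.from_array(np.array(0, dtype=np.int8), "x_zp")

    q_node = helper.make_node(
        "QuantizeLinear", ["x", "x_scale", "x_zp"], ["x_quantized"],
        name="x_QuantizeLinear",
    )
    dq_node = helper.make_node(
        "DequantizeLinear", ["x_quantized", "x_scale", "x_zp"], ["x_dequantized"],
        name="x_DequantizeLinear",
    )
    # Consumers of "x" are then rewired to read "x_dequantized" instead.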

should_convert(node)

Check if node should be converted.

convert_qdq_to_operator_oriented()

Convert QDQ to QOperator format.
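
In the QOperator format, a DequantizeLinear -> Conv -> QuantizeLinear chain collapses into a single fused operator such as the standard ONNX QLinearConv. A hedged sketch of the resulting node; the input names are illustrative:

    from onnx import helper

    # QLinearConv consumes quantized tensors plus their scales/zero points
    # directly, replacing the surrounding Q/DQ nodes.
    qlinear_conv = helper.make_node(
        "QLinearConv",
        inputs=[
            "x_quantized", "x_scale", "x_zp",   # quantized activation
            "w_quantized", "w_scale", "w_zp",   # quantized weight
            "y_scale", "y_zp",                  # output quantization params
            "bias_quantized",                   # optional int32 bias
        ],
        outputs=["y_quantized"],
        name="conv_quant",
    )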

remove_redundant_pairs()

Remove redundant Q/DQ and Cast/Cast pairs.

dtype_cast(node, cfg, keep_io_types=True)

Cast node dtype.
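
The cast itself is a standard ONNX Cast node. A sketch assuming an fp32-to-fp16 cast; with keep_io_types=True the graph inputs and outputs are expected to keep their original float32 types, with boundary casts inserted instead:

    from onnx import helper, TensorProto

    cast_node = helper.make_node(
        "Cast", ["x"], ["x_fp16"], to=TensorProto.FLOAT16, name="x_Cast"
    )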

quantize_outputs(node, initializer_use_weight_qType=True, direct_int8=False)

Quantize node outputs.

quantize_inputs(node, indices=None, initializer_use_weight_qType=True, direct_int8=False)

Quantize node inputs.

quantize_bias_tensor(node)

Quantize bias.

quantize_bias(bias_name, input_name, weight_name, beta=1.0)

Quantize the bias.

Zero Point == 0 and Scale == Input_Scale * Weight_Scale
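
A numeric sketch of that convention; the scales and bias values are illustrative, and the beta factor (when not 1.0) is omitted here:

    import numpy as np

    input_scale, weight_scale = 0.05, 0.002   # illustrative activation/weight scales
    bias = np.array([0.3, -0.12], dtype=np.float32)

    bias_scale = input_scale * weight_scale                 # Scale == Input_Scale * Weight_Scale
    q_bias = np.round(bias / bias_scale).astype(np.int32)   # Zero Point == 0
    # q_bias * bias_scale recovers approximately [0.3, -0.12]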

quantize_weights_per_channel(node, indices, weight_qType, scheme, axis)

Quantize weights per-channel.

quantize_weight_per_channel(weight_name, weight_qType, scheme, channel_axis)

Quantize weight per-channel.
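
A minimal numpy sketch of what per-channel quantization computes; the symmetric int8 scheme here is an assumption for illustration, while the actual dtype and scheme follow weight_qType and scheme:

    import numpy as np

    def quantize_per_channel_sym(weight: np.ndarray, channel_axis: int = 0):
        """Symmetric int8 per-channel quantization: one scale per channel."""
        # Max magnitude per channel, keeping dims for broadcasting.
        reduce_axes = tuple(i for i in range(weight.ndim) if i != channel_axis)
        max_abs = np.max(np.abs(weight), axis=reduce_axes, keepdims=True)
        scales = max_abs / 127.0
        scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
        q = np.clip(np.round(weight / scales), -127, 127).astype(np.int8)
        return q, np.squeeze(scales)

    w = np.random.randn(8, 3, 3, 3).astype(np.float32)  # e.g. Conv weight (O,I,kH,kW)
    q, scales = quantize_per_channel_sym(w, channel_axis=0)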

static tensor_proto_to_array(initializer)

Convert TensorProto to array.
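
ONNX provides a helper for exactly this conversion; a round-trip sketch:

    import numpy as np
    from onnx import numpy_helper

    weights = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
    initializer = numpy_helper.from_array(weights, name="W")
    arr = numpy_helper.to_array(initializer)   # back to the numpy array
    assert np.array_equal(arr, weights)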

get_bias_add_nodes(node, weight_name, last_output, quantized_bias_name)

Given a node (e.g. a Conv), handle its bias add by inserting a Reshape node on the bias and an Add node; see the sketch after the parameter list.

Parameters:
  • node (NodeProto) – current node (Conv)

  • weight_name (string) – weight name

  • last_output (string) – output name of the previous node (the input to the bias add)

  • quantized_bias_name (string) – name of the quantized bias
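
A hedged sketch of the inserted pattern, with illustrative tensor names: the bias is reshaped so it broadcasts over the Conv output's channel dimension, then added.

    from onnx import helper

    reshape_node = helper.make_node(
        "Reshape", ["bias_dequantized", "bias_new_shape"], ["bias_reshaped"],
        name="bias_Reshape",
    )
    add_node = helper.make_node(
        "Add", ["conv_output", "bias_reshaped"], ["conv_output_with_bias"],
        name="bias_Add",
    )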

is_valid_quantize_weight(weight_name)

Check if weight can be quantized.

dequantize_tensor(node, value_name)

Dequantize tensor.
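
Dequantization follows the standard affine formula x = (q - zero_point) * scale. A numeric sketch with illustrative values:

    import numpy as np

    scale, zero_point = 0.02, 5
    q = np.array([5, 55, -45], dtype=np.int8)
    x = (q.astype(np.float32) - zero_point) * scale   # -> [0.0, 1.0, -1.0]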