neural_compressor.adaptor.ox_utils.quantizer
Quantizer for onnx models.
Module Contents¶
Classes¶
Quantizer: Quantizer class.
- class neural_compressor.adaptor.ox_utils.quantizer.Quantizer(model, q_config, mode, static, quantization_params, op_types_to_quantize, fallback_list=['fp32'], reduce_range=None)¶
Quantizer class.
- check_opset_version()¶
Check opset version.
- should_quantize(node)¶
Check if node should be quantized.
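The decision of whether to quantize a node is driven by the constructor's `op_types_to_quantize` and `fallback_list` arguments. A minimal sketch of this kind of op-type filter, assuming a simplified node structure and a per-node dtype config (illustrative only, not the module's real implementation):

```python
# Hypothetical sketch of an op-type filter like should_quantize.
# The Node dataclass and the q_config mapping are assumptions.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    op_type: str

def should_quantize(node, op_types_to_quantize, q_config, fallback_list=("fp32",)):
    # Only ops in the allow-list are candidates for quantization.
    if node.op_type not in op_types_to_quantize:
        return False
    # Nodes whose configured dtype is a fallback precision stay unquantized.
    if q_config.get(node.name) in fallback_list:
        return False
    return True
```

Filtering by op type first keeps ops with no quantized kernel out of consideration entirely; the per-node config then lets accuracy-sensitive nodes fall back to fp32.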
- quantize_model()¶
Quantize onnx model.
- merge_dedicated_qdq_pair()¶
Merge dedicated Q/DQ pairs.
- should_cast(node)¶
Check if node should be cast.
- insert_qdq()¶
Insert Q/DQ pairs.
- should_convert(node)¶
Check if node should be converted.
- convert_qdq_to_operator_oriented()¶
Convert QDQ to QOperator format.
- remove_redundant_pairs()¶
Remove redundant Q/DQ and Cast/Cast pairs.
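A back-to-back QuantizeLinear/DequantizeLinear pair is a numerical no-op at the graph level. A toy sketch of the kind of pass this performs, over a linear chain of op types (real graphs also need edge rewiring and dtype checks for Cast/Cast pairs, both omitted here):

```python
def remove_redundant_pairs(op_types):
    """Drop adjacent ("QuantizeLinear", "DequantizeLinear") and
    ("Cast", "Cast") pairs from a linear chain of op types.
    Toy sketch: assumes each pair fully cancels out."""
    redundant = {("QuantizeLinear", "DequantizeLinear"), ("Cast", "Cast")}
    out = []
    for op in op_types:
        if out and (out[-1], op) in redundant:
            out.pop()  # the pair cancels; drop both nodes
        else:
            out.append(op)
    return out
```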
- dtype_cast(node, cfg, keep_io_types=True)¶
Cast node dtype.
- quantize_outputs(node, initializer_use_weight_qType=True, direct_int8=False)¶
Quantize node outputs.
- quantize_inputs(node, indices=None, initializer_use_weight_qType=True, direct_int8=False)¶
Quantize node inputs.
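Quantizing inputs and outputs follows the standard linear-quantization mapping used by ONNX QuantizeLinear/DequantizeLinear. A minimal sketch of the arithmetic (the helper names are illustrative, not this module's API):

```python
def quantize_array(data, scale, zero_point, qmin=-128, qmax=127):
    """Linear quantization: q = clip(round(x / scale) + zero_point, qmin, qmax)."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in data]

def dequantize_array(qdata, scale, zero_point):
    """Inverse mapping: x ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qdata]
```

Dequantization only approximately inverts quantization: the rounding and clipping steps lose information, which is exactly the error that calibration-chosen scales and zero points try to minimize.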
- quantize_bias_tensor(node)¶
Quantize bias.
- quantize_bias(bias_name, input_name, weight_name, beta=1.0)¶
Quantize the bias.
Zero Point == 0 and Scale == Input_Scale * Weight_Scale
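The bias rule above (zero point 0, scale equal to the product of the input and weight scales) keeps the int32 bias directly addable to the int32 accumulator of the quantized matmul/conv. A sketch of the arithmetic, with illustrative names:

```python
def quantize_bias(bias, input_scale, weight_scale, beta=1.0):
    """Quantize a bias to int32 with zero point 0 and
    scale = input_scale * weight_scale (optionally scaled by beta).
    Illustrative sketch, not the module's implementation."""
    bias_scale = input_scale * weight_scale * beta
    int32_min, int32_max = -(2 ** 31), 2 ** 31 - 1
    return [max(int32_min, min(int32_max, round(b / bias_scale))) for b in bias]
```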
- quantize_weights_per_channel(node, indices, weight_qType, scheme, axis)¶
Quantize weights per-channel.
- quantize_weight_per_channel(weight_name, weight_qType, scheme, channel_axis)¶
Quantize weight per-channel.
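Per-channel quantization computes one scale per slice along `channel_axis` instead of a single scale for the whole tensor, which preserves accuracy when channel magnitudes differ widely. A symmetric-scheme sketch for a 2-D weight stored as nested lists (names and simplifications are mine, not the module's API):

```python
def quantize_weight_per_channel(weight, channel_axis=0, qmax=127):
    """Symmetric per-channel quantization of a 2-D weight (list of rows).
    Returns (quantized_weight, per_channel_scales). Illustrative sketch."""
    # Move the channel axis to the front so each entry is one channel slice.
    channels = weight if channel_axis == 0 else list(map(list, zip(*weight)))
    scales, quantized = [], []
    for ch in channels:
        # Symmetric scheme: scale from the max absolute value per channel.
        scale = max(abs(v) for v in ch) / qmax or 1.0  # avoid scale == 0
        scales.append(scale)
        quantized.append([round(v / scale) for v in ch])
    return quantized, scales
```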
- static tensor_proto_to_array(initializer)¶
Convert TensorProto to array.
- get_bias_add_nodes(node, weight_name, last_output, quantized_bias_name)¶
Given a node, handle the bias add by inserting a “reshape” node on the bias followed by an “add” node.
- Parameters:
node (NodeProto) – current node (Conv)
weight_name (string) – weight name
last_output (_type_) – output of previous node (input to bias add)
quantized_bias_name (string) – bias name
- is_valid_quantize_weight(weight_name)¶
Check weight can be quantized.
- dequantize_tensor(node, value_name)¶
Dequantize tensor.