neural_compressor.adaptor.tf_utils.graph_converter

Graph Converter Class.

Module Contents

Classes

GraphConverter

Graph Converter Class is used to generate the quantization graph.

class neural_compressor.adaptor.tf_utils.graph_converter.GraphConverter(model, qt_config={}, recipes={}, int8_sequences={}, fp32_ops=[], bf16_ops=[], data_loader=None, fake_quant=False, itex_mode=False, qdq_enabled=False, new_api=False, performance_only=False, use_bf16=False)

Graph Converter Class is used to generate the quantization graph.

convert()

Do conversion.

Including:
  1. optimize fp32_frozen_graph,

  2. quantize graph,

  3. calibration,

  4. fuse RequantizeOp with fused quantized conv, and so on,

  5. bf16 conversion if self.bf16_ops is not empty.

Returns:
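The five stages above run as a sequence of graph passes. The following toy sketch (pure Python, all names illustrative; the real GraphConverter operates on TensorFlow GraphDef protobufs, not on this simplified list-of-ops representation) shows the control flow only:

```python
# Toy sketch of the convert() pass pipeline. All helper names here are
# hypothetical; the real implementation rewrites GraphDef protobufs.

def optimize_fp32(graph):
    # stage 1: fp32 graph optimization (e.g. constant folding)
    return graph

def quantize_graph(graph):
    # stage 2: rewrite quantizable ops into their int8 form
    return [("Quantized" + op if op == "Conv2D" else op) for op in graph]

def calibrate(graph, data_loader):
    # stage 3: run sample data to collect min/max ranges (no-op here)
    return graph

def fuse_requantize(graph):
    # stage 4: fuse RequantizeOp into the preceding quantized conv
    return [op.replace("QuantizedConv2D", "QuantizedConv2DAndRequantize")
            for op in graph]

def bf16_convert(graph, bf16_ops):
    # stage 5: only runs when bf16_ops is non-empty
    if not bf16_ops:
        return graph
    return [op + "_bf16" if op in bf16_ops else op for op in graph]

def convert(graph, data_loader=None, bf16_ops=()):
    graph = optimize_fp32(graph)
    graph = quantize_graph(graph)
    graph = calibrate(graph, data_loader)
    graph = fuse_requantize(graph)
    return bf16_convert(graph, bf16_ops)

print(convert(["Conv2D", "Relu"]))
# -> ['QuantizedConv2DAndRequantize', 'Relu']
```

Note that stage 5 is skipped entirely when `bf16_ops` is empty, matching the condition in the step list above.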

quantize()

Quantize graph only (without optimizing fp32 graph).

Including:
  1. quantize graph,

  2. calibration,

  3. fuse RequantizeOp with fused quantized conv, and so on.

Returns:

bf16_convert()

Convert fp32 nodes listed in bf16_node to bf16 dtype, based on the FP32 + INT8 mixed-precision graph.
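The effect of this pass can be sketched as a dtype rewrite: fp32 nodes named in the bf16 list are switched to bf16, while nodes already quantized to int8 are left untouched. The node/dtype representation below is a hypothetical simplification, not the real GraphDef structure:

```python
# Toy sketch of bf16 conversion on a mixed-precision graph. Only fp32
# nodes whose names appear in bf16_ops are converted; int8 (quantized)
# nodes are left alone. The (name, dtype) tuples are illustrative only.

def bf16_convert(nodes, bf16_ops):
    converted = []
    for name, dtype in nodes:
        if dtype == "fp32" and name in bf16_ops:
            converted.append((name, "bf16"))
        else:
            converted.append((name, dtype))
    return converted

mixed = [("QuantizedConv2D", "int8"), ("Relu", "fp32"), ("MatMul", "fp32")]
print(bf16_convert(mixed, bf16_ops={"MatMul"}))
# -> [('QuantizedConv2D', 'int8'), ('Relu', 'fp32'), ('MatMul', 'bf16')]
```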

quantize_with_qdq_pattern()

Quantize model by inserting QDQ.

step 1: insert QDQ pairs and update node info

step 2: convert Q-DQ-node-Q-DQ to Q-newAPI node-DQ
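The two steps can be sketched on a toy string-based graph: step 1 surrounds each quantizable node with quantize/dequantize ("Q"/"DQ") pairs, and step 2 folds the resulting Q-DQ-node-Q-DQ pattern into a Q-newAPI node-DQ triple. All names and the graph representation below are illustrative assumptions, not the library's actual data structures:

```python
# Toy sketch of QDQ-pattern quantization. The flat op-name list stands in
# for a real TensorFlow graph; "_Quantized<op>" stands in for the newAPI
# fused quantized op.

def insert_qdq(ops, quantizable):
    # step 1: surround each quantizable op with Q/DQ pairs
    out = []
    for op in ops:
        if op in quantizable:
            out += ["Q", "DQ", op, "Q", "DQ"]
        else:
            out.append(op)
    return out

def fold_to_new_api(ops):
    # step 2: Q DQ <op> Q DQ  ->  Q _Quantized<op> DQ
    out, i = [], 0
    while i < len(ops):
        if (i + 4 < len(ops) and ops[i] == "Q" and ops[i + 1] == "DQ"
                and ops[i + 3] == "Q" and ops[i + 4] == "DQ"):
            out += ["Q", "_Quantized" + ops[i + 2], "DQ"]
            i += 5
        else:
            out.append(ops[i])
            i += 1
    return out

qdq = insert_qdq(["Conv2D", "Relu"], quantizable={"Conv2D"})
print(qdq)                  # -> ['Q', 'DQ', 'Conv2D', 'Q', 'DQ', 'Relu']
print(fold_to_new_api(qdq)) # -> ['Q', '_QuantizedConv2D', 'DQ', 'Relu']
```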