neural_compressor.adaptor.tf_utils.graph_converter

Graph Converter Class.

Module Contents

Classes

Graph Converter Class is used to generate the quantization graph.
- class neural_compressor.adaptor.tf_utils.graph_converter.GraphConverter(model, qt_config={}, recipes={}, int8_sequences={}, fp32_ops=[], bf16_ops=[], data_loader=None, fake_quant=False, itex_mode=False, qdq_enabled=False, new_api=False, performance_only=False, use_bf16=False)

Graph Converter Class is used to generate the quantization graph.
- convert()

Do conversion, including:
- optimizing the fp32 frozen graph,
- quantizing the graph,
- calibration,
- fusing RequantizeOp with the fused quantized conv, and so on,
- bf16 conversion if self.bf16_ops is not empty.
- Returns:
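The calibration step in the list above can be sketched as follows. Note that `calibrate` is a hypothetical helper for illustration only, not GraphConverter's internal API: it shows how per-tensor min/max ranges are gathered from activation batches produced by a calibration data loader, which later become the quantization ranges.

```python
import numpy as np

def calibrate(activation_batches):
    """Collect a global (min, max) range over calibration batches.

    activation_batches: list of numpy arrays, one per calibration batch,
    holding the fp32 activations observed at a given tensor.
    """
    lo = min(float(a.min()) for a in activation_batches)
    hi = max(float(a.max()) for a in activation_batches)
    return lo, hi

# Two illustrative calibration batches for one activation tensor.
batches = [np.array([-1.5, 0.2, 3.0]), np.array([0.0, 4.5, -0.5])]
lo, hi = calibrate(batches)  # range used to pick scale/zero-point later
```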
- quantize()

Quantize the graph only (without optimizing the fp32 graph), including:
- quantizing the graph,
- calibration,
- fusing RequantizeOp with the fused quantized conv, and so on.
- Returns:
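Once a calibrated range is known, quantizing a tensor reduces to an affine mapping into int8. The sketch below is a minimal per-tensor affine scheme for illustration; the actual quantized TensorFlow kernels and range handling inside GraphConverter differ in details.

```python
import numpy as np

def quantize_int8(x, lo, hi):
    """Affine-quantize fp32 values with calibrated range [lo, hi] to int8.

    Returns the int8 tensor plus the (scale, zero_point) needed to
    dequantize it back to fp32.
    """
    scale = (hi - lo) / 255.0          # map the range onto 256 int8 levels
    zero_point = round(-lo / scale) - 128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

# Symmetric range example: zero maps exactly to int8 value 0.
q, scale, zp = quantize_int8(np.array([-1.0, 0.0, 1.0]), -1.0, 1.0)
```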
- bf16_convert()

Convert the fp32 nodes in bf16_node to bf16 dtype, based on the FP32 + INT8 mixed-precision graph.
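Numerically, bf16 keeps fp32's exponent but only 7 mantissa bits, so the simplest conversion is truncating the low 16 bits of each fp32 value. The sketch below illustrates that bit-level relationship with NumPy (rounding is omitted for brevity); it is not how the TensorFlow graph rewrite is implemented.

```python
import numpy as np

def fp32_to_bf16_truncate(x):
    """Emulate fp32 -> bf16 by keeping only the upper 16 bits.

    bf16 shares fp32's sign and 8-bit exponent, but has a 7-bit
    mantissa, so zeroing the low 16 mantissa bits yields the nearest
    lower-precision value (truncation, not round-to-nearest-even).
    """
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

out = fp32_to_bf16_truncate([1.0, 3.14159])  # 1.0 survives; pi loses precision
```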
- quantize_with_qdq_pattern()

Quantize the model by inserting QDQ pairs.
- Step 1: insert QDQ pairs and update node info.
- Step 2: convert the Q-DQ-node-Q-DQ pattern to Q-newAPI-node-DQ.
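The QDQ pairs inserted in step 1 can be understood as a quantize-then-dequantize round trip: the tensor stays in float, but its values are snapped onto the int8 grid, which is what the later graph rewrite folds into real quantized ops. The following is a minimal numeric sketch of one Q-DQ pair, not the node-insertion logic itself.

```python
import numpy as np

def qdq(x, scale, zero_point=0):
    """One Q-DQ pair: quantize to the int8 grid, then dequantize.

    The output is still fp32, but every value is now exactly
    representable as (q - zero_point) * scale for some int8 q,
    so downstream rewrites can replace the pair with real int8 ops.
    """
    q = np.clip(np.round(np.asarray(x) / scale) + zero_point, -128, 127)
    return (q - zero_point) * scale

snapped = qdq([0.26], scale=0.1)     # snapped onto the 0.1-spaced grid
clipped = qdq([100.0], scale=0.1)    # saturates at 127 * 0.1
```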