neural_compressor.tensorflow.quantization.utils.transform_graph.bias_correction

Bias correction graph transform.

Module Contents

Classes

BiasCorrection

This class implements the bias correction graph transform.

class neural_compressor.tensorflow.quantization.utils.transform_graph.bias_correction.BiasCorrection(input_graph, fp32_graph, method='weight_empirical', new_api=False)

This class implements the bias correction graph transform.

Corrects the weight and scale of Conv2D ops.

weight_empirical: the goal is to bring the int8 weight distribution close to the fp32 weight distribution:

    r * (W_int8 + u) -> W_fp32

where r is the variance ratio between the fp32 and int8 weights, and u is the channel-wise difference between the fp32 and int8 weights. This is equivalent to minimizing:

    round(scale_c * (W_fp32 + shift)) / scale - r * (round(scale * W_fp32) + scale * u) / scale

Only the first rounding term, round(scale_c * (W_fp32 + shift)), can be changed. An empirical solution is to set:

    scale_c = r * scale
    shift = u

This corrects the weights without changing the min/max values.
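The empirical rule above can be illustrated with a small NumPy sketch. This is not the library's implementation. The function names, the symmetric 127 / max|W| scale, the output-channel-first weight layout, and the estimators for r (ratio of per-channel standard deviations) and u (difference of per-channel means) are all assumptions made for illustration. Only the update scale_c = r * scale and shift = u comes from the description above.

```python
import numpy as np


def per_channel_scale(w_fp32):
    """Assumed symmetric int8 scale per output channel (axis 0): 127 / max|w|."""
    flat = w_fp32.reshape(w_fp32.shape[0], -1)
    max_abs = np.maximum(np.abs(flat).max(axis=1), 1e-12)  # avoid divide-by-zero
    return 127.0 / max_abs


def quantize(w_fp32, scale, r=None, u=None):
    """Compute round(scale_c * (W_fp32 + shift)), scale_c = r * scale, shift = u.

    With r=None and u=None this is plain quantization (r = 1, u = 0).
    """
    n = w_fp32.shape[0]
    r = np.ones(n) if r is None else r
    u = np.zeros(n) if u is None else u
    bshape = (n,) + (1,) * (w_fp32.ndim - 1)  # broadcast one value per channel
    scale_c = (r * scale).reshape(bshape)
    shift = u.reshape(bshape)
    q = np.round(scale_c * (w_fp32 + shift))
    return np.clip(q, -127, 127).astype(np.int8)


def estimate_r_u(w_fp32, w_int8, scale):
    """Assumed estimators: r = std ratio, u = mean difference, per channel."""
    n = w_fp32.shape[0]
    fp32 = w_fp32.reshape(n, -1)
    dequantized = w_int8.reshape(n, -1).astype(np.float64) / scale[:, None]
    std_deq = np.maximum(dequantized.std(axis=1), 1e-12)
    r = fp32.std(axis=1) / std_deq
    u = fp32.mean(axis=1) - dequantized.mean(axis=1)
    return r, u


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3, 3, 8))  # hypothetical layout: output channel first

scale = per_channel_scale(w)
w_int8 = quantize(w, scale)                  # plain quantization
r, u = estimate_r_u(w, w_int8, scale)        # per-channel statistics
w_int8_corrected = quantize(w, scale, r, u)  # re-quantize with scale_c and shift
```

Because the correction is folded into the rounding step, the corrected weights are still dequantized with the original scale, which is why the stored min/max values stay the same.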