neural_compressor.adaptor.tf_utils.smooth_quant_calibration

TensorFlow model calibration process for Smooth Quantization.

Module Contents

Classes

SmoothQuantCalibration

A class for performing smooth quantization calibration on a TensorFlow model.

SmoothQuantCalibrationLLM

A class for performing smooth quantization calibration on a TensorFlow LLM.

class neural_compressor.adaptor.tf_utils.smooth_quant_calibration.SmoothQuantCalibration(model, dataloader, iterations, op_types, percentile)

A class for performing smooth quantization calibration on a TensorFlow model.

Parameters:
  • model (Model) – The TensorFlow wrapper model to be calibrated.

  • dataloader (DataLoader) – The data loader for the calibration dataset.

  • iterations (int) – The number of iterations to run the calibration process.

  • op_types (List[str]) – The types of operations to be quantized.

  • percentile (float) – The percentile of the calibration data used to remove outliers.
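
The role of percentile can be illustrated with a small, self-contained sketch. It assumes the percentile is applied per channel to the values collected during calibration, so that a few extreme activations do not dominate the per-channel range. This page does not confirm that detail, and the helper names percentile_value and per_channel_threshold are hypothetical, not part of neural_compressor.

```python
from typing import List, Sequence


def percentile_value(values: Sequence[float], q: float) -> float:
    """Return the q-th percentile (0-100) of values using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 100.0:
        raise ValueError("q must be in the range [0, 100]")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def per_channel_threshold(rows: Sequence[Sequence[float]], q: float) -> List[float]:
    """Compute one percentile threshold per channel (column).

    Assumption for this sketch: each row is one calibration sample and each
    column is one channel.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    num_channels = len(rows[0])
    return [
        percentile_value([row[channel] for row in rows], q)
        for channel in range(num_channels)
    ]


# Channel 0 contains one outlier (100.0); channel 1 does not.
samples = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [100.0, 50.0],
]
full_range = per_channel_threshold(samples, 100.0)  # keeps the outlier
clipped = per_channel_threshold(samples, 75.0)      # ignores the outlier
```

With percentile set to 100.0 the outlier defines the range of channel 0, while a lower value such as 75.0 yields a range that reflects the bulk of the data.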

class neural_compressor.adaptor.tf_utils.smooth_quant_calibration.SmoothQuantCalibrationLLM(model_path, dataloader, iterations, op_types, percentile, temp_path, weight_name_mapping)

A class for performing smooth quantization calibration on a TensorFlow LLM.

Parameters:
  • model_path (str) – A path to the original TensorFlow model.

  • dataloader (DataLoader) – The data loader for the calibration dataset.

  • iterations (int) – The number of iterations to run the calibration process.

  • op_types (List[str]) – The types of operations to be quantized.

  • percentile (float) – The percentile of the calibration data used to remove outliers.

  • eval_func (function) – The function used to run inference on the model.

  • temp_path (str) – The temporary path used to store the intermediate model.

  • weight_name_mapping (function) – A function that converts a weight tensor name in the AutoTrackable object to a node name in the graph_def.
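
A weight_name_mapping callable could look like the sketch below. It takes a weight tensor name and returns the corresponding graph node name using plain string replacement. The function name example_weight_name_mapping, the scope prefix "my_llm", and the specific substitutions are all hypothetical; the real mapping depends on how the model was exported and has to be worked out by inspecting the model's variable names and graph_def nodes.

```python
def example_weight_name_mapping(name: str) -> str:
    """Map a weight tensor name to a graph node name (hypothetical rules).

    Assumptions for this sketch only:
    - the model's variables live under the scope "my_llm";
    - the exported graph wraps them under "StatefulPartitionedCall";
    - dense kernels are read through a "Tensordot/ReadVariableOp" node.
    """
    name = name.replace("my_llm", "StatefulPartitionedCall")
    name = name.replace("kernel:0", "Tensordot/ReadVariableOp")
    return name


mapped = example_weight_name_mapping("my_llm/layer_0/dense/kernel:0")
```

A callable of this shape is what would be passed as the weight_name_mapping argument; names that match none of the rules are returned unchanged.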