neural_compressor.tensorflow.algorithms.smoother.calibration

Tensorflow model calibration process for Smooth Quantization.

Classes

`SmoothQuantCalibration`	A class for performing smooth quantization calibration on a Tensorflow model.
`SmoothQuantCalibrationLLM`	A class for performing smooth quantization calibration on a Tensorflow LLM model.

Module Contents

class neural_compressor.tensorflow.algorithms.smoother.calibration.SmoothQuantCalibration(model, dataloader, iterations, op_types, percentile)

A class for performing smooth quantization calibration on a Tensorflow model.

Parameters:

model (Model) – The Tensorflow wrapper model to be calibrated.
dataloader (DataLoader) – The data loader for the calibration dataset.
iterations (int) – The number of iterations to run the calibration process.
op_types (List[str]) – The types of operations to be quantized.
percentile (float) – The percentile used during calibration to remove outliers.
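The interaction between `iterations` and `percentile` can be illustrated with a small standalone sketch. It is not the library's implementation: the helper names `percentile_value` and `calibrate_abs_max` are hypothetical, and it assumes that calibration gathers absolute activation values from the first `iterations` batches and uses a percentile of them as an outlier-robust maximum.

```python
from itertools import islice


def percentile_value(values, percentile):
    """Return the given percentile (0-100) of values, with linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * percentile / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def calibrate_abs_max(batches, iterations, percentile):
    """Hypothetical helper: scan only the first `iterations` batches and
    return the percentile of the observed absolute values."""
    observed = []
    for batch in islice(batches, iterations):
        observed.extend(abs(v) for v in batch)
    return percentile_value(observed, percentile)


# Toy calibration data: the second batch contains one large outlier (50.0).
batches = [[0.1, -0.2, 0.3], [0.25, -0.15, 50.0], [9.0, 9.0, 9.0]]

# Only two batches are scanned, so the third batch never contributes.
threshold = calibrate_abs_max(batches, iterations=2, percentile=80)
```

With a percentile below 100, the single outlier no longer determines the calibrated range; with a percentile of 100, the result is simply the largest absolute value seen.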

class neural_compressor.tensorflow.algorithms.smoother.calibration.SmoothQuantCalibrationLLM(model_path, dataloader, iterations, op_types, percentile, temp_path, weight_name_mapping)

A class for performing smooth quantization calibration on a Tensorflow LLM model.

Parameters:

model_path (str) – A path to the original Tensorflow model.
dataloader (DataLoader) – The data loader for the calibration dataset.
iterations (int) – The number of iterations to run the calibration process.
op_types (List[str]) – The types of operations to be quantized.
percentile (float) – The percentile used during calibration to remove outliers.
eval_func (function) – The function used to run inference on the model.
temp_path (str) – The temporary path for storing the intermediate model.
weight_name_mapping (function) – A function that converts a weight tensor name in the AutoTrackable object to the corresponding node name in the graph_def.
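As a rough illustration of what a `weight_name_mapping` callable might look like, the sketch below maps a variable-style tensor name to a node-style name. The naming rules here (a trailing `:0` suffix and a `/ReadVariableOp` node) are assumptions chosen for the example; the real mapping depends on how a particular model was exported and must be written for that model.

```python
def weight_name_mapping(name):
    """Hypothetical mapping from a weight tensor name to a graph_def node name."""
    # Assumption: weight tensor names carry a ":0" output suffix that node names lack.
    if name.endswith(":0"):
        name = name[:-2]
    # Assumption: in this example graph, each weight is read through a ReadVariableOp node.
    return name + "/ReadVariableOp"


# Example input name, also hypothetical.
node_name = weight_name_mapping("dense/kernel:0")
```

A function of this shape would be passed as the `weight_name_mapping` argument, so that weights found in the loaded model can be matched to nodes in the graph.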