neural_compressor.adaptor.tf_utils.smooth_quant_scaler

TensorFlow utilities for scaling model weights and activations for Smooth Quantization.

Module Contents

Classes

SmoothQuantScaler

A class for scaling model weights using the Smooth Quantization method.

SmoothQuantScalerLLM

A class for scaling model weights of TensorFlow LLMs using the Smooth Quantization method.

class neural_compressor.adaptor.tf_utils.smooth_quant_scaler.SmoothQuantScaler(model, dataloader, alpha, scales_per_op)

A class for scaling model weights using the Smooth Quantization method.

Parameters:
  • model – TensorFlow model to be scaled

  • dataloader – TensorFlow dataloader for the dataset

  • alpha – float, the smoothing strength; it balances quantization difficulty between activations and weights

  • scales_per_op – bool; if True, each op has its own scale; if False, ops with the same input share a scale
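To make the role of alpha concrete, here is a minimal pure-Python sketch of the general Smooth Quantization idea. It is not this class's implementation: the helper names (smooth_scales, matmul) and the toy data are hypothetical, and it assumes the commonly used per-input-channel formula s_j = max|X_j|^alpha / max|W_j|^(1 - alpha). Dividing the activations by s and multiplying the matching weight rows by s leaves the matmul result unchanged while shrinking activation outliers.

```python
def smooth_scales(act_absmax, weight, alpha):
    """Hypothetical helper: s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)."""
    scales = []
    for j, a_max in enumerate(act_absmax):
        w_max = max(abs(v) for v in weight[j])  # row j holds input channel j
        scales.append(a_max ** alpha / w_max ** (1.0 - alpha))
    return scales


def matmul(x, w):
    """Plain (batch, in) x (in, out) matrix product on nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


# Toy activations, shape (batch, in_channels); channel 1 carries outliers.
x = [[1.0, 40.0], [-2.0, -60.0]]
# Toy weight, shape (in_channels, out_channels).
w = [[0.5, -0.25], [0.1, 0.2]]

# Per-channel absolute maximum of the activations (normally from calibration).
act_absmax = [max(abs(row[j]) for row in x) for j in range(len(x[0]))]
s = smooth_scales(act_absmax, w, alpha=0.5)

# Move the scale out of the activations and into the weights.
x_smooth = [[v / s[j] for j, v in enumerate(row)] for row in x]
w_smooth = [[v * s[j] for v in row] for j, row in enumerate(w)]

y_ref = matmul(x, w)
y_smooth = matmul(x_smooth, w_smooth)
```

In this sketch a larger alpha pushes more of the dynamic range from the activations into the weights, and a smaller alpha does the opposite; y_ref and y_smooth agree up to floating-point error.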

class neural_compressor.adaptor.tf_utils.smooth_quant_scaler.SmoothQuantScalerLLM(graph_def, alpha, scales_per_op, op_types)

A class for scaling model weights of TensorFlow LLMs using the Smooth Quantization method.

Parameters:
  • graph_def – graph_def of the model to be scaled

  • alpha – float, the smoothing strength; it balances quantization difficulty between activations and weights

  • scales_per_op – bool; if True, each op has its own scale; if False, ops with the same input share a scale

  • op_types
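The difference between the two scales_per_op modes can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration, not the library's code: the helper names and the op names q_proj and k_proj are hypothetical, and the shared mode is assumed to take the per-channel weight maximum jointly over every op that reads the same input.

```python
def weight_absmax(weight):
    """Per-input-channel absolute maximum; row j holds input channel j."""
    return [max(abs(v) for v in row) for row in weight]


def scales_from(act_absmax, w_absmax, alpha):
    """s_j = max|X_j|**alpha / max|W_j|**(1 - alpha) for each input channel j."""
    return [a ** alpha / m ** (1.0 - alpha) for a, m in zip(act_absmax, w_absmax)]


def per_op_scales(act_absmax, weights, alpha):
    """One scale vector per consumer op (individual scales)."""
    return {
        name: scales_from(act_absmax, weight_absmax(w), alpha)
        for name, w in weights.items()
    }


def shared_scales(act_absmax, weights, alpha):
    """One scale vector for all ops that read the same input (assumed joint max)."""
    per_op_max = [weight_absmax(w) for w in weights.values()]
    joint_max = [max(channel) for channel in zip(*per_op_max)]
    return scales_from(act_absmax, joint_max, alpha)


# Two hypothetical ops consuming the same activation tensor.
act_absmax = [4.0, 16.0]
weights = {
    "q_proj": [[1.0, -0.5], [0.25, 0.25]],
    "k_proj": [[0.25, 0.1], [1.0, -1.0]],
}

individual = per_op_scales(act_absmax, weights, alpha=0.5)
shared = shared_scales(act_absmax, weights, alpha=0.5)
```

Under these assumptions, individual scales fit each op's weights more closely but require the shared input to be rescaled separately for every consumer, whereas a shared scale needs only one rescaling of that input.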