neural_compressor.tensorflow.algorithms.smoother.scaler
TensorFlow utilities for scaling model weights and activations for Smooth Quantization.
Classes
SmoothQuantScaler | A class for scaling model weights using the Smooth Quantization method.
SmoothQuantScalerLLM | A class for scaling model weights of TF LLM models using the Smooth Quantization method.
Module Contents
- class neural_compressor.tensorflow.algorithms.smoother.scaler.SmoothQuantScaler(model, dataloader, alpha, scales_per_op)
A class for scaling model weights using the Smooth Quantization method.
- Parameters:
model – the TensorFlow model to be scaled
dataloader – the TensorFlow dataloader for the dataset
alpha – float, the smoothing strength used when computing the scaling factors
scales_per_op – bool; if True, each op has an individual scale; if False, ops with the same input share a scale
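To make the role of alpha concrete, here is a minimal NumPy sketch of the scale formula from the SmoothQuant method, s_j = max|X_j|^alpha / max|W_j|^(1 - alpha). It is not this module's implementation and does not call SmoothQuantScaler; the helper name smooth_scales, the eps clamp, and the toy tensors are assumptions made for illustration.

```python
import numpy as np

def smooth_scales(act_absmax, weight, alpha=0.5, eps=1e-5):
    """Hypothetical helper: per-input-channel SmoothQuant scales."""
    # weight has shape (in_channels, out_channels); reduce over the output axis.
    w_absmax = np.abs(weight).max(axis=1)
    # Clamp to avoid dividing by zero on dead channels (an assumption of this sketch).
    act_absmax = np.maximum(act_absmax, eps)
    w_absmax = np.maximum(w_absmax, eps)
    return act_absmax ** alpha / w_absmax ** (1.0 - alpha)

rng = np.random.default_rng(0)
# Toy activations: channel 1 carries a large outlier range.
x = rng.normal(size=(8, 4)) * np.array([1.0, 50.0, 1.0, 0.1])
w = rng.normal(size=(4, 3))

s = smooth_scales(np.abs(x).max(axis=0), w, alpha=0.5)
x_smooth = x / s            # activations are divided per channel
w_smooth = w * s[:, None]   # weights are multiplied per input channel

# The matmul result is unchanged, so the layer output is mathematically equivalent.
assert np.allclose(x @ w, x_smooth @ w_smooth)
```

With alpha = 0.5 the per-channel maxima of the smoothed activations and smoothed weights coincide; a larger alpha moves more of the quantization difficulty from the activations to the weights.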
- class neural_compressor.tensorflow.algorithms.smoother.scaler.SmoothQuantScalerLLM(graph_def, alpha, scales_per_op, op_types)
A class for scaling model weights of TF LLM models using the Smooth Quantization method.
- Parameters:
graph_def – the GraphDef of the model to be scaled
alpha – float, the smoothing strength used when computing the scaling factors
scales_per_op – bool; if True, each op has an individual scale; if False, ops with the same input share a scale
op_types
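The difference between the two scales_per_op modes can be sketched numerically. The example below is an illustration only: it assumes that a shared scale is derived from the elementwise maximum of the per-channel weight maxima of all ops consuming the same input, which may differ from what this module does internally. The helper channel_scales and the toy values are hypothetical.

```python
import numpy as np

def channel_scales(act_absmax, w_absmax, alpha):
    """Hypothetical helper: SmoothQuant scale per input channel."""
    return act_absmax ** alpha / w_absmax ** (1.0 - alpha)

alpha = 0.5
act_absmax = np.array([4.0, 16.0])            # per-channel maxima of the shared input
w_a = np.array([[1.0, -2.0], [0.5, 0.25]])    # op A weights, shape (in, out)
w_b = np.array([[4.0, 1.0], [0.25, -0.25]])   # op B weights, shape (in, out)

absmax_a = np.abs(w_a).max(axis=1)
absmax_b = np.abs(w_b).max(axis=1)

# Individual scales: each op gets its own scale vector.
scale_a = channel_scales(act_absmax, absmax_a, alpha)
scale_b = channel_scales(act_absmax, absmax_b, alpha)

# Shared scale (assumed rule): one vector from the combined weight maxima,
# so the shared input only has to be rescaled once.
shared = channel_scales(act_absmax, np.maximum(absmax_a, absmax_b), alpha)
```

A shared scale lets a single rescaled activation tensor feed every consumer, while individual scales fit each op's weights more closely at the cost of rescaling the input separately for each op.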