neural_compressor.adaptor.torch_utils.teq

Module Contents

Classes

TEQuantizer

Weight-only quantization with Trainable Equivalent Transformation (TEQ): a linear wrapper that applies a scale to the input.

class neural_compressor.adaptor.torch_utils.teq.TEQuantizer(model, weight_config={}, absorb_to_layer={}, extra_config={}, example_inputs=None)

Weight-only quantization with Trainable Equivalent Transformation (TEQ): a linear wrapper that applies a scale to the input.
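
The idea behind an "equivalent transformation" can be illustrated with a minimal, dependency-free sketch. This is not the TEQuantizer implementation: the helper names (linear, scale_weight_columns, scale_input), the fixed scale values, and the direction of the scaling (weight columns multiplied, input divided) are all assumptions chosen for illustration. In TEQ the scales would be trained rather than hard-coded. The sketch only shows why rescaling the input of a linear layer and compensating in the weight leaves the layer's output unchanged.

```python
# Conceptual sketch only: plain Python, not the neural_compressor code.
# All names below are hypothetical helpers, not library APIs.

def linear(weight, x):
    """Compute y = W @ x for a weight matrix (list of rows) and a vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def scale_weight_columns(weight, scale):
    """Multiply column j of W by scale[j], absorbing the scale into the weight."""
    return [[w * s for w, s in zip(row, scale)] for row in weight]

def scale_input(x, scale):
    """Divide input channel j by scale[j]; this plays the role of the wrapper."""
    return [v / s for v, s in zip(x, scale)]

weight = [[0.5, -2.0, 4.0],
          [1.0, 8.0, -0.25]]
x = [2.0, 1.0, -4.0]

# Assumed per-input-channel scales; a trainable method would learn these.
scale = [2.0, 0.5, 0.25]

reference = linear(weight, x)
transformed = linear(scale_weight_columns(weight, scale), scale_input(x, scale))

# The two outputs agree, so the transformation is mathematically equivalent.
max_abs_diff = max(abs(a - b) for a, b in zip(reference, transformed))
print(reference, transformed, max_abs_diff)
```

Because the output is preserved for any nonzero scale, the scale can be chosen to make the rescaled weight easier to quantize, which is the motivation for training it.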