:orphan:

:py:mod:`neural_compressor.adaptor.torch_utils.waq.smooth_quant`
================================================================

.. py:module:: neural_compressor.adaptor.torch_utils.waq.smooth_quant


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   neural_compressor.adaptor.torch_utils.waq.smooth_quant.TorchSmoothQuant


.. py:class:: TorchSmoothQuant(model, dataloader=None, example_inputs=None, q_func=None, traced_model=None)

   Fake input-channel quantization. For more details, please refer to:

   [1] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

   [2] SPIQ: Data-Free Per-Channel Static Input Quantization

   Currently, only layers whose smooth scale can be absorbed are handled; other layers will be
   supported later. Only in-place mode is supported, meaning the model weights are modified;
   call the recover function to restore the original weights if needed.
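
The class above applies the SmoothQuant transform described in reference [1]: per input channel, activation outliers are scaled down and the scale is folded into the following weight, leaving the layer's output mathematically unchanged. The snippet below is a minimal NumPy sketch of that idea, not the API of ``TorchSmoothQuant`` itself; the function name ``smooth_scales`` and the toy data are illustrative assumptions.

.. code-block:: python

   import numpy as np

   def smooth_scales(act_max, weight_max, alpha=0.5):
       # SmoothQuant scale per input channel j:
       #   s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)
       # alpha balances how much quantization difficulty is migrated
       # from activations to weights (illustrative helper, not the class API).
       scales = np.power(act_max, alpha) / np.power(weight_max, 1.0 - alpha)
       return np.clip(scales, 1e-5, None)  # avoid division by ~0 later

   # Toy linear layer y = x @ W.T with an outlier-heavy activation channel 0.
   rng = np.random.default_rng(0)
   x = rng.normal(size=(4, 8)) * np.array([10.0, 1, 1, 1, 1, 1, 1, 1])
   w = rng.normal(size=(16, 8))

   s = smooth_scales(np.abs(x).max(axis=0), np.abs(w).max(axis=0), alpha=0.5)

   # Migrate difficulty: divide activations by s, fold s into the weight.
   x_s = x / s
   w_s = w * s  # broadcasts over the input-channel dimension

   y_ref = x @ w.T
   y_smooth = x_s @ w_s.T
   assert np.allclose(y_ref, y_smooth)  # the transform is output-preserving

In the real class, the activation statistics come from ``dataloader`` or ``example_inputs`` calibration runs, and the per-channel scale is absorbed into an adjacent layer's weights in place, which is why a recover step is offered.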