:orphan:

:py:mod:`neural_compressor.adaptor.torch_utils.waq.smooth_quant`
================================================================

.. py:module:: neural_compressor.adaptor.torch_utils.waq.smooth_quant


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   neural_compressor.adaptor.torch_utils.waq.smooth_quant.TorchSmoothQuant


.. py:class:: TorchSmoothQuant(model, dataloader=None, example_inputs=None, q_func=None, traced_model=None)

   Fake input-channel quantization. For more details, please refer to:

   [1] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

   [2] SPIQ: Data-Free Per-Channel Static Input Quantization

   Currently, only layers whose smooth scale can be absorbed are handled; other layers will be
   supported later. Only in-place mode is supported, meaning the model weights are modified;
   call the recover function to restore the original weights if needed.
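
The class above applies the SmoothQuant transform described in reference [1]: per input channel, activation outliers are scaled down and the scale is folded into the following weight, leaving the layer's output mathematically unchanged. The snippet below is a minimal NumPy sketch of that idea, not the API of ``TorchSmoothQuant`` itself; the function name ``smooth_scales`` and the toy data are illustrative assumptions.

.. code-block:: python

   import numpy as np

   def smooth_scales(act_max, weight_max, alpha=0.5):
       # SmoothQuant scale per input channel j:
       #   s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)
       # alpha balances how much quantization difficulty is migrated
       # from activations to weights (illustrative helper, not the class API).
       scales = np.power(act_max, alpha) / np.power(weight_max, 1.0 - alpha)
       return np.clip(scales, 1e-5, None)  # avoid division by ~0 later

   # Toy linear layer y = x @ W.T with an outlier-heavy activation channel 0.
   rng = np.random.default_rng(0)
   x = rng.normal(size=(4, 8)) * np.array([10.0, 1, 1, 1, 1, 1, 1, 1])
   w = rng.normal(size=(16, 8))

   s = smooth_scales(np.abs(x).max(axis=0), np.abs(w).max(axis=0), alpha=0.5)

   # Migrate difficulty: divide activations by s, fold s into the weight.
   x_s = x / s
   w_s = w * s  # broadcasts over the input-channel dimension

   y_ref = x @ w.T
   y_smooth = x_s @ w_s.T
   assert np.allclose(y_ref, y_smooth)  # the transform is output-preserving

In the real class, the activation statistics come from ``dataloader`` or ``example_inputs`` calibration runs, and the per-channel scale is absorbed into an adjacent layer's weights in place, which is why a recover step is offered.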