:orphan:

:py:mod:`neural_compressor.onnxrt.quantization.config`
======================================================

.. py:module:: neural_compressor.onnxrt.quantization.config


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   neural_compressor.onnxrt.quantization.config.RTNConfig
   neural_compressor.onnxrt.quantization.config.GPTQConfig
   neural_compressor.onnxrt.quantization.config.AWQConfig
   neural_compressor.onnxrt.quantization.config.SmoohQuantConfig


Functions
~~~~~~~~~

.. autoapisummary::

   neural_compressor.onnxrt.quantization.config.get_default_rtn_config
   neural_compressor.onnxrt.quantization.config.get_default_gptq_config
   neural_compressor.onnxrt.quantization.config.get_default_awq_config
   neural_compressor.onnxrt.quantization.config.get_default_sq_config


Attributes
~~~~~~~~~~

.. autoapisummary::

   neural_compressor.onnxrt.quantization.config.FRAMEWORK_NAME


.. py:class:: RTNConfig(weight_dtype: str = 'int', weight_bits: int = 4, weight_group_size: int = 32, weight_sym: bool = True, act_dtype: str = 'fp32', accuracy_level: int = 0, providers: List[str] = ['CPUExecutionProvider'], layer_wise_quant: bool = False, white_list: List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE] = DEFAULT_WHITE_LIST)

   Config class for round-to-nearest (RTN) weight-only quantization.


.. py:function:: get_default_rtn_config() -> RTNConfig

   Generate the default RTN config.

   :returns: the default RTN config.


.. py:class:: GPTQConfig(weight_dtype: str = 'int', weight_bits: int = 4, weight_group_size: int = 32, weight_sym: bool = True, act_dtype: str = 'fp32', accuracy_level: int = 0, percdamp: float = 0.01, blocksize: int = 128, actorder: bool = False, mse: bool = False, perchannel: bool = True, providers: List[str] = ['CPUExecutionProvider'], layer_wise_quant: bool = False, white_list: List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE] = DEFAULT_WHITE_LIST)

   Config class for GPTQ weight-only quantization.


.. py:function:: get_default_gptq_config() -> GPTQConfig

   Generate the default GPTQ config.

   :returns: the default GPTQ config.


.. py:class:: AWQConfig(weight_dtype: str = 'int', weight_bits: int = 4, weight_group_size: int = 32, weight_sym: bool = True, act_dtype: str = 'fp32', accuracy_level: int = 0, enable_auto_scale: bool = True, enable_mse_search: bool = True, providers: List[str] = ['CPUExecutionProvider'], white_list: List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE] = DEFAULT_WHITE_LIST)

   Config class for AWQ weight-only quantization.


.. py:function:: get_default_awq_config() -> AWQConfig

   Generate the default AWQ config.

   :returns: the default AWQ config.


.. py:class:: SmoohQuantConfig(alpha: float = 0.5, folding: bool = True, op_types: List[str] = ['Gemm', 'Conv', 'MatMul', 'FusedConv'], calib_iter: int = 100, scales_per_op: bool = True, auto_alpha_args: dict = {'alpha_min': 0.3, 'alpha_max': 0.7, 'alpha_step': 0.05, 'attn_method': 'min'}, providers: List[str] = ['CPUExecutionProvider'], white_list: List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE] = DEFAULT_WHITE_LIST, **kwargs)

   SmoothQuant quantization config.


.. py:function:: get_default_sq_config() -> SmoohQuantConfig

   Generate the default SmoothQuant config.

   :returns: the default SmoothQuant config.
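``RTNConfig``'s ``weight_bits``, ``weight_group_size``, and ``weight_sym`` parameters describe a round-to-nearest weight-only scheme. The sketch below illustrates what symmetric per-group RTN computes in plain Python; it is an illustration of the technique only, not the library's implementation:

```python
def rtn_quantize(weights, weight_bits=4, weight_group_size=32):
    """Symmetric round-to-nearest (RTN) sketch: quantize each group of
    `weight_group_size` float weights to signed `weight_bits`-bit
    integers, then dequantize back to floats."""
    qmax = 2 ** (weight_bits - 1) - 1  # e.g. 7 for 4-bit signed
    out = []
    for start in range(0, len(weights), weight_group_size):
        group = weights[start:start + weight_group_size]
        # One scale per group, from the group's absolute maximum.
        scale = (max(abs(w) for w in group) / qmax) or 1.0  # avoid /0
        # Round to nearest integer level, clamp to the signed range.
        q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in group]
        out.extend(v * scale for v in q)
    return out
```

Smaller ``weight_group_size`` means more scales and lower quantization error at the cost of extra metadata; the per-element error is bounded by half of the group's scale.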
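``SmoohQuantConfig``'s ``alpha`` parameter (the class name carries the library's own spelling) controls how much quantization difficulty is migrated from activations to weights. The standard SmoothQuant rule derives a per-channel scale ``s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)``; activations are divided by ``s_j`` and the matching weight rows multiplied by ``s_j``, leaving the matmul result unchanged while shrinking activation outliers. A minimal sketch of that rule (not the library's implementation):

```python
def smooth_scales(act_max, weight_max, alpha=0.5):
    """Per-channel SmoothQuant smoothing scales from absolute-max
    statistics. alpha=1.0 pushes all difficulty onto the weights;
    alpha=0.0 leaves activations untouched."""
    return [
        (a ** alpha) / (w ** (1.0 - alpha)) if a > 0 and w > 0 else 1.0
        for a, w in zip(act_max, weight_max)
    ]
```

With the default ``alpha = 0.5``, a channel whose activations peak at 16 but whose weights peak at 1 gets scale 4: its activation range drops 4x while the weight range grows 4x, balancing the two.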