:orphan:

:py:mod:`neural_compressor.torch.quantization.config`
=====================================================

.. py:module:: neural_compressor.torch.quantization.config


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   neural_compressor.torch.quantization.config.RTNConfig
   neural_compressor.torch.quantization.config.GPTQConfig
   neural_compressor.torch.quantization.config.HQQConfig


Functions
~~~~~~~~~

.. autoapisummary::

   neural_compressor.torch.quantization.config.get_default_rtn_config
   neural_compressor.torch.quantization.config.get_default_gptq_config
   neural_compressor.torch.quantization.config.get_default_hqq_config
   neural_compressor.torch.quantization.config.get_woq_tuning_config


.. py:class:: RTNConfig(dtype: str = 'int', bits: int = 4, use_sym: bool = True, group_size: int = 32, group_dim: int = 1, use_full_range: bool = False, use_mse_search: bool = False, use_layer_wise: bool = False, model_path: str = '', use_double_quant: bool = False, double_quant_dtype: str = 'int', double_quant_bits: int = 8, double_quant_use_sym: bool = False, double_quant_group_size: int = 256, white_list: Optional[List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE]] = DEFAULT_WHITE_LIST)

   Config class for round-to-nearest (RTN) weight-only quantization.


.. py:function:: get_default_rtn_config() -> RTNConfig

   Generate the default RTN config.

   :returns: the default RTN config.


.. py:class:: GPTQConfig(dtype: str = 'int', bits: int = 4, use_sym: bool = True, group_size: int = 32, use_mse_search: bool = False, use_layer_wise: bool = False, model_path: str = '', use_double_quant: bool = False, double_quant_dtype: str = 'int', double_quant_bits: int = 8, double_quant_use_sym: bool = False, double_quant_group_size: int = 256, act_order: bool = False, percdamp: float = 0.01, block_size: int = 2048, static_groups: bool = False, white_list: Optional[List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE]] = DEFAULT_WHITE_LIST)

   Config class for GPTQ.

   GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers.
   https://arxiv.org/abs/2210.17323


.. py:function:: get_default_gptq_config() -> GPTQConfig

   Generate the default GPTQ config.

   :returns: the default GPTQ config.


.. py:class:: HQQConfig(bits: int = 4, group_size: int = 64, quant_zero: bool = True, quant_scale: bool = False, scale_quant_group_size: int = 128, skip_lm_head: bool = True, white_list: Optional[List[neural_compressor.common.utils.OP_NAME_OR_MODULE_TYPE]] = DEFAULT_WHITE_LIST)

   Config class for Half-Quadratic Quantization (HQQ).


.. py:function:: get_default_hqq_config() -> HQQConfig

   Generate the default HQQ config.

   :returns: the default HQQ config.


.. py:function:: get_woq_tuning_config() -> list

   Generate the config set for weight-only quantization (WOQ) tuning.

   :returns: the list of WOQ quantization configs.
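

To make the ``bits``, ``group_size``, and ``use_sym`` fields of ``RTNConfig`` more concrete, the following is a minimal, library-independent sketch of group-wise round-to-nearest quantization in plain Python. It does not call ``neural_compressor``, and it is not the library's kernel: the helper names (``rtn_quantize_group``, ``rtn_quantize_row``, ``dequantize_group``) are hypothetical, and the scale and zero-point formulas are an assumption about the general technique. Only the default values ``bits=4``, ``group_size=32``, and ``use_sym=True`` come from the signature documented above.

```python
def rtn_quantize_group(weights, bits=4, use_sym=True):
    """Quantize one group of floats; return (codes, scale, zero_point)."""
    if use_sym:
        # Signed range, e.g. [-8, 7] for 4 bits; zero maps exactly to code 0.
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        max_abs = max(abs(w) for w in weights)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        zero_point = 0
    else:
        # Unsigned range, e.g. [0, 15] for 4 bits, shifted by a zero point.
        qmin, qmax = 0, 2 ** bits - 1
        lo, hi = min(weights), max(weights)
        scale = (hi - lo) / qmax if hi > lo else 1.0
        zero_point = round(-lo / scale)
    # Round to the nearest integer code, then clamp into the valid range.
    codes = [min(qmax, max(qmin, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def rtn_quantize_row(row, bits=4, group_size=32, use_sym=True):
    """Split one weight row into groups; each group gets its own scale."""
    return [
        rtn_quantize_group(row[start:start + group_size], bits, use_sym)
        for start in range(0, len(row), group_size)  # last group may be shorter
    ]


def dequantize_group(codes, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [(c - zero_point) * scale for c in codes]


if __name__ == "__main__":
    row = [i / 10 - 3.2 for i in range(64)]
    for codes, scale, zero_point in rtn_quantize_row(row):
        print(len(codes), scale, zero_point)
```

In this sketch a smaller ``group_size`` yields more scales per row, which tends to lower the rounding error at the cost of extra metadata. In the symmetric case the reconstruction error of each weight is bounded by half of its group's scale.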