neural_compressor.onnxrt.quantization.autotune

Module Contents

Functions

get_all_config_set(...)

autotune(→ Union[None, onnx.ModelProto])

The main entry of auto-tune.

neural_compressor.onnxrt.quantization.autotune.autotune(model_input: pathlib.Path | str, tune_config: neural_compressor.common.base_tuning.TuningConfig, eval_fn: Callable, eval_args: Tuple[Any] | None = None, calibration_data_reader: neural_compressor.onnxrt.quantization.calibrate.CalibrationDataReader = None) -> None | onnx.ModelProto

The main entry of auto-tune.

Parameters:
  • model_input (Union[Path, str]) – path to the ONNX model.

  • tune_config (TuningConfig) – tuning config. A TuningConfig is created from algorithm configs; the parameters that support tuning are listed in each config's params_list. Two usages are supported: expand a parameter into a list of candidate values, e.g. TuningConfig(config_set=[RTNConfig(weight_bits=[4, 8])]), or pass a list of configs, e.g. TuningConfig(config_set=[RTNConfig(), GPTQConfig()]).

  • eval_fn (Callable) – evaluation function. During evaluation, autotune passes only the model path as the function's input.

  • eval_args (Optional[Tuple[Any]]) – evaluation arguments, passed as positional arguments to eval_fn.

  • calibration_data_reader (CalibrationDataReader) – dataloader for calibration.
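
A minimal sketch of driving autotune with the parameters above. It assumes neural-compressor with its ONNX Runtime extras is installed and that a model.onnx file exists; the import path for RTNConfig and the placeholder metric in eval_fn are assumptions for illustration, not part of this module's documented API.

```python
from pathlib import Path


def eval_fn(model_path: str) -> float:
    # autotune passes only the candidate model's path; return a scalar
    # metric (higher is better) measured on your evaluation set.
    # Placeholder value for illustration:
    return 0.75


if Path("model.onnx").exists():  # guard so the sketch is import-safe
    # Import paths below follow this page's module names; RTNConfig's
    # location is an assumption.
    from neural_compressor.common.base_tuning import TuningConfig
    from neural_compressor.onnxrt import RTNConfig
    from neural_compressor.onnxrt.quantization.autotune import autotune

    # Expand weight_bits into candidates [4, 8]; autotune evaluates each
    # quantized model with eval_fn and returns the best one, or None if
    # no candidate meets the tuning criterion.
    tune_config = TuningConfig(config_set=[RTNConfig(weight_bits=[4, 8])])
    best_model = autotune(
        model_input="model.onnx",
        tune_config=tune_config,
        eval_fn=eval_fn,
    )
```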