neural_compressor.torch.algorithms.smooth_quant.smooth_quant

Module Contents

Functions

smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)

Execute the quantization process on the specified model.

neural_compressor.torch.algorithms.smooth_quant.smooth_quant.smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)

Execute the quantization process on the specified model.

Parameters:
  • model – the float model to be quantized.

  • tune_cfg – the quantization configuration for ops.

  • run_fn – a calibration function used to calibrate the model.

  • example_inputs – example inputs used to trace the torch model.

  • inplace – whether to carry out model transformations in place. Defaults to True.

Returns:

A quantized model.
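
Example:

The following is a minimal sketch of how the run_fn and example_inputs arguments might be prepared. It is not taken from the library. It assumes that run_fn is called with the model as its only argument. The names calibration_batches and CountingModel are hypothetical, and plain Python lists stand in for tensors so that the sketch runs without torch installed. The structure of tune_cfg is not shown here because this page does not describe it.

```python
# Hypothetical calibration data: plain lists stand in for real tensors.
calibration_batches = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]


def run_fn(model):
    """Calibration function: feed every calibration batch through the model.

    Assumption: the quantizer invokes this as run_fn(model).
    """
    for batch in calibration_batches:
        model(batch)


class CountingModel:
    """Stand-in for a float model; counts how many batches it has seen."""

    def __init__(self):
        self.calls = 0

    def __call__(self, batch):
        self.calls += 1
        return [2.0 * x for x in batch]


# Dry run of the calibration function on the stand-in model.
stand_in = CountingModel()
run_fn(stand_in)
print(stand_in.calls)  # 3, one call per calibration batch

# A single representative batch, used as example_inputs for tracing.
example_inputs = calibration_batches[0]

# With a real torch model and a tune_cfg built elsewhere, the call would
# follow the signature documented above (not executed in this sketch):
# from neural_compressor.torch.algorithms.smooth_quant.smooth_quant import smooth_quantize
# q_model = smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)
```

In practice the batches would be torch tensors drawn from a representative dataset, and example_inputs would match the shape and dtype that the model expects.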