`neural_compressor.torch.algorithms.smooth_quant.smooth_quant`

Module Contents

Functions

smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)

Run the quantization process on the specified model.

neural_compressor.torch.algorithms.smooth_quant.smooth_quant.smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=True)

Run the quantization process on the specified model.

Parameters:

model – the float model to be quantized.
tune_cfg – the quantization configuration for the model's ops.
run_fn – a calibration function used to calibrate the model.
example_inputs – example inputs used to trace the torch model.
inplace – whether to apply the model transformations in place. Defaults to True.

Returns:

A quantized model.
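
The parameters above can be tied together in a short sketch. It is an illustration only, not the library's documented usage: `make_run_fn` and `quantize_with_smooth_quant` are hypothetical helper names, the sketch assumes the calibration function is invoked as `run_fn(model)`, and `tune_cfg` is passed through untouched because its structure is not described on this page.

```python
from typing import Any, Callable, Iterable


def make_run_fn(calib_batches: Iterable[Any]) -> Callable[[Any], None]:
    """Build a calibration function that feeds each batch through the model.

    Assumption: the library calls the calibration function as run_fn(model).
    """
    # Materialize the batches so the returned function can be called repeatedly.
    batches = list(calib_batches)

    def run_fn(model: Any) -> None:
        for batch in batches:
            model(batch)  # forward pass only; outputs are discarded

    return run_fn


def quantize_with_smooth_quant(model, tune_cfg, calib_batches, example_inputs, inplace=True):
    """Hypothetical wrapper: build run_fn, then delegate to smooth_quantize."""
    # Imported lazily so this file loads even without neural_compressor installed.
    from neural_compressor.torch.algorithms.smooth_quant.smooth_quant import (
        smooth_quantize,
    )

    run_fn = make_run_fn(calib_batches)
    # tune_cfg is forwarded unchanged; its schema is not covered by this page.
    return smooth_quantize(model, tune_cfg, run_fn, example_inputs, inplace=inplace)
```

If the original float model is still needed afterwards, passing `inplace=False` is the likely choice, since the default transforms the model in place.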
