neural_compressor.adaptor.torch_utils.layer_wise_quant.quantize

Layer wise quantization.

Module Contents

Classes

LayerWiseQuant

Layer wise quantization.

class neural_compressor.adaptor.torch_utils.layer_wise_quant.quantize.LayerWiseQuant(q_model, pretrained_model_name_or_path, op_cfgs, calib_data, smooth_quant=False, output_dir=None, device='cpu', alpha=0.5)[source]

Layer wise quantization.

Quantize the model layer by layer in order to save memory.
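
Example (a minimal usage sketch based only on the constructor signature above; the model, op_cfgs, and calibration dataloader placeholders are illustrative assumptions, and the quantize() call is assumed to be the entry point and should be verified against the installed version):

from neural_compressor.adaptor.torch_utils.layer_wise_quant.quantize import LayerWiseQuant

# Illustrative inputs: a PyTorch model matching a checkpoint on disk, per-op
# quantization configs produced elsewhere in Neural Compressor, and a
# calibration dataloader. These placeholders are not defined by this module.
fp32_model = ...                              # torch.nn.Module to be quantized
checkpoint = "/path/to/pretrained/checkpoint" # hypothetical checkpoint path
op_cfgs = ...                                 # per-op quantization configuration
calib_dataloader = ...                        # calibration data for quantization

lw_quant = LayerWiseQuant(
    q_model=fp32_model,
    pretrained_model_name_or_path=checkpoint,
    op_cfgs=op_cfgs,
    calib_data=calib_dataloader,
    smooth_quant=False,   # set True to apply SmoothQuant with the given alpha
    output_dir=None,      # directory for per-layer weights; None uses a default
    device='cpu',
    alpha=0.5,
)

# Assumed entry point: layers are loaded, quantized, and released one at a time
# so the full model never has to reside in memory at once.
q_model = lw_quant.quantize()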