neural_compressor.adaptor.torch_utils.layer_wise_quant.quantize
Layer wise quantization.
Module Contents
Classes
LayerWiseQuant: Layer wise quantization.
- class neural_compressor.adaptor.torch_utils.layer_wise_quant.quantize.LayerWiseQuant(q_model, pretrained_model_name_or_path, op_cfgs, calib_data, smooth_quant=False, output_dir=None, device='cpu', alpha=0.5)
Layer wise quantization.
Quantize the model layer by layer in order to save memory.
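The idea behind layer-wise quantization can be sketched with a toy example: only one layer's full-precision weights are processed at a time, so peak memory stays bounded by a single layer rather than the whole model. The helper names and the symmetric per-tensor int8 scheme below are illustrative assumptions, not the neural_compressor API.

```python
def quantize_layer(weights, num_bits=8):
    """Symmetric per-tensor quantization of one layer's float weights.

    Illustrative helper, not part of neural_compressor.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Clamp to the representable int8 range after rounding.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def layer_wise_quantize(model):
    """Quantize a model (dict of layer name -> weight list) one layer at a time."""
    quantized = {}
    for name, weights in model.items():         # only this layer is in flight
        q, scale = quantize_layer(weights)
        quantized[name] = (q, scale)
    return quantized


# Toy two-layer "model"; real usage would stream layers from disk.
model = {"fc1": [0.5, -1.27, 0.0], "fc2": [2.54, -2.54]}
qmodel = layer_wise_quantize(model)
```

In the real class, the layer-by-layer loading from `pretrained_model_name_or_path` is what keeps only the current layer resident; this sketch only mirrors the per-layer quantization step.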