neural_compressor.compression.pruner.model_slim.auto_slim
Auto slim.
Functions
model_slim | Slim the sparse model automatically. |
model_slim_ffn2 | Permanently remove sparse parts of the model to obtain acceleration directly. |
model_slim_mha | Permanently remove sparse parts of the model to obtain acceleration directly. |
parse_auto_slim_config | Get model slim pruning configs. |
| Get pruning configs for consecutive linear layers. |
| Get pruning configs for multi-head attention layers. |
Module Contents
- neural_compressor.compression.pruner.model_slim.auto_slim.model_slim(model, dataloader=None, round_multiplier=32)
Slim the sparse model automatically.
- neural_compressor.compression.pruner.model_slim.auto_slim.model_slim_ffn2(model, dataloader=None, round_multiplier=32)
Permanently remove sparse parts of the model to obtain acceleration directly.
- Parameters:
model – a sparse model.
round_multiplier (int) – the number of channels after slimming should be a multiple of this value.
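The effect of round_multiplier can be illustrated with a small sketch. It is not the library's implementation: the helper name rounded_channel_count is hypothetical, and rounding up (so that no non-zero channel is dropped) is an assumption, since the description above only states that the resulting channel count should be a multiple of round_multiplier.

```python
import math


def rounded_channel_count(nonzero_channels, total_channels, round_multiplier=32):
    """Return a channel count to keep, aligned to round_multiplier.

    Assumption: the count is rounded up so no non-zero channel is dropped.
    It is capped at the layer's original width, which may itself not be
    a multiple of round_multiplier.
    """
    if round_multiplier <= 0:
        raise ValueError("round_multiplier must be a positive integer")
    rounded = math.ceil(nonzero_channels / round_multiplier) * round_multiplier
    return min(rounded, total_channels)


print(rounded_channel_count(70, 3072))  # 96
print(rounded_channel_count(64, 3072))  # 64
```

Channel counts that are multiples of 32 tend to map well onto vectorized kernels, which is one plausible reason for the default value.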
- neural_compressor.compression.pruner.model_slim.auto_slim.model_slim_mha(model, dataloader=None)
Permanently remove sparse parts of the model to obtain acceleration directly.
- Parameters:
model – a sparse model.
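To give a sense of what "sparse parts" can mean for multi-head attention, the sketch below finds heads whose projection weights are entirely zero, which are the heads that could be removed without changing the output. This is an illustration only, not the library's code: the helper fully_sparse_heads is hypothetical, and it assumes head-wise sparsity where consecutive row blocks of a projection weight belong to one head.

```python
def fully_sparse_heads(weight_rows, num_heads):
    """Return indices of heads whose rows in a projection weight are all zero.

    weight_rows: list of rows (lists of floats) of a query/key/value projection.
    Assumption: consecutive blocks of len(weight_rows) // num_heads rows
    form one attention head.
    """
    if len(weight_rows) % num_heads != 0:
        raise ValueError("row count must be divisible by num_heads")
    head_size = len(weight_rows) // num_heads
    sparse = []
    for head in range(num_heads):
        block = weight_rows[head * head_size:(head + 1) * head_size]
        if all(value == 0 for row in block for value in row):
            sparse.append(head)
    return sparse


weight = [
    [0.0, 0.0], [0.0, 0.0],  # head 0: all zero
    [0.3, 0.0], [0.0, 0.0],  # head 1: one non-zero weight
    [0.0, 0.0], [0.0, 0.0],  # head 2: all zero
]
print(fully_sparse_heads(weight, num_heads=3))  # [0, 2]
```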
- neural_compressor.compression.pruner.model_slim.auto_slim.parse_auto_slim_config(model, dataloader=None, ffn2_sparsity=0.0, mha_sparsity=0.0, **kwargs)
Get model slim pruning configs.
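A call to parse_auto_slim_config might be wrapped as in the sketch below. The wrapper build_auto_slim_configs is hypothetical, and the check that both sparsity values lie in [0, 1) is an assumption about sensible inputs rather than documented behavior. The import path and keyword names are taken from the signature above.

```python
def build_auto_slim_configs(model, ffn2_sparsity=0.0, mha_sparsity=0.0, **kwargs):
    """Validate the sparsity targets, then delegate to parse_auto_slim_config."""
    # Assumption: a sparsity target is a ratio in [0, 1).
    for name, value in (("ffn2_sparsity", ffn2_sparsity), ("mha_sparsity", mha_sparsity)):
        if not 0.0 <= value < 1.0:
            raise ValueError(f"{name} must be in [0, 1), got {value}")

    # Imported lazily so this helper can be defined without the library installed.
    from neural_compressor.compression.pruner.model_slim.auto_slim import (
        parse_auto_slim_config,
    )

    # Extra keyword arguments are passed through unchanged.
    return parse_auto_slim_config(
        model, ffn2_sparsity=ffn2_sparsity, mha_sparsity=mha_sparsity, **kwargs
    )
```

The wrapper returns whatever parse_auto_slim_config returns; the dataloader argument is left at its default of None here.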