neural_compressor.experimental

Intel® Neural Compressor: An open-source Python library supporting popular model compression techniques.

Package Contents

Classes

Component

This is the base class of Neural Compressor components.

Quantization

This class provides an easy-to-use API for quantization.

Pruning

This is the base class of the pruning object.

Benchmark

Benchmark class is used to evaluate the model performance with the objective settings.

Graph_Optimization

Graph_Optimization class.

MixedPrecision

Class used for generating a low precision model.

ModelConversion

ModelConversion class is used to convert one model format to another.

Distillation

Distillation class derived from Component class.

NAS

Create an object of a specified NAS approach.

Attributes

GraphOptimization

Alias of the Graph_Optimization class.

class neural_compressor.experimental.Component(conf_fname_or_obj=None, combination=None)

Bases: object

This is the base class of Neural Compressor components.

This class is inherited by the 'Quantization', 'Pruning' and 'Distillation' classes. The design mainly targets one-shot optimization for pruning/distillation/quantization-aware training. This class applies all hooks for 'Quantization', 'Pruning' and 'Distillation'.

property train_func

Getting train_func is not supported.

property eval_func

Getting eval_func is not supported.

property train_dataloader

Getter of the training dataloader.

property eval_dataloader

Getter of the evaluation dataloader.

property model

Getter of the model (a neural_compressor.model object).

prepare()

Register Quantization Aware Training hooks.

prepare_qat()

Register Quantization Aware Training hooks.

pre_process()

Initialize some attributes, such as the adaptor, the dataloader and train/eval functions from yaml config.

The Component base class provides a default function to initialize dataloaders and functions from the user config. Derived classes (Pruning, Quantization, etc.) are required to override it.

execute()

Execute the processing of this compressor.

The Component base class provides the compression processing. Derived classes (Pruning, Quantization, etc.) are required to override it.

post_process()

Post process after execution.

Derived classes (Pruning, Quantization, etc.) are required to override it.

on_train_begin(dataloader=None)

Called before the beginning of epochs.

on_train_end()

Called after the end of epochs.

pre_epoch_begin(dataloader=None)

Called before the beginning of epochs.

post_epoch_end()

Called after the end of epochs.

on_epoch_begin(epoch)

Called at the beginning of each epoch.

on_step_begin(batch_id)

Called at the beginning of each batch.

on_batch_begin(batch_id)

Called at the beginning of each batch.

on_after_compute_loss(input, student_output, student_loss, teacher_output=None)

Called at the end of loss computation.

on_before_optimizer_step()

Called before the optimizer step.

on_after_optimizer_step()

Called after the optimizer step.

on_before_eval()

Called before evaluation.

on_after_eval()

Called after evaluation.

on_post_grad()

Called before the optimizer step.

on_step_end()

Called at the end of each batch.

on_batch_end()

Called at the end of each batch.

on_epoch_end()

Called at the end of each epoch.

register_hook(scope, hook, input_args=None, input_kwargs=None)

Register a hook for the component.

input_args and input_kwargs are reserved for user-registered hooks.
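
These hooks are designed to be invoked around a user-supplied training loop. The following is a rough, hedged sketch (the function and all of its parameter names are illustrative placeholders, not part of this API) of the order in which a PyTorch-style loop would call them:

    # Hypothetical sketch: where Component hooks fire in a training loop.
    # 'component' is an instance of any Component subclass
    # (Quantization, Pruning, Distillation).
    def train_with_hooks(component, model, criterion, optimizer, loader, num_epochs):
        component.on_train_begin(loader)               # before training starts
        for epoch in range(num_epochs):
            component.on_epoch_begin(epoch)            # beginning of each epoch
            for batch_id, (inputs, labels) in enumerate(loader):
                component.on_step_begin(batch_id)      # beginning of each batch
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                # gives e.g. Distillation a chance to blend in the teacher loss
                loss = component.on_after_compute_loss(inputs, outputs, loss)
                optimizer.zero_grad()
                loss.backward()
                component.on_before_optimizer_step()
                optimizer.step()
                component.on_after_optimizer_step()
                component.on_step_end()                # end of each batch
            component.on_epoch_end()                   # end of each epoch
        component.on_train_end()                       # after training ends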

class neural_compressor.experimental.Quantization(conf_fname_or_obj=None)

Bases: neural_compressor.experimental.component.Component

This class provides an easy-to-use API for quantization.

It automatically searches for optimal quantization recipes for low precision model inference, achieving the best tuning objectives like inference performance within accuracy loss constraints. The tuner abstracts out the differences of quantization APIs across various DL frameworks and brings a unified API for automatic quantization that works on frameworks including TensorFlow, PyTorch and MXNet. Since DL use cases vary in accuracy metrics (Top-1, MAP, ROC, etc.), loss criteria (<1%, <0.1%, etc.) and tuning objectives (performance, memory footprint, etc.), the Tuner class provides a flexible configuration interface via YAML for users to specify these parameters.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or QuantConf class containing accuracy goal, tuning objective and preferred calibration & quantization tuning space etc.

property calib_dataloader

Get calib_dataloader attribute.

property metric

Get metric attribute.

property objective

Get objective attribute.

property postprocess

Get postprocess attribute.

property q_func

Get q_func attribute.

property model

Override model getter method to handle quantization aware training case.

pre_process()

Prepare dataloaders and q_func for the Component.

execute()

Execute the quantization routine based on the strategy design.

dataset(dataset_type, *args, **kwargs)

Get dataset according to dataset_type.
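
A minimal usage sketch, mirroring the calling convention shown in the ModelConversion example later on this page (the YAML path and model path are placeholders):

    from neural_compressor.experimental import Quantization, common
    quantizer = Quantization('./conf.yaml')   # accuracy goal, tuning space, etc.
    quantizer.model = common.Model('/path/to/model')
    q_model = quantizer()                     # search for the best quantization recipe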

class neural_compressor.experimental.Pruning(conf_fname_or_obj=None)

Bases: neural_compressor.experimental.component.Component

This is the base class of the pruning object.

Since DL use cases vary in accuracy metrics (Top-1, MAP, ROC, etc.), loss criteria (<1%, <0.1%, etc.) and pruning objectives (performance, memory footprint, etc.), the Pruning class provides a flexible configuration interface via YAML for users to specify these parameters.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or PruningConf class containing accuracy goal, pruning objective and related dataloaders etc.

conf

A config dict object. Contains pruning setting parameters.

pruners

A list of Pruner objects.

property pruning_func

Getting pruning_func is not supported.

property evaluation_distributed

Getter indicating whether a distributed evaluation dataloader is needed.

property train_distributed

Getter indicating whether a distributed training dataloader is needed.

update_items_for_all_pruners(**kwargs)

Add user-defined arguments to the original configuration.

The original pruning config is read from a file, but users can still modify it by passing key-value arguments to this function. Note that the keys of the key-value arguments must be recognizable in the current configuration.

prepare()

Prepare for generate_hooks and generate_pruners.

pre_process()

Called before pruning begins; usually sets up pruners.

execute()

Execute the pruning process.

Workflow: evaluate the dense model -> train/prune the model -> evaluate the sparse model.

generate_hooks()

Register hooks for pruning.

generate_pruners()

Generate pruners and set up self.pruners.

get_sparsity_ratio()

Calculate the sparsity of modules/layers.

Returns:

Three floats: elementwise_over_matmul_gemm_conv is the ratio of zero elements in the pruned layers; elementwise_over_all is the ratio of zero elements across all layers in the model; blockwise_over_matmul_gemm_conv is the ratio of all-zero blocks in the pruned layers.
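
A minimal usage sketch under the same calling convention as the other components on this page (the YAML path and model path are placeholders):

    from neural_compressor.experimental import Pruning, common
    pruner = Pruning('./conf.yaml')           # pruning settings
    pruner.model = common.Model('/path/to/model')
    pruned_model = pruner()                   # dense eval -> train/prune -> sparse eval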

class neural_compressor.experimental.Benchmark(conf_fname_or_obj=None)

Bases: object

Benchmark class is used to evaluate the model performance with the objective settings.

Users can use the data that they configured in YAML.

NOTICE: neural_compressor Benchmark will use the original command to launch sub-processes; this depends on the user's code and may run unnecessary code.

property results

Get the results of benchmarking.

property b_dataloader

Get the dataloader for the benchmarking.

property b_func

Getting b_func is not supported.

property model

Get the model.

property metric

Getting metric is not supported.

property postprocess

Getting postprocess is not supported.

summary_benchmark()

Get the summary of the benchmark.

config_instance()

Configure the multi-instance commands and trigger the benchmark with sub-processes.

generate_prefix(core_list)

Generate the command prefix with numactl.

Parameters:

core_list – a list of core indices bound to specific instances

run_instance(mode)

Run the instance with the configuration.

Parameters:

mode – 'performance' or 'accuracy'. 'performance' mode runs benchmarking with numactl on the specific cores and number of instances set by the user config and returns the model performance; 'accuracy' mode runs benchmarking with full cores and returns the model accuracy.
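
A minimal usage sketch, assuming the benchmark object is invoked with a mode as run_instance() above suggests (the YAML path and model path are placeholders):

    from neural_compressor.experimental import Benchmark, common
    evaluator = Benchmark('./conf.yaml')      # benchmark settings
    evaluator.model = common.Model('/path/to/model')
    evaluator('performance')                  # or 'accuracy'
    print(evaluator.results)                  # results of the benchmarking run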

class neural_compressor.experimental.Graph_Optimization(conf_fname_or_obj=None)

Graph_Optimization class.

It automatically searches for optimal quantization recipes for low precision model inference, achieving the best tuning objectives like inference performance within accuracy loss constraints. The tuner abstracts out the differences of quantization APIs across various DL frameworks and brings a unified API for automatic quantization that works on frameworks including TensorFlow, PyTorch and MXNet. Since DL use cases vary in accuracy metrics (Top-1, MAP, ROC, etc.), loss criteria (<1%, <0.1%, etc.) and tuning objectives (performance, memory footprint, etc.), the Tuner class provides a flexible configuration interface via YAML for users to specify these parameters.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or Graph_Optimization_Conf class containing accuracy goal, tuning objective and preferred calibration & quantization tuning space etc.

property precisions

Get precision.

property input

Get input.

property output

Get output.

property eval_dataloader

Get eval_dataloader.

property model

Get model.

property metric

Get metric.

property postprocess

Get postprocess.

property eval_func

Get evaluation function.

dataset(dataset_type, *args, **kwargs)

Get dataset.

set_config_by_model(model_obj)

Set model config.
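
A minimal usage sketch (the target precision and model path are placeholders):

    from neural_compressor.experimental import Graph_Optimization
    graph_optimizer = Graph_Optimization()
    graph_optimizer.precisions = 'bf16'       # target low precision
    graph_optimizer.model = '/path/to/model'
    optimized_model = graph_optimizer()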

class neural_compressor.experimental.MixedPrecision(conf_fname_or_obj=None)

Bases: neural_compressor.experimental.graph_optimization.GraphOptimization

Class used for generating low precision model.

The MixedPrecision class automatically generates a low precision model across various DL frameworks including TensorFlow, PyTorch and ONNX Runtime.

property precisions

Get private member variable precisions of MixedPrecision class.

set_config_by_model(model_obj)

Set the member variable conf from an input model object.
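
A minimal usage sketch (the target precision and model path are placeholders):

    from neural_compressor.experimental import MixedPrecision
    converter = MixedPrecision()
    converter.precisions = 'bf16'             # target mixed precision
    converter.model = '/path/to/model'
    converted_model = converter()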

class neural_compressor.experimental.ModelConversion(conf_fname_or_obj=None)

ModelConversion class is used to convert one model format to another.

Currently Neural Compressor only supports converting a quantization-aware training (QAT) TensorFlow model to a default quantized model.

The typical usage is:

    from neural_compressor.experimental import ModelConversion, common
    conversion = ModelConversion()
    conversion.source = 'QAT'
    conversion.destination = 'default'
    conversion.model = '/path/to/saved_model'
    q_model = conversion()

Parameters:

conf_fname_or_obj (string or obj) – Optional. The path to the YAML configuration file or a Conf class containing model conversion and evaluation settings if not specified in code.

property source

Return source.

property destination

Return destination.

property eval_dataloader

Return eval dataloader.

property model

Return model.

property metric

Return metric.

property postprocess

Return postprocess.

property eval_func

Return eval_func.

dataset(dataset_type, *args, **kwargs)

Return dataset.

Parameters:

dataset_type – dataset type

Returns:

dataset class

Return type:

class

class neural_compressor.experimental.Distillation(conf_fname_or_obj=None)

Bases: neural_compressor.experimental.component.Component

Distillation class derived from Component class.

The Distillation class abstracts the pipeline of knowledge distillation, transferring knowledge from the teacher model to the student model.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or Distillation_Conf containing accuracy goal, distillation objective and related dataloaders etc.

_epoch_ran

An integer indicating how many epochs have run.

eval_frequency

The frequency (in epochs) at which the student model is evaluated.

best_score

The best metric of the student model in the training.

best_model

The best student model found in the training.

property criterion

Getter of criterion.

Returns:

The criterion used in the distillation process.

property optimizer

Getter of optimizer.

Returns:

The optimizer used in the distillation process.

property teacher_model

Getter of the teacher model.

Returns:

The teacher model used in the distillation process.

property student_model

Getter of the student model.

Returns:

The student model used in the distillation process.

property train_cfg

Getter of the train configuration.

Returns:

The train configuration used in the distillation process.

property evaluation_distributed

Getter indicating whether a distributed evaluation dataloader is needed.

property train_distributed

Getter indicating whether a distributed training dataloader is needed.

on_post_forward(input, teacher_output=None)

Set or compute output of teacher model.

Deprecated.

init_train_cfg()

Initialize the training configuration.

create_criterion()

Create the criterion for training.

create_optimizer()

Create the optimizer for training.

prepare()

Prepare hooks.

pre_process()

Preprocessing before the distillation pipeline.

Initialize necessary parts for distillation pipeline.

execute()

Do distillation pipeline.

First train the student model with the teacher model; after training, evaluate the best student model, if any.

Returns:

Best distilled model found.

generate_hooks()

Register hooks for distillation.

Register necessary hooks for distillation pipeline.
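
A minimal usage sketch using the student_model and teacher_model properties documented above (the YAML path and model paths are placeholders):

    from neural_compressor.experimental import Distillation, common
    distiller = Distillation('./conf.yaml')   # distillation settings
    distiller.student_model = common.Model('/path/to/student_model')
    distiller.teacher_model = common.Model('/path/to/teacher_model')
    model = distiller()                       # train the student with the teacher,
                                              # return the best student found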

class neural_compressor.experimental.NAS

Bases: object

Create an object of a specified NAS approach.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or the object of NASConfig.

Returns:

An object of the specified NAS approach.
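
A hedged usage sketch based on the library's NAS examples; the approach and search algorithm values are placeholders, and the search() entry point is an assumption taken from those examples:

    from neural_compressor.conf.config import NASConfig
    from neural_compressor.experimental import NAS
    config = NASConfig(approach='dynas', search_algorithm='nsga2')
    agent = NAS(config)                       # factory returns the specified approach object
    best_models = agent.search()              # run the NAS search loop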