neural_compressor

Intel® Neural Compressor: An open-source Python library supporting popular model compression techniques.

Subpackages

Submodules

Package Contents

Classes

Benchmark

The Benchmark class can be used to evaluate model performance.

DistillationConfig

Config of distillation.

PostTrainingQuantConfig

Config Class for Post Training Quantization.

WeightPruningConfig

Config class for weight pruning, similar to a torch optimizer's interface.

QuantizationAwareTrainingConfig

Config Class for Quantization Aware Training.

Functions

set_random_seed(seed)

Set the random seed in config.

set_tensorboard(tensorboard)

Enable or disable TensorBoard in the config.

set_workspace(workspace)

Set the workspace in config.

class neural_compressor.Benchmark(conf_fname_or_obj)

Bases: object

The Benchmark class can be used to evaluate model performance.

Based on the configured objective, users can obtain measurement data for whatever they specified in the YAML file.

Parameters:

conf_fname_or_obj (string or obj) – The path to the YAML configuration file or a Benchmark_Conf object containing the accuracy goal, tuning objective, preferred calibration & quantization tuning space, etc.

dataloader(dataset, batch_size=1, collate_fn=None, last_batch='rollover', sampler=None, batch_sampler=None, num_workers=0, pin_memory=False, shuffle=False, distributed=False)

Set dataloader for benchmarking.

metric(name, metric_cls, **kwargs)

Set the metric class; Neural Compressor will initialize this class during evaluation.

Neural Compressor has many built-in metrics, but users can register specific metrics through this API. The metric class should take the outputs of the model, or of the postprocess step (if one is set), as inputs. Neural Compressor built-in metrics always take (predictions, labels) as inputs for update, and user_metric.metric_cls should be a subclass of neural_compressor.metric.BaseMetric or a user-defined metric object.

Parameters:
  • metric_cls (cls) – Should be a subclass of neural_compressor.metric.BaseMetric, which takes (predictions, labels) as inputs.

  • name (str, optional) – Name for metric. Defaults to ‘user_metric’.

postprocess(name, postprocess_cls, **kwargs)

Set the postprocess class; Neural Compressor will initialize this class during evaluation.

The postprocess function should take the outputs of the model as inputs and produce (predictions, labels), which are then used as inputs for metric updates.

Parameters:
  • name (str, optional) – Name for postprocess.

  • postprocess_cls (cls) – Should be a subclass of neural_compressor.data.transforms.postprocess.
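
A minimal sketch of configuring a benchmark run with the methods documented above. The YAML path and the dataset are placeholders supplied for illustration, and the call that actually launches the measurement is omitted here since it is not documented in this section.

    import numpy as np
    from neural_compressor import Benchmark

    # 'benchmark.yaml' is a placeholder for a user-written configuration file;
    # per the parameters above, a Benchmark_Conf object may be passed instead.
    benchmarker = Benchmark('benchmark.yaml')

    # Any object with __len__ and __getitem__ can back the dataloader; a toy
    # in-memory list of (input, label) pairs is used purely for illustration.
    eval_dataset = [(np.random.rand(3, 224, 224).astype('float32'), 0) for _ in range(8)]
    benchmarker.dataloader(eval_dataset, batch_size=4, num_workers=0)

    # A custom metric (a subclass of neural_compressor.metric.BaseMetric) could
    # additionally be registered via benchmarker.metric(name, metric_cls).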

neural_compressor.set_random_seed(seed: int)

Set the random seed in config.

neural_compressor.set_tensorboard(tensorboard: bool)

Enable or disable TensorBoard in the config.

neural_compressor.set_workspace(workspace: str)

Set the workspace in config.
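
A short usage sketch of the three module-level setters documented above; the values are illustrative.

    from neural_compressor import set_random_seed, set_tensorboard, set_workspace

    set_random_seed(9527)             # set the random seed in the global config
    set_workspace('./nc_workspace')   # path used as the Neural Compressor workspace
    set_tensorboard(True)             # enable TensorBoard in the config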

class neural_compressor.DistillationConfig(teacher_model=None, criterion=criterion, optimizer={'SGD': {'learning_rate': 0.0001}})

Config of distillation.

Parameters:
  • teacher_model (Callable) – Teacher model for distillation. Defaults to None.

  • features (optional) – Teacher features for distillation; features and teacher_model are alternatives, and only one of them needs to be provided. Defaults to None.

  • criterion (Callable, optional) – Distillation loss configuration.

  • optimizer (dictionary, optional) – Optimizer configuration.

property criterion

Get criterion.

property optimizer

Get optimizer.

property teacher_model

Get teacher_model.
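
A minimal construction sketch using only the parameters shown in the signature above; teacher_model_instance is a placeholder for a user-provided, already-trained model, and the distillation criterion keeps its documented default.

    from neural_compressor import DistillationConfig

    # `teacher_model_instance` is a placeholder, not defined in this reference.
    distillation_conf = DistillationConfig(
        teacher_model=teacher_model_instance,
        optimizer={'SGD': {'learning_rate': 0.0001}},
    )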

class neural_compressor.PostTrainingQuantConfig(device='cpu', backend='default', quant_format='default', inputs=[], outputs=[], approach='static', calibration_sampling_size=[100], op_type_list=None, op_name_list=None, reduce_range=None, excluded_precisions=[], quant_level=1, tuning_criterion=tuning_criterion, accuracy_criterion=accuracy_criterion)

Bases: _BaseQuantizationConfig

Config Class for Post Training Quantization.

property approach

Get approach.

property tuning_criterion

Get tuning_criterion.
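
A hedged construction sketch for static post-training quantization, using only parameters that appear in the constructor signature above with illustrative values.

    from neural_compressor import PostTrainingQuantConfig

    # calibration_sampling_size controls how many calibration samples are used;
    # the remaining values mirror the documented defaults.
    ptq_conf = PostTrainingQuantConfig(
        device='cpu',
        approach='static',
        calibration_sampling_size=[100],
        quant_level=1,
    )

The resulting config object is typically handed to the library's quantization entry point (for example, neural_compressor.quantization.fit in recent releases), which is outside the scope of this section.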

class neural_compressor.WeightPruningConfig(pruning_configs=[{}], target_sparsity=0.9, pruning_type='snip_momentum', pattern='4x1', op_names=[], excluded_op_names=[], start_step=0, end_step=0, pruning_scope='global', pruning_frequency=1, min_sparsity_ratio_per_op=0.0, max_sparsity_ratio_per_op=0.98, sparsity_decay_type='exp', pruning_op_types=['Conv', 'Linear'], **kwargs)

Config class for weight pruning, similar to a torch optimizer's interface.

property weight_compression

Get weight_compression.
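
A construction sketch using only keyword arguments from the signature above; the step range and frequency are illustrative values for a training-time pruning schedule.

    from neural_compressor import WeightPruningConfig

    # Prune Conv and Linear layers to 90% sparsity with a 4x1 pattern,
    # updating masks every 100 steps between steps 0 and 1000 (illustrative).
    pruning_conf = WeightPruningConfig(
        target_sparsity=0.9,
        pruning_type='snip_momentum',
        pattern='4x1',
        start_step=0,
        end_step=1000,
        pruning_frequency=100,
        pruning_op_types=['Conv', 'Linear'],
    )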

class neural_compressor.QuantizationAwareTrainingConfig(device='cpu', backend='default', inputs=[], outputs=[], op_type_list=None, op_name_list=None, reduce_range=None, excluded_precisions=[], quant_level=1)

Bases: _BaseQuantizationConfig

Config Class for Quantization Aware Training.

property approach

Get approach.
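
A minimal construction sketch using only parameters listed in the signature above; excluded_precisions is shown with an illustrative value ('bf16') rather than its empty default.

    from neural_compressor import QuantizationAwareTrainingConfig

    # Values other than excluded_precisions mirror the documented defaults.
    qat_conf = QuantizationAwareTrainingConfig(
        device='cpu',
        backend='default',
        excluded_precisions=['bf16'],
        quant_level=1,
    )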