neural_compressor.experimental.pytorch_pruner.patterns

pattern module.

Module Contents

Classes

Pattern

Pruning Pattern.

PatternNxM

Pruning Pattern.

PatternNInM

Pruning Pattern.

Functions

register_pattern(name)

Class decorator used to register a Pattern subclass to the registry.

get_pattern(config)

Get registered pattern class.

neural_compressor.experimental.pytorch_pruner.patterns.register_pattern(name)

Class decorator used to register a Pattern subclass to the registry.

Decorator function applied to a Pattern subclass. It makes sure that the Pattern class is registered in PATTERNS.

Parameters:
  • cls (class) – The class to be registered.

  • name – A string. Defines the pattern type to be included in a pruning process.

Returns:

The registered class.

Return type:

cls
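
Example: a minimal sketch of registering a custom pattern. The pattern name “Nx1_demo” and the placeholder method bodies are hypothetical and only illustrate how the decorator is used:

    from neural_compressor.experimental.pytorch_pruner.patterns import (
        Pattern,
        register_pattern,
    )

    # Hypothetical pattern type; the name and the logic below are illustrative only.
    @register_pattern('Nx1_demo')
    class PatternNx1Demo(Pattern):
        def get_block_size_dict(self, data):
            # Assume a fixed 4x1 block for every layer (placeholder logic).
            return {key: [4, 1] for key in data.keys()}

        def get_masks_global(self, scores, target_sparsity_ratio, pre_masks,
                             max_sparsity_ratio_per_layer):
            # Placeholder: keep the previous masks unchanged.
            return pre_masks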

neural_compressor.experimental.pytorch_pruner.patterns.get_pattern(config)

Get registered pattern class.

Get a Pattern object from PATTERNS.

Parameters:

config – A config dict object. Contains the pattern information.

Returns:

A Pattern object.

Raises:

AssertionError – Currently, only patterns that have been registered in PATTERNS are supported.
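
Example: a sketch of fetching a registered pattern from a config. The config keys and values shown are assumptions; real configs are built by the pruning pipeline and may require additional fields:

    from neural_compressor.experimental.pytorch_pruner.patterns import get_pattern

    # Hypothetical config dict; the "pattern" key and "4x1" value are assumed.
    # A pattern string that matches no registered pattern raises AssertionError.
    config = {"pattern": "4x1"}
    pattern = get_pattern(config)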

class neural_compressor.experimental.pytorch_pruner.patterns.Pattern(config)

Pruning Pattern.

Every Pruner object will contain a Pattern object. It defines the basic pruning unit and how this unit will be pruned during pruning.

Parameters:

config – A config dict object. Contains the pattern information.

pattern

A config dict object. The pattern related part in args config.

is_global

A bool. Whether the pruning uses the global option. Global pruning means that the scores of all pruning layers are gathered together to calculate the pruning criterion. Local pruning, in contrast, means that each layer calculates its criterion individually.

get_masks(scores, target_sparsity_ratio, pre_masks, max_sparsity_ratio_per_layer)

Called when new masks for pruning need to be calculated.

Parameters:
  • scores – A dict{“layer_name”: Tensor}. Store the pruning scores of weights.

  • target_sparsity_ratio – A float. After pruning, the model’s sparsity will reach this value.

  • pre_masks – A dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

  • max_sparsity_ratio_per_layer – A float. The maximum sparsity that one layer can reach.

Returns:

A dict with the same keys as pre_masks, with the 0/1 mask values updated.
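
Example: a usage sketch with toy tensors. The pattern instance (and the config used to build it) is an assumption, see get_pattern above; layer names and shapes are illustrative:

    import torch

    # `pattern` is assumed to be a concrete Pattern instance, e.g. obtained via
    # get_pattern(config) with an NxM or N-in-M pattern config.
    scores = {"linear1": torch.rand(8, 8), "linear2": torch.rand(8, 8)}
    pre_masks = {name: torch.ones_like(s) for name, s in scores.items()}

    new_masks = pattern.get_masks(
        scores,
        target_sparsity_ratio=0.5,
        pre_masks=pre_masks,
        max_sparsity_ratio_per_layer=0.98,
    )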

abstract get_masks_global(scores, target_sparsity_ratio, pre_masks, max_sparsity_ratio_per_layer)

To be implemented in subclasses.

get_mask_single(score, exact_sparsity_ratio)

Obtain a mask for one layer.

Parameters:
  • score – A Tensor. Store the pruning scores of one layer.

  • exact_sparsity_ratio – A float. After pruning, the layer’s sparsity will reach this value.

Returns:

A Tensor with the same shape as score; the new mask.
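
Example: a plain-PyTorch sketch of what a single-layer mask computation can look like (zero out the lowest-scoring elements); this is not the library’s implementation:

    import torch

    def mask_single_sketch(score, exact_sparsity_ratio):
        # 1 marks a retained weight, 0 marks a pruned one.
        k = int(exact_sparsity_ratio * score.numel())
        mask = torch.ones_like(score)
        if k > 0:
            threshold = torch.kthvalue(score.flatten(), k).values
            mask[score <= threshold] = 0.0
        return mask

    mask = mask_single_sketch(torch.rand(4, 4), exact_sparsity_ratio=0.5)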

abstract get_block_size_dict(data)

To be implemented in subclasses.

get_masks_local(scores, target_sparsity_ratio, pre_masks, max_sparsity_ratio_per_layer)

Obtain layers’ local masks.

Parameters:
  • scores – A dict{“layer_name”: Tensor}. Store the pruning scores of weights.

  • target_sparsity_ratio – A float. After pruning, the model’s sparsity will reach this value.

  • pre_masks – A dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

  • max_sparsity_ratio_per_layer – A float. The maximum sparsity that one layer can reach.

Returns:

A dict with the same keys as pre_masks, with the 0/1 mask values updated.

get_sparsity_ratio(pre_masks)

Calculate the zero elements’ ratio in pre_masks.

Parameters:

pre_masks – Dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

Returns:

A float. The zero elements’ ratio in pre_masks.
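
Example: an equivalent-in-spirit sketch of the reported quantity, i.e. the fraction of zero entries across all mask tensors:

    import torch

    def sparsity_ratio_sketch(pre_masks):
        zero = sum(float((m == 0).sum()) for m in pre_masks.values())
        total = sum(m.numel() for m in pre_masks.values())
        return zero / total

    pre_masks = {"linear1": torch.tensor([[1.0, 0.0], [0.0, 1.0]])}
    print(sparsity_ratio_sketch(pre_masks))  # 0.5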

get_pattern_lock_masks(modules)

Obtain masks from the original weight map by masking the positions where weights are zero.

Parameters:

modules – A dict{“layer_name”: Tensor}. Store weights.

Returns:

A dict with the same keys as modules, containing the pattern lock masks.
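
Example: an illustrative sketch of a pattern lock mask, treating each entry of modules as a weight tensor (as documented above) and locking already-zero weights to zero:

    import torch

    def pattern_lock_masks_sketch(modules):
        masks = {}
        for name, weight in modules.items():
            mask = torch.ones_like(weight)
            mask[weight == 0] = 0.0   # keep zeros pruned in later steps
            masks[name] = mask
        return masks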

class neural_compressor.experimental.pytorch_pruner.patterns.PatternNxM(config)

Bases: Pattern

Pruning Pattern.

A pattern class derived from Pattern. In this pattern, the weights within an NxM block are pruned or kept together during one pruning step.

Parameters:

config – A config dict object. Contains the pattern information.

block_size

A list of two integers: the height and width of the block. Be aware that the vertical direction of a Linear layer’s weight in PyTorch refers to the output channel, because PyTorch’s tensor matmul involves an implicit transpose.
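
Example: a pure-PyTorch illustration of the NxM idea on a toy 4x4 weight with 2x2 blocks (the block size and the 50% target are hypothetical); whole blocks are scored and either kept or zeroed together:

    import torch

    N, M = 2, 2
    weight = torch.rand(4, 4)

    # View the weight as a grid of NxM blocks and score each block by its L1 norm.
    blocks = weight.reshape(4 // N, N, 4 // M, M)
    block_scores = blocks.abs().sum(dim=(1, 3))

    # Keep the top half of the blocks, zero the rest (50% block sparsity).
    k = block_scores.numel() // 2
    threshold = torch.kthvalue(block_scores.flatten(), k).values
    block_mask = (block_scores > threshold).float()

    # Expand the block mask back to the weight's shape and apply it.
    mask = block_mask.repeat_interleave(N, dim=0).repeat_interleave(M, dim=1)
    pruned = weight * mask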

get_block_size_dict(data)

Calculate the pruning-pattern block shape for each layer.

Parameters:

data – Dict{“layer_name”: Tensor}. Store weights or scores.

Returns:

A dict: Dict{“layer_name”: [block_size_1, block_size_2]}, containing each layer’s corresponding pruning-pattern block shape. Be aware that in channel-wise pruning, different layers can have different pruning patterns.

get_sparsity_ratio(pre_masks)

Calculate the zero elements’ ratio in pre_masks.

Parameters:

pre_masks – Dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

Returns:

A float. The zero elements’ ratio in pre_masks.

get_masks_global(scores, target_sparsity_ratio, pre_masks, max_sparsity_ratio_per_layer, keep_pre_mask=False)

Generate masks for layers.

Gather all layers’ scores together and calculate a common threshold. This threshold is applied to all layers.

Parameters:
  • scores – A dict{“layer_name”: Tensor}. Store the pruning scores of weights.

  • target_sparsity_ratio – A float. After pruning, the model’s sparsity will reach this value.

  • pre_masks – A dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

  • max_sparsity_ratio_per_layer – A float. The maximum sparsity that one layer can reach.

  • keep_pre_mask – A bool. If True, keep the masks unchanged.

Returns:

A dict with the same keys as pre_masks, with the 0/1 mask values updated.
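
Example: a simplified sketch of global thresholding (element-wise, ignoring the NxM block structure and the per-layer sparsity cap, which the actual method also respects):

    import torch

    def global_threshold_sketch(scores, target_sparsity_ratio):
        # One threshold computed over all layers' scores, applied to every layer.
        all_scores = torch.cat([s.flatten() for s in scores.values()])
        k = max(int(target_sparsity_ratio * all_scores.numel()), 1)
        threshold = torch.kthvalue(all_scores, k).values
        return {name: (s > threshold).float() for name, s in scores.items()}

    scores = {"linear1": torch.rand(8, 8), "linear2": torch.rand(8, 8)}
    masks = global_threshold_sketch(scores, target_sparsity_ratio=0.5)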

get_pattern_lock_masks(modules)

Obtain masks from the original weight map by masking the positions where weights are zero.

Parameters:

modules – A dict{“layer_name”: Tensor}. Store weights.

Returns:

A dict with the same keys as modules, containing the pattern lock masks.

class neural_compressor.experimental.pytorch_pruner.patterns.PatternNInM(config)

Bases: Pattern

Pruning Pattern.

A pattern class derived from Pattern. In this pattern, N out of every M consecutive weights are pruned. For more information on this pattern, please refer to https://github.com/intel/neural-compressor/blob/master/docs/pruning.md

Parameters:

config – A config dict object. Contains the pattern information.

N

The number of elements to be pruned in a weight sequence.

M

The size of the weight sequence.
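
Example: a pure-PyTorch illustration of the N-in-M idea with a hypothetical 2:4 setting; in every group of 4 consecutive weights, the 2 smallest-magnitude ones are zeroed:

    import torch

    N, M = 2, 4
    weight = torch.rand(8, 16)            # the last dimension must be divisible by M

    groups = weight.abs().reshape(-1, M)  # one row per group of M consecutive weights
    _, prune_idx = torch.topk(groups, N, dim=1, largest=False)
    mask = torch.ones_like(groups)
    mask.scatter_(1, prune_idx, 0.0)      # zero the N smallest entries in each group
    mask = mask.reshape(weight.shape)
    pruned = weight * mask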

get_sparsity_ratio(pre_masks)

Calculate the zero elements’ ratio in pre_masks.

Parameters:

pre_masks – Dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

Returns:

A float. The zero elements’ ratio in pre_masks.

get_masks_global(scores, target_sparsity_ratio, pre_masks, max_sparsity_ratio_per_layer)

Generate masks for layers.

Gather all layers’ scores together and calculate a common threshold. This threshold is applied to all layers.

Parameters:
  • scores – A dict{“layer_name”: Tensor}. Store the pruning scores of weights.

  • target_sparsity_ratio – A float. After pruning, the model’s sparsity will reach this value.

  • pre_masks – A dict{“layer_name”: Tensor}. The masks generated after the last pruning step.

  • max_sparsity_ratio_per_layer – A float. The maximum sparsity that one layer can reach.

Returns:

A dict with the same keys as pre_masks, with the 0/1 mask values updated.

get_pattern_lock_masks(modules)

Obtain masks from the original weight map by masking the positions where weights are zero.

Parameters:

modules – A dict{“layer_name”: Tensor}. Store weights.

Returns:

A dict with the same keys as modules, containing the pattern lock masks.