`neural_compressor.compression.pruner.patterns`

Pruning patterns.

Module Contents

Classes

`BasePattern`	Pruning Pattern.
`PatternNxM`	Pruning Pattern.
`PatternNInM`	Pruning Pattern.

Functions

`register_pattern`(name)	Class decorator used to register a Pattern subclass to the registry.
`get_pattern`(config, modules)	Get registered pattern class.

neural_compressor.compression.pruner.patterns.register_pattern(name)

Class decorator used to register a Pattern subclass to the registry.

Decorator function applied to a Pattern subclass so that the class is registered in PATTERNS.

Parameters: name – A string defining the pattern type name to be used in a pruning process.
Returns: The registered class.
Return type: cls
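
The registration mechanism described above can be sketched as follows. This is a minimal, self-contained illustration of a registry decorator, not the library's source: the `PATTERNS` dict here is a local stand-in, and `DemoPattern` and the name "demo" are hypothetical.

```python
# Local stand-in for the module-level PATTERNS registry (assumption).
PATTERNS = {}


def register_pattern(name):
    """Return a class decorator that stores the class in PATTERNS under `name`."""

    def register(pattern_cls):
        PATTERNS[name] = pattern_cls
        return pattern_cls  # hand the class back unchanged so it stays usable

    return register


@register_pattern("demo")  # "demo" is an illustrative pattern type name
class DemoPattern:
    def __init__(self, config, modules):
        self.config = config
        self.modules = modules
```

After the class definition runs, `PATTERNS["demo"]` refers to `DemoPattern`, so a later lookup by name can construct it.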

neural_compressor.compression.pruner.patterns.get_pattern(config, modules)

Get registered pattern class.

Get a Pattern object from PATTERNS.

Parameters:

config – A config dict object that contains the pattern information.
modules – Torch neural network modules to be pruned with the pattern.

Returns:

A Pattern object.

Raises:

AssertionError – Raised when the requested pattern is not registered in PATTERNS; only registered patterns are supported.

class neural_compressor.compression.pruner.patterns.BasePattern(config, modules)

Pruning Pattern.

It defines the basic pruning unit and how this unit will be pruned during pruning, e.g. 4x1, 2:4.

Parameters:

config – A config dict object that contains the pattern information.
modules – Torch neural network modules to be pruned with the pattern.

pattern: A config dict object that includes information of the pattern.

is_global: A bool indicating whether global pruning is used. Global pruning means that the pruning scores produced by a pruning criterion are evaluated across all layers together. Local pruning, by contrast, means that the scores are evaluated in every layer individually.
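
The difference between the two modes can be illustrated with a small pure-Python sketch. The helper names and the threshold rule are assumptions chosen for clarity; they are not the library's implementation.

```python
def prune_threshold(scores, sparsity):
    """Return the score below which elements are pruned for a target sparsity."""
    ordered = sorted(scores)
    k = int(len(ordered) * sparsity)  # number of elements to prune
    return ordered[k] if k < len(ordered) else float("inf")


def build_masks(layer_scores, sparsity, is_global):
    """Build a keep-mask (True = keep) for every layer."""
    if is_global:
        # One threshold shared by all layers.
        pooled = [s for scores in layer_scores.values() for s in scores]
        shared = prune_threshold(pooled, sparsity)
        return {n: [s >= shared for s in sc] for n, sc in layer_scores.items()}
    # One threshold per layer.
    return {
        n: [s >= prune_threshold(sc, sparsity) for s in sc]
        for n, sc in layer_scores.items()
    }


layer_scores = {"fc1": [0.1, 0.2, 0.3, 0.4], "fc2": [0.5, 0.6, 0.7, 0.8]}
global_masks = build_masks(layer_scores, 0.5, is_global=True)
local_masks = build_masks(layer_scores, 0.5, is_global=False)
```

In this toy input every score in `fc1` is lower than every score in `fc2`, so global pruning at 50% removes all of `fc1` and none of `fc2`, whereas local pruning removes the lower half of each layer.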

keep_mask_layers: A dict that includes the layers whose mask will not be updated.

invalid_layers: The layers whose shapes don’t fit the pattern.

modules: Torch neural network modules to be pruned with the pattern.

config: A config dict object that contains all the pruning information, including the pattern’s.

max_sparsity_ratio_per_op: A float representing the maximum sparsity that one layer could reach.

min_sparsity_ratio_per_op: A float representing the minimum sparsity that one layer could reach.

target_sparsity: A float representing the sparsity ratio of the modules after pruning.
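
One plausible reading of the per-layer bounds is that a layer's sparsity is kept inside the interval between the minimum and the maximum. The sketch below shows that reading only; the clamping rule and both function names are assumptions, not the library's documented behavior.

```python
def mask_sparsity(mask):
    """Fraction of zero (pruned) entries in a flat 0/1 mask."""
    return mask.count(0) / len(mask)


def clamp_layer_sparsity(requested, min_ratio, max_ratio):
    """Keep a layer's sparsity inside [min_ratio, max_ratio] (assumed rule)."""
    return max(min_ratio, min(requested, max_ratio))
```

For example, a layer that global pruning would remove entirely (`requested=1.0`) would be limited to 0.98 under a hypothetical `max_ratio=0.98`.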

class neural_compressor.compression.pruner.patterns.PatternNxM(config, modules)

Pruning Pattern.

A Pattern class derived from BasePattern. In this pattern, the weights in an NxM block are pruned or kept together during one pruning step.

Parameters: config – A config dict object that contains the pattern information.

block_size: A list of two integers representing the height and width of the block.

Please note that the vertical direction of a Linear layer’s weight refers to the output channel, because PyTorch’s tensor matmul has a hidden transpose operation.
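
Block-wise pruning can be sketched in pure Python as shown below. The scoring rule (sum of absolute values per block), the function name `block_mask`, and the sample weights are assumptions for illustration; in the library the scores come from the configured pruning criterion.

```python
def block_mask(weight, block_size, sparsity):
    """Return a 0/1 mask that zeroes whole blocks with the lowest scores.

    weight: 2-D list laid out as rows (output channels) x columns (input channels).
    block_size: [height, width] of one block.
    """
    bh, bw = block_size
    rows, cols = len(weight), len(weight[0])
    assert rows % bh == 0 and cols % bw == 0, "weight shape must fit the block"

    # Score every block by the sum of absolute values (assumed criterion).
    scores = {}
    for r in range(0, rows, bh):
        for c in range(0, cols, bw):
            scores[(r, c)] = sum(
                abs(weight[i][j])
                for i in range(r, r + bh)
                for j in range(c, c + bw)
            )

    # Prune the lowest-scoring blocks until the target sparsity is reached.
    n_prune = int(len(scores) * sparsity)
    mask = [[1] * cols for _ in range(rows)]
    for r, c in sorted(scores, key=scores.get)[:n_prune]:
        for i in range(r, r + bh):
            for j in range(c, c + bw):
                mask[i][j] = 0
    return mask


weight = [
    [1.0, 9.0, 2.0, 8.0],
    [1.0, 7.0, 3.0, 6.0],
    [9.0, 1.0, 8.0, 1.0],
    [8.0, 2.0, 7.0, 2.0],
]
mask = block_mask(weight, block_size=[2, 1], sparsity=0.5)
```

Each 2x1 block spans two output channels of one input column, so the mask always zeroes or keeps both entries of a block together.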

class neural_compressor.compression.pruner.patterns.PatternNInM(config, modules)

Pruning Pattern.

A Pattern class derived from BasePattern. In this pattern, N out of every M consecutive weights will be pruned. For more information on this pattern, please refer to https://github.com/intel/neural-compressor/blob/master/docs/sparsity.md

Parameters: config – A config dict object that contains the pattern information.

N: The number of elements to be pruned in a weight sequence.

M: The size of the weight sequence.
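
The N-in-M rule can be sketched as follows: within every group of M consecutive weights, the N entries with the lowest scores are pruned. Using the absolute value as the score and the name `n_in_m_mask` are assumptions for illustration; the library takes its scores from the configured pruning criterion.

```python
def n_in_m_mask(row, n, m):
    """Return a flat 0/1 mask that prunes n of every m consecutive weights."""
    assert len(row) % m == 0, "row length must be a multiple of m"
    mask = []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Positions of the n smallest magnitudes inside this group.
        drop = set(sorted(range(m), key=lambda i: abs(group[i]))[:n])
        mask.extend(0 if i in drop else 1 for i in range(m))
    return mask


row = [0.5, -0.1, 0.3, -0.9, 0.2, 0.8, -0.7, 0.05]
mask = n_in_m_mask(row, n=2, m=4)
```

With `n=2` and `m=4`, exactly two entries are zeroed in each group of four, which gives a fixed 50% sparsity regardless of how the magnitudes are distributed.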
