neural_compressor.pruner.patterns¶
Pruning patterns.
Module Contents¶
Classes¶
- BasePattern: Pruning Pattern.
- PatternNxM: Pruning Pattern.
- PatternNInM: Pruning Pattern.
Functions¶
- register_pattern: Class decorator used to register a Pattern subclass to the registry.
- get_pattern: Get registered pattern class.
- neural_compressor.pruner.patterns.register_pattern(name)¶
Class decorator used to register a Pattern subclass to the registry.
A decorator applied to a Pattern subclass; it makes sure the decorated Pattern class is registered in PATTERNS.
- Parameters:
name – A string defining the pattern type name to be used in a pruning process.
- Returns:
The registered class itself.
- Return type:
cls
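A minimal sketch of how the decorator is typically applied; the pattern name "4x2_demo" and the subclass below are illustrative, not part of the library:

from neural_compressor.pruner.patterns import BasePattern, register_pattern

# Hypothetical example: the name "4x2_demo" and this subclass are illustrative only.
@register_pattern("4x2_demo")
class DemoPattern(BasePattern):
    """A custom pruning pattern registered into PATTERNS under "4x2_demo"."""

    def get_masks_global(self, scores, target_sparsity_ratio, pre_masks):
        # A real pattern would compute updated 0/1 masks here.
        return pre_masks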
- neural_compressor.pruner.patterns.get_pattern(config, modules)¶
Get registered pattern class.
Get a Pattern object from PATTERNS.
- Parameters:
config – A config dict object that contains the pattern information.
modules – Torch neural network modules to be pruned with the pattern.
- Returns:
A Pattern object.
- Raises:
AssertionError – Raised if the requested pattern has not been registered in PATTERNS; only registered patterns are currently supported.
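A minimal usage sketch, assuming config is a plain dict whose "pattern" key names a registered pattern; the real config object supplied by the pruner may be richer:

import torch

from neural_compressor.pruner.patterns import get_pattern

# Assumed config layout: the "pattern" key and the "4x1" value are illustrative.
config = {"pattern": "4x1"}
modules = {"fc1": torch.nn.Linear(128, 64)}  # layers to be pruned

pattern = get_pattern(config, modules)  # raises AssertionError if unregistered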
- class neural_compressor.pruner.patterns.BasePattern(config, modules)¶
Pruning Pattern.
It defines the basic pruning unit and how this unit will be pruned during pruning, e.g. 4x1, 2:4.
- Parameters:
config – A config dict object that contains the pattern information.
modules – Torch neural network modules to be pruned with the pattern.
- pattern¶
A config dict object that includes information of the pattern.
- is_global¶
A bool determining whether global pruning is used. In global pruning, the pruning criterion’s scores are evaluated across all layers together; in local pruning, they are evaluated within each layer individually.
- keep_mask_layers¶
A dict that includes the layers whose mask will not be updated.
- invalid_layers¶
The layers whose shapes don’t fit the pattern.
- modules¶
Torch neural network modules to be pruned with the pattern.
- config¶
A config dict object that contains all the information including the pattern’s.
- max_sparsity_ratio_per_op¶
A float representing the maximum sparsity that one layer could reach.
- min_sparsity_ratio_per_op¶
A float representing the minimum sparsity that one layer could reach.
- target_sparsity¶
A float representing the sparsity ratio of the modules after pruning.
- reduce_tensor(data, dim)¶
Reduce the data along the given dimension.
- Parameters:
data – The input data.
dim – The reduced axis.
- Returns:
The reduced tensor.
- get_masks(scores, target_sparsity_ratio, pre_masks)¶
Generate the weight masks according to the weight score and the current target sparsity ratio.
- Parameters:
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
target_sparsity_ratio – A float representing the sparsity of the modules after pruning.
pre_masks – A dict{“layer_name”: Tensor} that stores the masks generated at last pruning step.
- Returns:
- A dict of the same size as pre_masks, with its 0/1 values updated; 1 means unpruned and 0 means pruned.
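To make the 0/1 convention concrete, here is a small illustrative snippet (not library code) showing how such masks are typically applied to the weights:

import torch

# Masks share the weight's shape; multiplying zeroes out the pruned positions.
layer = torch.nn.Linear(4, 4)
masks = {"layer": torch.randint(0, 2, layer.weight.shape).float()}

with torch.no_grad():
    layer.weight *= masks["layer"]  # 0 prunes a weight, 1 keeps it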
- abstract get_masks_global(scores, target_sparsity_ratio, pre_masks)¶
Generate the weight masks for global pruning; please refer to get_masks for more information.
- get_masks_local(scores, target_sparsity_ratio, pre_masks)¶
Generate the weight masks for local pruning.
- Parameters:
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
target_sparsity_ratio – A float. After pruning, the sparsity of the modules will reach this value.
pre_masks – A dict{“layer_name”: Tensor}. The previous masks generated at the last pruning step.
- Returns:
- A dict of the same size as pre_masks, with its 0/1 values updated; 1 means unpruned and 0 means pruned.
- get_single_mask_per_target_ratio(score, exact_sparsity_ratio)¶
Generate a mask for one layer with the exact_sparsity_ratio.
- Parameters:
score – A Tensor representing the pruning scores of each weight element.
exact_sparsity_ratio – A float representing the layer’s final sparsity ratio.
- Returns:
A new mask Tensor with the same size as score.
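The idea behind this method can be sketched as follows (an illustrative re-implementation, not the library's code): prune the lowest-scoring fraction of elements so the layer reaches the requested sparsity.

import torch

def single_mask_per_target_ratio(score, exact_sparsity_ratio):
    """Illustrative: zero out the lowest-scoring fraction of elements."""
    num_prune = int(score.numel() * exact_sparsity_ratio)
    if num_prune == 0:
        return torch.ones_like(score)
    # Threshold at the k-th smallest score; scores above it survive.
    threshold = torch.kthvalue(score.flatten(), num_prune).values
    return (score > threshold).float()

mask = single_mask_per_target_ratio(torch.rand(8, 8), exact_sparsity_ratio=0.5)
print(mask.mean().item())  # roughly 0.5 of the elements remain unpruned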
- abstract get_block_size_dict(data)¶
Get pattern size for each module.
This is mainly for per-channel pruning, where each module can have a different pruning size.
- Parameters:
data – the input data.
- Returns:
To be implemented in subclasses.
- get_sparsity_ratio(pre_masks, return_dict=False)¶
Calculate the zero elements’ ratio in pre_masks.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
return_dict – A bool determining whether to return more information like zero_cnt and total_cnt.
- Returns:
A float representing the zero elements’ ratio in pre_masks.
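For reference, the quantity reported here can be reproduced with a short sketch (illustrative, not the library's implementation):

import torch

def sparsity_of(pre_masks):
    """Illustrative: fraction of zero entries across all mask tensors."""
    zero_cnt = sum(int((m == 0).sum()) for m in pre_masks.values())
    total_cnt = sum(m.numel() for m in pre_masks.values())
    return zero_cnt / total_cnt if total_cnt else 0.0

masks = {"fc": torch.tensor([[1.0, 0.0], [0.0, 1.0]])}
print(sparsity_of(masks))  # 0.5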
- get_pattern_lock_masks(modules)¶
Obtain masks from the original weight map according to the pattern and the weights’ zero positions.
- Parameters:
modules – a dict {‘layer_name’: Tensor} that stores weights.
- Returns:
A dict with the identical size as modules, containing pattern lock masks.
- check_layer_validity()¶
Check if a layer is valid for this block_size.
- abstract get_reduced_masks_from_data(data, key)¶
Obtain the unpruned weights and reshape according to the block_size.
- update_residual_cnt(masks, target_sparsity_ratio)¶
Update the number of parameters yet to be pruned.
- Parameters:
masks – the current pruning mask.
target_sparsity_ratio – A float representing the final sparsity of the modules.
- Returns:
An int representing the number of weights left to be pruned to reach the target sparsity ratio.
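The arithmetic behind the residual count is straightforward; a hedged sketch with illustrative layer names:

import torch

# How many more weights must be zeroed to reach the target sparsity ratio.
masks = {"fc1": torch.ones(64, 64), "fc2": torch.ones(32, 64)}
target_sparsity_ratio = 0.5

total_cnt = sum(m.numel() for m in masks.values())
zero_cnt = sum(int((m == 0).sum()) for m in masks.values())
residual_cnt = int(total_cnt * target_sparsity_ratio) - zero_cnt
print(residual_cnt)  # 3072 weights still to be pruned, since nothing is pruned yet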
- get_sparsity_ratio_each_layer(masks)¶
Calculate the sparsity ratio of each layer.
- Parameters:
masks – The current weight masks.
- Returns:
The sparsity information for each layer, including its sparsity_ratio, zero count and total count, together with a SparsityInfo object describing the sparsity of the whole model.
- Return type:
infos, SparsityInfo
- adjust_ratio(masks: dict, layer_name: str, key_new_sparsity: SparsityInfo, max_sparsity_ratio: float, min_sparsity_ratio: float, final_target_sparsity_ratio: float)¶
Adjust the sparsity of a layer based on threshold.
- Parameters:
masks – The weight masks.
layer_name – The layer to be examined.
key_new_sparsity – The proposed ratio for the layer.
max_sparsity_ratio – A float representing the maximum sparsity that one layer could reach.
min_sparsity_ratio – A float representing the minimum sparsity that one layer could reach.
final_target_sparsity_ratio – The final target sparsity ratio.
- Returns:
A bool indicating whether the ratio needs to be adjusted, together with adjust_sparsity_ratio, the adjusted sparsity ratio.
- class neural_compressor.pruner.patterns.PatternNxM(config, modules)¶
Bases:
BasePattern
Pruning Pattern.
A Pattern class derived from BasePattern. In this pattern, the weights in an NxM block are pruned or kept together during one pruning step.
- Parameters:
config – A config dict object that contains the pattern information.
- block_size¶
A list of two integers representing the height and width of the block.
- Please note that the vertical direction of a Linear layer's weight refers to the output channel, because PyTorch’s tensor matmul has a hidden transpose operation.
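A compact sketch of the NxM idea (illustrative, not the library's implementation): each NxM block receives one score, whole blocks are pruned, and the block-level mask is expanded back to the weight's shape.

import torch

def nxm_block_mask(weight, block, sparsity):
    """Illustrative NxM block pruning: score every NxM block, prune whole blocks."""
    N, M = block
    rows, cols = weight.shape  # assumes rows % N == 0 and cols % M == 0
    # Reshape (rows, cols) -> (rows/N, N, cols/M, M) so each block can be reduced.
    blocks = weight.abs().reshape(rows // N, N, cols // M, M)
    block_scores = blocks.sum(dim=(1, 3))  # one score per block
    k = int(block_scores.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight)
    threshold = torch.kthvalue(block_scores.flatten(), k).values
    block_mask = (block_scores > threshold).float()
    # Expand the per-block mask back to the full weight shape.
    return block_mask.repeat_interleave(N, dim=0).repeat_interleave(M, dim=1)

mask = nxm_block_mask(torch.randn(8, 8), block=(4, 1), sparsity=0.5)  # 4x1 blocks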
- get_block_size_dict()¶
Calculate the block size for each layer.
- Parameters:
data – Dict{“layer_name”: Tensor} that stores weights or scores.
- Returns:
Dict{“layer_name”: [block_size_1, block_size_2]} containing the block shape of each layer. In channel-wise pruning, different layers can have different pruning patterns.
- Return type:
A dict.
- check_layer_validity()¶
Check if a layer is valid for this block_size.
- get_reduced_masks_from_data(data, key)¶
Obtain the unpruned weights and reshape according to the block_size.
- Parameters:
data – Input.
key – The layer name.
- Returns:
The unpruned weights.
- get_sparsity_ratio(pre_masks, return_dict=False)¶
Calculate the zero elements’ ratio in pre_masks. Note that the zero count and total count are block-wise, in order to support channel-wise pruning.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} representing the masks generated after the last pruning step.
return_dict – A bool determining whether to return more information like zero_cnt and total_cnt.
- Returns:
A float representing the zero elements’ ratio in pre_masks.
- get_sparsity_ratio_progressive(pre_masks, return_dict=False)¶
Calculate the sparsity ratio of each layer.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
return_dict – A bool determining whether to return more information like zero_cnt and total_cnt.
- Returns:
A float representing the zero elements’ ratio in pre_masks.
- reshape_orig_to_pattern(data, key)¶
Reshape the data from (s1, s2) to [s1/N, N, s2/M, M].
- Parameters:
data – The input.
key – The layer name.
- Returns:
Reshaped input tensor.
- reshape_reduced_to_orig(data, key, orig_shape)¶
Reshape the data from [s1/N, s2/M] back to [s1, s2], also permuting dims for conv layers.
- Parameters:
data – Input.
key – The layer name.
orig_shape – The original shape of the layer.
- Returns:
Data of its original shape.
- reduce_scores(scores)¶
Recalculate the pruning scores after reducing the data.
- Parameters:
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
- Returns:
The reduced pruning scores.
- get_mask_per_threshold(score, threshold, block_size)¶
Get the mask per threshold.
- get_masks_global(scores, cur_target_sparsity_ratio, pre_masks, keep_exact_sparsity_ratio=True)¶
Generate masks for layers.
Gather all layers’ scores together and calculate a common threshold. This threshold is applied to all layers.
- Parameters:
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
cur_target_sparsity_ratio – A float representing the model’s sparsity after pruning.
pre_masks – A dict{“layer_name”: Tensor} that stores the masks generated at the last pruning step.
max_sparsity_ratio_per_op – A float representing the maximum sparsity that one layer can reach.
keep_exact_sparsity_ratio – A bool determining whether the exact target sparsity ratio should be reached.
- Returns:
- A dict of the same size as pre_masks, with its 0/1 values updated; 1 means unpruned and 0 means pruned.
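The global step can be sketched as follows (illustrative only): all layers' scores are pooled, one threshold is derived, and each layer is masked against that shared threshold.

import torch

scores = {"fc1": torch.rand(16, 16), "fc2": torch.rand(8, 16)}
cur_target_sparsity_ratio = 0.5

all_scores = torch.cat([s.flatten() for s in scores.values()])
k = int(all_scores.numel() * cur_target_sparsity_ratio)
threshold = torch.kthvalue(all_scores, k).values  # one shared threshold

masks = {name: (s > threshold).float() for name, s in scores.items()}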
- get_pattern_lock_masks(modules)¶
Obtain masks from original weight map by masking the zero-valued weights.
- Parameters:
modules – A dict{“layer_name”: Tensor} that stores weights.
- Returns:
A dict with the identical size as modules, containing pattern lock masks.
- count_new_masked_cnts(new_added_masks)¶
Count the number of elements to be masked.
- Parameters:
new_added_masks – A dict {“layer_name”: Tensor} that stores the added masks.
- Returns:
The number of masked weights.
- update_new_added_masks(pre_masks, cur_masks)¶
Obtain the new set-to-zero masks during a pruning procedure.
Pre_masks and cur_masks should have identical keys because they represent the same model.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
- Returns:
A dict{“layer_name”: Tensor} that stores the added masks.
- Return type:
A dict.
- update_progressive_masks(pre_masks, cur_masks, scores, progressive_step, progressive_configs)¶
Generate the progressive masks.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
progressive_step – An integer representing the current step of progressive pruning.
progressive_configs – A dict that stores configurations of progressive pruning.
- Returns:
A dict{“layer_name”: Tensor} that stores the masks generated in progressive pruning.
- Return type:
A dict.
- update_progressive_masks_linear(pre_masks, cur_masks, progressive_step, progressive_configs)¶
Generate the progressive masks along the block’s larger dimension.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
progressive_step – An integer representing the current step of progressive pruning.
progressive_configs – A dict that stores configurations of progressive pruning.
- Returns:
A dict{“layer_name”: Tensor} that stores the masks generated in progressive pruning.
- Return type:
A dict.
- update_progressive_masks_scores(pre_masks, cur_masks, scores, progressive_step, progressive_configs)¶
Generate the progressive masks based on scores.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
progressive_step – An integer representing the current step of progressive pruning.
progressive_configs – A dict that stores configurations of progressive pruning.
- Returns:
A dict{“layer_name”: Tensor} that stores the masks generated in progressive pruning.
- Return type:
A dict.
- update_progressive_masks_local(pre_masks, cur_masks, scores, progressive_step, progressive_configs)¶
Generate progressive masks in a local pruning domain.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
progressive_step – An integer representing the current step of progressive pruning.
progressive_configs – A dict that stores configurations of progressive pruning.
- Returns:
A dict{“layer_name”: Tensor} that stores the masks generated in progressive pruning.
- Return type:
A dict.
- update_progressive_masks_global(pre_masks, cur_masks, scores, progressive_step, progressive_configs)¶
Gather all layers’ scores to obtain a threshold that is applied to all layers.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
cur_masks – Dict{“layer_name”: Tensor} that stores the current masks.
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
progressive_step – An integer representing the current step of progressive pruning.
progressive_configs – A dict that stores configurations of progressive pruning.
- Returns:
A dict{“layer_name”: Tensor} that stores the masks generated in progressive pruning.
- Return type:
A dict.
- class neural_compressor.pruner.patterns.PatternNInM(config, modules)¶
Bases:
BasePattern
Pruning Pattern.
A Pattern class derived from BasePattern. In this pattern, N out of every M consecutive weights are pruned. For more information on this pattern, please refer to https://github.com/intel/neural-compressor/blob/master/docs/sparsity.md
- Parameters:
config – A config dict object that contains the pattern information.
- N¶
The number of elements to be pruned in a weight sequence.
- M¶
The size of the weight sequence.
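A small sketch of the N:M semantics (illustrative, not the library's code): within every M consecutive weights, the N lowest-scoring ones are pruned.

import torch

def n_in_m_mask(weight, n=2, m=4):
    """Illustrative N:M pruning: zero the N smallest-magnitude weights
    in every group of M consecutive weights along each row."""
    rows, cols = weight.shape  # assumes cols % m == 0
    groups = weight.abs().reshape(rows, cols // m, m)
    # Positions of the N smallest scores within each group of M.
    _, prune_idx = torch.topk(groups, n, dim=-1, largest=False)
    mask = torch.ones_like(groups)
    mask.scatter_(-1, prune_idx, 0.0)
    return mask.reshape(rows, cols)

mask = n_in_m_mask(torch.randn(4, 8), n=2, m=4)  # each group of 4 keeps exactly 2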
- check_layer_validity(datas: dict, block_size: tuple)¶
Check if a layer is valid for this block_size.
- Parameters:
datas – A dict object containing the weights for all layers.
block_size – A tuple representing the size of the pattern block.
- get_reduced_masks_from_data(data, key)¶
Obtain the unpruned weights and reshape according to the block_size.
- Parameters:
data – Input.
key – The layer name.
- Returns:
A tensor representing the unpruned weights.
- get_least_ninm_mask_from_data(score)¶
Generate a mask marking the least N scores within each group of M.
- Parameters:
score – the pruning scores of weights.
- Returns:
- A mask with the same size as score, whose 0/1 values mark the least N scores within each group of M; 1 means unpruned and 0 means pruned.
- get_sparsity_ratio(pre_masks, return_dict=False)¶
Calculate the zero elements’ ratio in pre_masks. Note that the zero count and total count are block-wise, to support channel-wise pruning, while the returned sparsity ratio is element-wise.
- Parameters:
pre_masks – Dict{“layer_name”: Tensor} that stores the masks generated after the last pruning step.
return_dict – A bool determining whether to return more information like zero_cnt and total_cnt.
- Returns:
A float representing the element-wise sparsity ratio.
- reshape_orig_to_pattern(data, key)¶
Reshape the data based on the pruning pattern.
- Parameters:
data – Input.
key – layer name.
- Returns:
Reshaped data.
- reshape_reduced_to_orig(data, key, orig_shape)¶
Reshape the reduced data to its original shape.
- Parameters:
data – Input.
key – The layer name.
orig_shape – The original shape of the layer.
- Returns:
Data of its original shape.
- reduce_scores(scores)¶
Calculate the pruning scores after reducing the data and obtain the least N scores in M.
- Parameters:
scores – Pruning scores of weights.
- Returns:
Updated pruning scores and the least N scores in M.
- get_ele_mask_per_threshold(score, threshold, block_size, least_ninm_mask)¶
Get the elementwise mask per threshold.
- Parameters:
score – A tensor that stores the pruning scores of weights.
threshold – A float used to determine whether to prune a weight.
block_size – A list of two integers representing the height and width of the block.
least_ninm_mask – A tensor marking the least N scores within each group of M.
- Returns:
The elementwise pruning mask.
- Return type:
mask
- get_masks_global(scores, cur_target_sparsity_ratio, pre_masks, keep_exact_sparsity_ratio=True)¶
Generate masks for layers.
Gather all layers’ scores together and calculate a common threshold. This threshold is applied to all layers.
- Parameters:
scores – A dict{“layer_name”: Tensor} that stores the pruning scores of weights.
cur_target_sparsity_ratio – A float representing the model’s sparsity after pruning.
pre_masks – A dict{“layer_name”: Tensor} representing the masks generated after the last pruning step.
max_sparsity_ratio_per_op – A float representing the maximum sparsity that one layer can reach.
- Returns:
- A dict of the same size as pre_masks, with its 0/1 values updated; 1 means unpruned and 0 means pruned.
- get_pattern_lock_masks(modules)¶
Obtain masks from the original weight map by masking where the weights are zero.
- Parameters:
modules – A dict{“layer_name”: Tensor} that stores weights.
- Returns:
A dict with the identical size as modules, containing pattern lock masks.