neural_compressor.compression.pruner.model_slim.pattern_analyzer

Pattern analyzer: searcher classes and helper functions for locating prunable patterns in a model.

Module Contents

Classes

RecipeSearcher

Searcher class which searches patterns with a pre-defined recipe.

JitBasicSearcher

Static graph searcher class which searches patterns using the PyTorch static graph and its input/output information.

Linear2LinearSearcher

Static graph searcher for consecutive linear layers.

SelfMHASearcher

Static graph searcher for multi-head attention modules.

ClassifierHeadSearcher

Static graph searcher for the final classifier head.

ClassifierHeadSearcherTF

Static graph searcher for the final classifier head of a Keras model.

Functions

get_attributes(module, attrs)

Get a multi-level descendant attribute of a module via a dot-separated path.

get_common_module(layer1, layer2)

Get the module which contains both layer1 and layer2 (their nearest common ancestor).

print_iterables(data_iters)

Print the auto slim logs.

neural_compressor.compression.pruner.model_slim.pattern_analyzer.get_attributes(module, attrs: str)[source]

Get a multi-level descendant attribute of a module via a dot-separated path.

Parameters:
  • module (torch.nn.Module) – The torch module.

  • attrs (str) – The dot-separated path of attributes to traverse.

Returns:

The target attribute of the module.

Return type:

attr
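
The traversal this function performs can be illustrated with a short sketch (the helper name below is hypothetical; only get_attributes itself belongs to this module):

    import torch

    def get_attributes_sketch(module: torch.nn.Module, attrs: str):
        # Walk a dot-separated attribute path, e.g. "encoder.layer.0.attention.self.query".
        attr = module
        for name in attrs.split("."):
            attr = getattr(attr, name)  # nn.ModuleList resolves string indices "0", "1", ...
        return attr

For a Huggingface BERT model, get_attributes(model, "encoder.layer.0.attention.self.query") would return the query linear module of the first encoder layer.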

neural_compressor.compression.pruner.model_slim.pattern_analyzer.get_common_module(layer1: str, layer2: str)[source]

Get the module which contains both layer1 and layer2 (their nearest common ancestor).
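
As an illustration, the nearest common ancestor of two layers can be derived from the longest shared prefix of their dot-separated names; a minimal sketch, not the library's implementation:

    def common_ancestor_sketch(layer1: str, layer2: str) -> str:
        # Keep path components until the two calling paths diverge.
        common = []
        for a, b in zip(layer1.split("."), layer2.split(".")):
            if a != b:
                break
            common.append(a)
        return ".".join(common)

    # Both BERT attention linears live in the same attention block:
    # common_ancestor_sketch("encoder.layer.0.attention.self.query",
    #                        "encoder.layer.0.attention.output.dense")
    # returns "encoder.layer.0.attention"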

neural_compressor.compression.pruner.model_slim.pattern_analyzer.print_iterables(data_iters)[source]

Print the auto slim logs.

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.RecipeSearcher(model, recipe: dict)[source]

Searcher class which searches patterns with a pre-defined recipe.

A recipe is a dict which maps a root module’s name to the level-wise calling paths of its sub-modules. For example, to obtain the linear ops (query, key, value and output) of the self-attention module in a Huggingface BERT model, the recipe should look like:

    recipe_samples = {
        'BertAttention': ["self.query", "self.key", "self.value", "output.dense"]
    }

Parameters:
  • model (torch.nn.Module) – The PyTorch model for searching.

  • recipe (dict) – A dict containing information about the searching pattern.

model[source]

The PyTorch model for searching.

recipe[source]

A dict containing information about the searching pattern.

targets[source]

The name of the basic module which contains the searching pattern.

searching_results[source]

The list/dict which stores the matched patterns.
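
A usage sketch, assuming a Huggingface BERT model; the search() entry point is an assumption, since this reference lists only attributes:

    from transformers import BertModel
    from neural_compressor.compression.pruner.model_slim.pattern_analyzer import RecipeSearcher

    model = BertModel.from_pretrained("bert-base-uncased")
    recipe = {
        "BertAttention": ["self.query", "self.key", "self.value", "output.dense"]
    }
    searcher = RecipeSearcher(model, recipe)
    results = searcher.search()  # assumed entry point returning the matched patterns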

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.JitBasicSearcher(model, dataloader=None, placeholder_shape=None, placeholder_dtype=None)[source]

Static graph searcher class which searches patterns using the PyTorch static graph and its input/output information.

By converting a PyTorch model into a static version using torch.jit.trace()/script(), we can trace special patterns in the model and optimize them automatically. This class provides some basic functions for jit searchers, including generating dummy inputs, generating the static graph, and analyzing the static graph.

Parameters:

model (torch.nn.Module) – The PyTorch model for searching.

model[source]

The PyTorch model for searching.

device[source]

The model’s current device type.

static_graph[source]

The static graph of the original model.

flatten_static_graph[source]

A list of strings with the model’s static graph inference details.

target_layers[source]

The layer types the searcher will extract.

searching_results[source]

The list/dict which stores the matched patterns.
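
The static-graph machinery described above can be reproduced in isolation. A minimal sketch of tracing a toy model and flattening its graph into per-statement strings (variable names are illustrative, not the class internals):

    import torch

    class TwoLinear(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc1 = torch.nn.Linear(16, 32)
            self.fc2 = torch.nn.Linear(32, 8)

        def forward(self, x):
            return self.fc2(torch.relu(self.fc1(x)))

    model = TwoLinear().eval()
    dummy_input = torch.randn(1, 16)                    # generated dummy input
    static_graph = torch.jit.trace(model, dummy_input)  # static version of the model
    # One string per graph statement, roughly what flatten_static_graph holds:
    flattened = [line.strip() for line in str(static_graph.inlined_graph).split("\n")]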

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.Linear2LinearSearcher(model, dataloader=None, placeholder_shape=None, placeholder_dtype=None)[source]

Static graph searcher for consecutive linear layers.

Uses the static graph to detect special patterns in a module, so there is no need for the user to specify layer names. Consecutive linear layers which can be optimized are searched automatically.

Parameters:

model (torch.nn.Module) – The PyTorch model for searching.

model[source]

The PyTorch model for searching.

device[source]

The model’s current device type.

static_graph[source]

The static graph of the original model.

flatten_static_graph[source]

A list of strings with the model’s static graph inference details.

target_layers[source]

The layer types the searcher will extract.

searching_results[source]

The list/dict which stores the matched patterns.

target_op_lut[source]

A lookup table mapping target operators to their corresponding jit code.

current_pattern[source]

A searching path which stores the current searching status.
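
A usage sketch, reusing the TwoLinear toy model from the JitBasicSearcher example; both search() and the meaning of placeholder_shape are assumptions here:

    from neural_compressor.compression.pruner.model_slim.pattern_analyzer import Linear2LinearSearcher

    searcher = Linear2LinearSearcher(TwoLinear().eval(), placeholder_shape=(1, 16))
    linear_patterns = searcher.search()  # assumed: groups of consecutive linear layers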

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.SelfMHASearcher(model, dataloader=None, placeholder_shape=None, placeholder_dtype=None)[source]

Static graph searcher for multi-head attention modules.

Uses the static graph to detect special patterns in a module, so there is no need for the user to specify layer names. Multi-head attention modules which can be optimized are searched automatically.

Parameters:

model (torch.nn.Module) – The PyTorch model for searching.

model[source]

The PyTorch model for searching.

device[source]

The model’s current device type.

static_graph[source]

The static graph of the original model.

flatten_static_graph[source]

A list of strings with the model’s static graph inference details.
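
And likewise for attention patterns, again treating search() as an assumed entry point:

    from transformers import BertModel
    from neural_compressor.compression.pruner.model_slim.pattern_analyzer import SelfMHASearcher

    model = BertModel.from_pretrained("bert-base-uncased")
    searcher = SelfMHASearcher(model)
    mha_patterns = searcher.search()  # assumed: self-attention modules that can be slimmed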

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.ClassifierHeadSearcher(model)[source]

Static graph searcher for the final classifier head.

Uses the static graph to detect the final classifier head of a model, so there is no need for the user to specify the layer name; the classifier head is located automatically.

Parameters:

model (torch.nn.Module) – The PyTorch model for searching.

model[source]

The PyTorch model for searching.

device[source]

The model’s current device type.

static_graph[source]

The static graph of the original model.

flatten_static_graph[source]

A list of strings with the model’s static graph inference details.

class neural_compressor.compression.pruner.model_slim.pattern_analyzer.ClassifierHeadSearcherTF(model)[source]

Static graph searcher for the final classifier head of a Keras model.

Uses the static graph to detect the final classifier head of a Keras model, so there is no need for the user to specify the layer name; the classifier head is located automatically.

Parameters:

model (tf.keras.Model) – The Keras model for searching.

model[source]

The Keras model for searching.

device[source]

The model’s current device type.

static_graph[source]

The static graph of the original model.

flatten_static_graph[source]

A list of strings with the model’s static graph inference details.