neural_compressor.torch.algorithms.static_quant.utility
Module Contents
Classes
- The statistics printer.
- Detect the attention block and FFN block in transformer-based model.
Functions
- check_cfg_and_qconfig: Check configs and quantization configs.
- generate_activation_observer: A helper to generate a dict containing activation observer info.
- get_quantizable_ops_recursively: Get all quantizable ops from the model.
- simple_inference: Run warm-up inference for IPEX.
- dump_model_op_stats: Dump the model's quantizable ops to the user.
- get_depth: Query the depth of a dict.
- get_dict_at_depth: Get all sub-dicts at a specified depth in a nested dict.
- get_element_under_depth: Get all values in a nested dict.
- paser_cfgs: Parse configs.
- get_quantizable_ops_from_cfgs: Get quantizable ops from configs, combining fused ops into one op.
- neural_compressor.torch.algorithms.static_quant.utility.check_cfg_and_qconfig(user_cfg, cfgs, op_infos_from_cfgs, output_tensor_ids_op_name)[source]
Check configs and quantization configs.
- Parameters:
user_cfg (dict) – quantization configuration for ops.
cfgs (dict) – configs loaded from ipex config path.
op_infos_from_cfgs (dict) – dict containing configs that have been parsed for each op.
output_tensor_ids_op_name (dict) – dict containing op names corresponding to ‘op_infos_from_cfgs’.
- Returns:
updated configs.
- Return type:
cfgs (dict)
- neural_compressor.torch.algorithms.static_quant.utility.generate_activation_observer(scheme, algorithm, smooth_quant=False, smooth_quant_enable=False)[source]
This is a helper method to generate a dict containing activation observer info.
- Parameters:
scheme (str) – Quantization scheme to be used.
algorithm (str) – Algorithm used to compute the quantization parameters.
- Returns:
A dict containing observer info.
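To make the shape of such an observer-info dict concrete, here is a minimal, hypothetical sketch; the function name `make_observer_info` and the exact keys are assumptions for illustration, not the actual keys used by neural_compressor or IPEX.

```python
# Hypothetical sketch of an activation-observer info dict; the real keys
# produced by generate_activation_observer may differ.
def make_observer_info(scheme, algorithm, smooth_quant=False, smooth_quant_enable=False):
    """Build a dict describing how activations should be observed."""
    return {
        # Observer class name chosen from the algorithm (assumed mapping).
        "name": "MinMaxObserver" if algorithm == "minmax" else "HistogramObserver",
        "qscheme": scheme,            # e.g. "sym" or "asym"
        "algorithm": algorithm,       # e.g. "minmax" or "kl"
        "smooth_quant": smooth_quant,
        "smooth_quant_enable": smooth_quant_enable,
    }

obs = make_observer_info("sym", "minmax")
```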
- neural_compressor.torch.algorithms.static_quant.utility.get_quantizable_ops_recursively(model, example_inputs)[source]
Get all quantizable ops from model.
- Parameters:
model (object) – the input model.
example_inputs (dict|list|tuple|torch.Tensor) – used to trace torch model.
- Returns:
list of (op_name, op_type) tuples.
cfgs (dict): dict of configurations.
- Return type:
quantizable_ops (list)
- neural_compressor.torch.algorithms.static_quant.utility.simple_inference(q_model, example_inputs, iterations=1)[source]
Run warm-up inference for IPEX.
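A warm-up loop of this kind simply runs the model a few times so the JIT/IPEX backend can record shapes and finish graph optimization. The sketch below is a plausible reimplementation under that assumption, not the library's actual code; `warm_up` is a hypothetical name.

```python
# Hypothetical warm-up loop in the spirit of simple_inference: call the model
# `iterations` times, unpacking the example inputs by container type.
def warm_up(model, example_inputs, iterations=1):
    for _ in range(iterations):
        if isinstance(example_inputs, (list, tuple)):
            model(*example_inputs)       # positional inputs
        elif isinstance(example_inputs, dict):
            model(**example_inputs)      # keyword inputs
        else:
            model(example_inputs)        # single tensor-like input
```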
- neural_compressor.torch.algorithms.static_quant.utility.dump_model_op_stats(user_cfg)[source]
Dump the model's quantizable ops to the user.
- Parameters:
user_cfg (dict) – quantization config
- Returns:
None
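The kind of summary such a dump produces can be sketched as a count of quantized ops per (op type, dtype) pair. This is an illustrative reimplementation only: the assumed `user_cfg` layout, mapping `(op_name, op_type)` keys to per-op config dicts, and the helper name `summarize_op_stats` are assumptions, not the library's actual structure.

```python
from collections import Counter

# Hypothetical sketch: tally quantized ops per op type and dtype, assuming
# user_cfg maps (op_name, op_type) -> {"weight": {"dtype": ...}, ...}.
def summarize_op_stats(user_cfg):
    stats = Counter()
    for (op_name, op_type), cfg in user_cfg.items():
        dtype = cfg.get("weight", {}).get("dtype", "fp32")
        stats[(op_type, dtype)] += 1
    for (op_type, dtype), count in sorted(stats.items()):
        print(f"{op_type:<12} {dtype:<6} {count}")
    return stats
```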
- neural_compressor.torch.algorithms.static_quant.utility.get_depth(d) int [source]
Query the depth of the dict.
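The idea behind a dict-depth query can be reimplemented in a few lines; `depth_of` below is an illustrative sketch, not the library's implementation: the depth of a nested dict is one more than the maximum depth of its dict values, with non-dicts counting as depth 0.

```python
# Minimal sketch of get_depth: recursive depth of a nested dict.
def depth_of(d):
    if not isinstance(d, dict) or not d:
        return 0                          # leaves and empty dicts add no depth
    return 1 + max(depth_of(v) for v in d.values())
```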
- neural_compressor.torch.algorithms.static_quant.utility.get_dict_at_depth(d, target_depth, result, depth=0)[source]
Get all sub-dicts that are at a specified depth in a nested dict.
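Matching the documented signature `(d, target_depth, result, depth=0)`, a sketch of this traversal appends into a caller-supplied list; `dicts_at_depth` is a hypothetical reimplementation, assuming the root sits at depth 0.

```python
# Sketch of get_dict_at_depth: collect every sub-dict sitting exactly at
# target_depth in a nested dict (root is depth 0).
def dicts_at_depth(d, target_depth, result, depth=0):
    if not isinstance(d, dict):
        return
    if depth == target_depth:
        result.append(d)
        return
    for v in d.values():
        dicts_at_depth(v, target_depth, result, depth + 1)
```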
- neural_compressor.torch.algorithms.static_quant.utility.get_element_under_depth(d, ops_lst)[source]
Get all values in a nested dict.
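Collecting all leaf values of a nested dict is a short recursion; `leaves` below is an illustrative sketch following the documented `(d, ops_lst)` signature, appending non-dict values into the caller's list.

```python
# Sketch of get_element_under_depth: flatten a nested dict into the list
# of its leaf (non-dict) values, in insertion order.
def leaves(d, ops_lst):
    if isinstance(d, dict):
        for v in d.values():
            leaves(v, ops_lst)
    else:
        ops_lst.append(d)
```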
- neural_compressor.torch.algorithms.static_quant.utility.paser_cfgs(cfgs)[source]
Parse configs.
- Parameters:
cfgs (dict) – the input configs.
- Returns:
list of op names.
tune_cfg (dict): dictionary of quantization configuration.
op_infos_from_cfgs (dict): op infos from configs.
output_tensor_ids_op_name (dict): dictionary of output tensor op names.
- Return type:
ops_name (list)
- neural_compressor.torch.algorithms.static_quant.utility.get_quantizable_ops_from_cfgs(ops_name, op_infos_from_cfgs, input_tensor_ids_op_name)[source]
Get quantizable ops from configs, combining fused ops into one op.
- Parameters:
ops_name (list) – list of op names.
op_infos_from_cfgs (dict) – op infos from configs.
input_tensor_ids_op_name (dict) – dictionary of input tensor op names.
- Returns:
cfgs (dict).