neural_compressor.onnxrt.utils.utility

Module Contents

Functions

find_by_name(name, item_list)

Helper function to find item by name in a list.

simple_progress_bar(total, i)

Progress bar for cases where tqdm can't be used.

register_algo(name)

Decorator function to register algorithms in the algos_mapping dictionary.

get_model_info(…) → List[Tuple[str, Callable]]

is_B_transposed(node)

Whether input B is transposed.

get_qrange_for_qType(qType[, reduce_range])

Helper function to get the quantization range for a type.

quantize_data(data, quantize_range, qType, scheme)

Quantize data.

check_model_with_infer_shapes(model)

Check if the model has been shape inferred.

Attributes

ONNXRT116_VERSION

ONNXRT1161_VERSION

algos_mapping

WHITE_MODULE_LIST

MAXIMUM_PROTOBUF

PRIORITY_RTN

PRIORITY_GPTQ

PRIORITY_AWQ

PRIORITY_SMOOTH_QUANT

dtype_mapping

neural_compressor.onnxrt.utils.utility.find_by_name(name, item_list)[source]

Helper function to find item by name in a list.
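
A minimal usage sketch (assumptions: item_list holds objects with a name attribute, such as ONNX graph initializers, and "fc1.weight" is a hypothetical name):

    import onnx

    model = onnx.load("model.onnx")
    # Returns the matching item from the list; the behavior on a miss
    # (e.g. returning None) is an assumption, not stated on this page.
    weight = find_by_name("fc1.weight", model.graph.initializer)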

neural_compressor.onnxrt.utils.utility.simple_progress_bar(total, i)[source]

Progress bar for cases where tqdm can’t be used.
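
A sketch of the presumed calling pattern, where total is the step count and i the current step (this pattern is an assumption, not stated on this page):

    total = 100
    for i in range(1, total + 1):
        # ... one unit of work per step ...
        simple_progress_bar(total, i)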

neural_compressor.onnxrt.utils.utility.register_algo(name)[source]

Decorator function to register algorithms in the algos_mapping dictionary.

Usage example:

    @register_algo(name="example_algo")
    def example_algo(
        model: Union[onnx.ModelProto, Path, str],
        quant_config: RTNConfig,
    ) -> onnx.ModelProto:
        ...

Parameters:

name (str) – The name under which the algorithm function will be registered.

Returns:

The decorator function to be used with algorithm functions.

Return type:

decorator
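
Once registered, the algorithm can presumably be retrieved from algos_mapping by the name it was registered under; a sketch, with model and quant_config standing in for the caller's objects:

    from neural_compressor.onnxrt.utils.utility import algos_mapping

    # Assumes example_algo was registered as in the usage example above.
    algo_fn = algos_mapping["example_algo"]
    quantized_model = algo_fn(model, quant_config)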

neural_compressor.onnxrt.utils.utility.is_B_transposed(node)[source]

Whether input B is transposed.
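
A sketch, assuming the check inspects a Gemm node's transB attribute (only "input B" is confirmed by this page):

    from onnx import helper

    gemm = helper.make_node("Gemm", ["A", "B", "C"], ["Y"], transB=1)
    # Expected to be True for transB=1 (an assumption based on the Gemm spec).
    print(is_B_transposed(gemm))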

neural_compressor.onnxrt.utils.utility.get_qrange_for_qType(qType, reduce_range=False)[source]

Helper function to get the quantization range for a type.

Parameters:
  • qType (int) – quantization data type.

  • reduce_range (bool, optional) – whether to use the reduced 7-bit range instead of the full 8-bit range. Defaults to False.
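
A sketch of the presumed behavior; the concrete return values (e.g. 255 for the full uint8 range) are assumptions based on standard 8-bit quantization, not statements from this page:

    from onnx import TensorProto

    full_range = get_qrange_for_qType(TensorProto.UINT8)
    reduced_range = get_qrange_for_qType(TensorProto.UINT8, reduce_range=True)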

neural_compressor.onnxrt.utils.utility.quantize_data(data, quantize_range, qType, scheme)[source]

Quantize data.

To pack weights, we compute a linear transformation:
  • when the data type is uint8, from [rmin, rmax] -> [0, 2^b - 1], and

  • when the data type is int8, from [-m, m] -> [-(2^{b-1} - 1), 2^{b-1} - 1], where m = max(|rmin|, |rmax|),

and add the necessary intermediate nodes to transform the quantized weight back to the full-precision weight using the equation r = S(q - z), where
  • r: real original value

  • q: quantized value

  • S: scale

  • z: zero point
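
As a worked example (illustrative numbers, not taken from the source): for uint8 asymmetric quantization of [rmin, rmax] = [-1.0, 3.0] with a quantization range of 255, the scale is S = (rmax - rmin) / 255 ≈ 0.0157 and the zero point is z = round(-rmin / S) = 64, so a quantized value q = 200 dequantizes to r = S(q - z) ≈ 2.13.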

Parameters:
  • data (array) – data to quantize.

  • quantize_range (int) – quantization range for the target type, e.g. the value returned by get_qrange_for_qType.

  • qType (int) – data type to quantize to. Supported types: UINT8 and INT8.

  • scheme (string) – "sym" or "asym" quantization.
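
A hedged usage sketch; only the argument order shown in the signature above is confirmed:

    import numpy as np
    from onnx import TensorProto

    weights = np.array([-1.0, 0.0, 3.0], dtype=np.float32)
    qrange = get_qrange_for_qType(TensorProto.UINT8)
    # The structure of the return value (e.g. scale, zero point, quantized
    # array) is an assumption, not stated on this page.
    result = quantize_data(weights, qrange, TensorProto.UINT8, "asym")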

neural_compressor.onnxrt.utils.utility.check_model_with_infer_shapes(model)[source]

Check if the model has been shape inferred.
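
A sketch of the intended workflow, assuming the check keys off the results of ONNX shape inference:

    import onnx
    from onnx import shape_inference

    model = onnx.load("model.onnx")
    inferred = shape_inference.infer_shapes(model)
    # The exact criterion (e.g. populated value_info) is an assumption.
    print(check_model_with_infer_shapes(model))     # plausibly False before inference
    print(check_model_with_infer_shapes(inferred))  # plausibly True afterwards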