neural_compressor.onnxrt.utils.utility
Module Contents
Functions
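find_by_name(name, item_list) | Helper function to find item by name in a list.
simple_progress_bar(total, i) | Progress bar for cases where tqdm can't be used.
register_algo(name) | Decorator function to register algorithms in the algos_mapping dictionary.
is_B_transposed(node) | Whether input B is transposed.
get_qrange_for_qType(qType, reduce_range=False) | Helper function to get the quantization range for a type.
quantize_data(data, quantize_range, qType, scheme) | Quantize data.
Check if the model has been shape inferred.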
- neural_compressor.onnxrt.utils.utility.find_by_name(name, item_list)
Helper function to find item by name in a list.
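The lookup described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the name `find_by_name_sketch`, the choice to return the first match, and the `None` result for a miss are assumptions, and `SimpleNamespace` objects stand in for ONNX nodes or initializers.

```python
from types import SimpleNamespace


def find_by_name_sketch(name, item_list):
    """Return the first item whose `.name` equals `name`, or None (assumed behavior)."""
    for item in item_list:
        if getattr(item, "name", None) == name:
            return item
    return None


# Stand-in objects; real callers would pass items that expose a `.name` field.
nodes = [SimpleNamespace(name="conv1"), SimpleNamespace(name="matmul1")]
found = find_by_name_sketch("matmul1", nodes)
missing = find_by_name_sketch("relu1", nodes)
```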
- neural_compressor.onnxrt.utils.utility.simple_progress_bar(total, i)
Progress bar for cases where tqdm can’t be used.
- neural_compressor.onnxrt.utils.utility.register_algo(name)
Decorator function to register algorithms in the algos_mapping dictionary.
- Usage example:
@register_algo(name=example_algo)
def example_algo(model: Union[onnx.ModelProto, Path, str],
                 quant_config: RTNConfig) -> onnx.ModelProto:
    …
- Parameters:
name (str) – The name under which the algorithm function will be registered.
- Returns:
The decorator function to be used with algorithm functions.
- Return type:
decorator
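To show how a registering decorator of this kind typically works, here is a minimal, self-contained sketch. It is not the library's code: the local `algos_mapping` dictionary is a stand-in for the one named in the description, the string name `"example_algo"` is an assumption, and the algorithm body is a placeholder that returns its input unchanged.

```python
from typing import Callable, Dict

# Local stand-in for the registry named in the description above.
algos_mapping: Dict[str, Callable] = {}


def register_algo(name: str) -> Callable:
    """Return a decorator that stores the decorated function under `name`."""

    def decorator(algo_func: Callable) -> Callable:
        algos_mapping[name] = algo_func
        # Return the function unchanged so it can still be called directly.
        return algo_func

    return decorator


@register_algo(name="example_algo")
def example_algo(model, quant_config):
    # Placeholder body: a real algorithm would return a quantized model.
    return model
```

After the module is imported, the function can be looked up by name, for example `algos_mapping["example_algo"]`, which lets a dispatcher select an algorithm from a configuration value.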
- neural_compressor.onnxrt.utils.utility.is_B_transposed(node)
Whether input B is transposed.
- neural_compressor.onnxrt.utils.utility.get_qrange_for_qType(qType, reduce_range=False)
Helper function to get the quantization range for a type.
- Parameters:
qType (int) – data type to quantize to
reduce_range (bool, optional) – whether to use a reduced 7-bit range instead of the full 8-bit range. Defaults to False.
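As an illustration of what a per-type quantization range might look like, the sketch below returns the width of the quantized interval for 8-bit types. The specific values are assumptions of this sketch and are not confirmed by the description above: 255 and 127 for unsigned 8-bit, and 254 and 128 for signed 8-bit, with the signed range kept symmetric. The constants 2 and 3 are the ONNX `TensorProto` enum values for UINT8 and INT8; the function name is hypothetical.

```python
# ONNX TensorProto data type enum values.
UINT8 = 2
INT8 = 3


def qrange_sketch(q_type: int, reduce_range: bool = False) -> int:
    """Return the assumed width of the quantized interval for an 8-bit type."""
    if q_type == UINT8:
        # Assumed: [0, 255] full range, [0, 127] when reduced to 7 bits.
        return 127 if reduce_range else 255
    if q_type == INT8:
        # Assumed: symmetric [-127, 127] full range, [-64, 64] when reduced.
        return 128 if reduce_range else 254
    raise ValueError(f"unsupported quantization type: {q_type}")


full_uint8 = qrange_sketch(UINT8)
reduced_int8 = qrange_sketch(INT8, reduce_range=True)
```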
- neural_compressor.onnxrt.utils.utility.quantize_data(data, quantize_range, qType, scheme)
Quantize data.
- To pack weights, we compute a linear transformation:
- when data type == uint8, from [rmin, rmax] -> [0, 2^b - 1]
- when data type == int8, from [-m, m] -> [-(2^{b-1}-1), 2^{b-1}-1], where
m = max(abs(rmin), abs(rmax))
and add the necessary intermediate nodes to transform the quantized weight back to the full weight using the equation r = S(q-z), where
r: real original value
q: quantized value
S: scale
z: zero point
- Parameters:
data (array) – data to quantize
quantize_range (list) – list of data to weight pack.
qType (int) – data type to quantize to. Supported types: UINT8 and INT8
scheme (string) – sym or asym quantization.
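The two mappings and the equation r = S(q-z) can be worked through in a small pure-Python sketch. It is an illustration under stated assumptions, not the library's `quantize_data`: the function names, the `bits` and `signed` parameters, extending the range to include zero, and clamping to the target interval are all choices made for this example.

```python
def quantize_sketch(data, bits=8, signed=False):
    """Quantize a list of floats; returns (scale, zero_point, quantized)."""
    # Assumed: the real range is widened to include 0 so that 0.0 is representable.
    rmin = min(min(data), 0.0)
    rmax = max(max(data), 0.0)
    if signed:
        # int8 case: [-m, m] -> [-(2^{b-1}-1), 2^{b-1}-1], zero point fixed at 0.
        m = max(abs(rmin), abs(rmax))
        qmax = 2 ** (bits - 1) - 1
        qmin = -qmax
        scale = m / qmax if m > 0 else 1.0
        zero_point = 0
    else:
        # uint8 case: [rmin, rmax] -> [0, 2^b - 1].
        qmin, qmax = 0, 2 ** bits - 1
        scale = (rmax - rmin) / qmax if rmax > rmin else 1.0
        zero_point = round(qmin - rmin / scale)
    quantized = [min(qmax, max(qmin, round(x / scale) + zero_point)) for x in data]
    return scale, zero_point, quantized


def dequantize_sketch(quantized, scale, zero_point):
    """Recover approximate real values with r = S(q - z)."""
    return [scale * (q - zero_point) for q in quantized]


u_scale, u_zp, u_q = quantize_sketch([-1.0, 0.0, 3.0], signed=False)
s_scale, s_zp, s_q = quantize_sketch([-1.0, 0.0, 1.0], signed=True)
u_restored = dequantize_sketch(u_q, u_scale, u_zp)
```

In the unsigned case the zero point shifts the range so that the real value 0.0 maps to an integer; in the signed case the range is symmetric, so the zero point stays at 0. The restored values differ from the inputs by at most one quantization step.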