neural_compressor.jax.quantization.quantize
Intel Neural Compressor JAX quantization base API.
Functions
- need_apply – Determine whether a quantization algorithm should be applied.
- quantize_model – Return a quantized Keras model according to the given configuration.
Module Contents
- neural_compressor.jax.quantization.quantize.need_apply(configs_mapping: Dict[Tuple[str, callable], neural_compressor.common.base_config.BaseConfig], algo_name)[source]
Determine whether a quantization algorithm should be applied.
- Parameters:
configs_mapping (Dict[Tuple[str, callable], BaseConfig]) – Mapping of layer identifiers to configs.
algo_name (str) – Algorithm name to check.
- Returns:
True if any config matches the algorithm name.
- Return type:
bool
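The check described above can be sketched in a few lines. This is a hedged reimplementation of the documented behavior, not the library's actual source: it assumes each `BaseConfig` value exposes a `name` attribute identifying its algorithm, and returns True as soon as any config in the mapping matches `algo_name`.

```python
def need_apply(configs_mapping, algo_name):
    # Scan every config in the mapping; the keys (layer identifiers) are
    # irrelevant here, only the algorithm name of each config matters.
    # The `name` attribute is an assumption about BaseConfig's interface.
    return any(config.name == algo_name for config in configs_mapping.values())
```

Because `any()` short-circuits, the scan stops at the first matching config rather than walking the whole mapping.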
- neural_compressor.jax.quantization.quantize.quantize_model(model: keras.Model, quant_config: neural_compressor.common.base_config.BaseConfig, calib_function: Callable = None, inplace: bool = True)[source]
Return a quantized Keras model according to the given configuration.
- Parameters:
model (keras.Model) – FP32 Keras model to be quantized.
quant_config (BaseConfig) – Quantization configuration.
calib_function (Callable, optional) – Function used for model calibration, required for static quantization.
inplace (bool) – When True, the original model is modified in-place and should not be used afterward. A value of False is not yet supported.
- Returns:
The quantized model.
- Return type:
keras.Model
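For static quantization, `calib_function` runs representative data through the model so activation statistics can be collected. A minimal sketch of such a callable, under the assumption that it simply receives the model and performs forward passes over a small representative dataset (the helper name and dataset shape are illustrative, not part of the library):

```python
def make_calib_function(dataset, num_batches=10):
    """Build a calibration callable that feeds a limited number of
    representative batches through the model; outputs are discarded,
    only the forward passes matter for range collection."""
    def calib_function(model):
        for i, batch in enumerate(dataset):
            if i >= num_batches:
                break
            model(batch)  # forward pass only
    return calib_function

# Hypothetical usage (quant_config construction is library-specific):
# q_model = quantize_model(model, quant_config,
#                          calib_function=make_calib_function(calib_data))
```

Capping the batch count keeps calibration cheap; a handful of representative batches is typically enough to estimate activation ranges.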