neural_compressor.jax.quantization.quantize

Intel Neural Compressor JAX quantization base API.

Functions

need_apply(configs_mapping, algo_name)

Determine whether a quantization algorithm should be applied.

quantize_model(model, quant_config[, calib_function, ...])

Return a quantized Keras model according to the given configuration.

Module Contents

neural_compressor.jax.quantization.quantize.need_apply(configs_mapping: Dict[Tuple[str, callable], neural_compressor.common.base_config.BaseConfig], algo_name)[source]

Determine whether a quantization algorithm should be applied.

Parameters:
  • configs_mapping (Dict[Tuple[str, callable], BaseConfig]) – Mapping of layer identifiers to configs.

  • algo_name (str) – Algorithm name to check.

Returns:

True if any config in the mapping was produced by the given algorithm; otherwise False.

Return type:

bool
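To illustrate the documented behavior, here is a minimal self-contained sketch of the check (not the library's source; the stand-in BaseConfig with a name attribute is an assumption, since the real BaseConfig API may differ):

```python
from typing import Callable, Dict, Tuple


class BaseConfig:
    """Stand-in for neural_compressor.common.base_config.BaseConfig (assumption)."""

    def __init__(self, name: str):
        self.name = name  # algorithm that produced this config


def need_apply(
    configs_mapping: Dict[Tuple[str, Callable], BaseConfig], algo_name: str
) -> bool:
    # True as soon as any config in the mapping matches the algorithm name.
    return any(config.name == algo_name for config in configs_mapping.values())


mapping = {
    ("dense_1", None): BaseConfig("static_quant"),
    ("dense_2", None): BaseConfig("static_quant"),
}
print(need_apply(mapping, "static_quant"))  # True
print(need_apply(mapping, "smooth_quant"))  # False
```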

neural_compressor.jax.quantization.quantize.quantize_model(model: keras.Model, quant_config: neural_compressor.common.base_config.BaseConfig, calib_function: Callable = None, inplace: bool = True)[source]

Return a quantized Keras model according to the given configuration.

Parameters:
  • model (keras.Model) – FP32 Keras model to be quantized.

  • quant_config (BaseConfig) – Quantization configuration.

  • calib_function (Callable, optional) – Calibration function that runs representative data through the model; required for static quantization.

  • inplace (bool) – When True, the original model is modified in-place and should not be used afterward. A value of False is not yet supported.

Returns:

The quantized model.

Return type:

keras.Model
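A typical calibration function simply forwards a few representative batches through the model so activation statistics can be collected. The sketch below shows one common way to build such a function; make_calib_function and its num_batches parameter are illustrative names, not part of this API:

```python
import numpy as np


def make_calib_function(dataset, num_batches=10):
    """Build a calibration function that feeds a few batches through the model."""

    def calib_function(model):
        for i, batch in enumerate(dataset):
            if i >= num_batches:
                break
            model(batch)  # forward pass; quantizer observes activations

    return calib_function


# Toy dataset of random float32 batches (shape: batch of 4, feature dim 8).
dataset = [np.random.rand(4, 8).astype("float32") for _ in range(20)]
calib = make_calib_function(dataset, num_batches=5)

# A quantize_model call would then look roughly like:
#   q_model = quantize_model(model, quant_config, calib_function=calib)
# Here we pass a stand-in "model" that records calls, to show calib only
# consumes num_batches batches.
seen = []
calib(lambda x: seen.append(x.shape))
print(len(seen))  # 5
```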