neural_compressor.jax.quantization.quantize

Intel Neural Compressor JAX quantization base API.

Functions

need_apply(configs_mapping, algo_name)

Determine whether a quantization algorithm should be applied.

quantize_model(model, quant_config[, calib_function, ...])

Return a quantized Keras model according to the given configuration.

Module Contents

neural_compressor.jax.quantization.quantize.need_apply(configs_mapping: Dict[Tuple[str, callable], neural_compressor.common.base_config.BaseConfig], algo_name)[source]

Determine whether a quantization algorithm should be applied.

Parameters:
  • configs_mapping (Dict[Tuple[str, callable], BaseConfig]) – Mapping of layer identifiers to configs.

  • algo_name (str) – Algorithm name to check.

Returns:

True if any config in the mapping was produced by the given algorithm; otherwise False.

Return type:

bool
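To illustrate the documented behavior, here is a minimal self-contained sketch of the check (not the library's source; the stand-in BaseConfig with a name attribute is an assumption, since the real BaseConfig API may differ):

```python
from typing import Callable, Dict, Tuple


class BaseConfig:
    """Stand-in for neural_compressor.common.base_config.BaseConfig (assumption)."""

    def __init__(self, name: str):
        self.name = name  # algorithm that produced this config


def need_apply(
    configs_mapping: Dict[Tuple[str, Callable], BaseConfig], algo_name: str
) -> bool:
    # True as soon as any config in the mapping matches the algorithm name.
    return any(config.name == algo_name for config in configs_mapping.values())


mapping = {
    ("dense_1", None): BaseConfig("static_quant"),
    ("dense_2", None): BaseConfig("static_quant"),
}
print(need_apply(mapping, "static_quant"))  # True
print(need_apply(mapping, "smooth_quant"))  # False
```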

neural_compressor.jax.quantization.quantize.quantize_model(model: keras.Model, quant_config: neural_compressor.common.base_config.BaseConfig, calib_function: Callable = None, inplace: bool = True)[source]

Return a quantized Keras model according to the given configuration.

Parameters:
  • model (keras.Model) – FP32 Keras model to be quantized.

  • quant_config (BaseConfig) – Quantization configuration.

  • calib_function (Callable, optional) – Calibration function that runs representative data through the model; required for static quantization.

  • inplace (bool) – When True, the original model is modified in-place and should not be used afterward. A value of False is not yet supported.

Returns:

The quantized model.

Return type:

keras.Model
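A typical calibration function simply forwards a few representative batches through the model so activation statistics can be collected. The sketch below shows one common way to build such a function; make_calib_function and its num_batches parameter are illustrative names, not part of this API:

```python
import numpy as np


def make_calib_function(dataset, num_batches=10):
    """Build a calibration function that feeds a few batches through the model."""

    def calib_function(model):
        for i, batch in enumerate(dataset):
            if i >= num_batches:
                break
            model(batch)  # forward pass; quantizer observes activations

    return calib_function


# Toy dataset of random float32 batches (shape: batch of 4, feature dim 8).
dataset = [np.random.rand(4, 8).astype("float32") for _ in range(20)]
calib = make_calib_function(dataset, num_batches=5)

# A quantize_model call would then look roughly like:
#   q_model = quantize_model(model, quant_config, calib_function=calib)
# Here we pass a stand-in "model" that records calls, to show calib only
# consumes num_batches batches.
seen = []
calib(lambda x: seen.append(x.shape))
print(len(seen))  # 5
```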