neural_compressor.jax.quantization.saving
Serialization helpers for JAX quantized Keras models.
Classes
- VersionManager – Handle version metadata for serialized quantized models.
- SaveableLayerMixin – Mixin for saving and loading quantized layer variables.
- KerasQuantizedModelBackboneWrapper – Wrapper that preserves quantization config when saving Keras backbones.
- KerasQuantizedModelWrapperMixin – Wrapper that preserves quantization config for Keras tasks.
- KerasQuantizedModelWrapper – Generic quantized model wrapper for Keras models without specific backbone or task structure.
- KerasQuantizedGemmaWrapper – Quantized wrapper for Gemma3CausalLM models.
- KerasQuantizedViTWrapper – Quantized wrapper for ViTImageClassifier models.
- KerasQuantizedTokenizerWrapper – Quantized wrapper for Gemma3Tokenizer models.
Functions
- quant_config_to_json_object – Serialize a quant config to a JSON-compatible dict with class name.
- quant_config_from_json_object – Deserialize a quant config from a JSON-compatible dict with class name.
- prepare_deserialized_quantized_model – Transform a loaded quantized model.
Module Contents
- neural_compressor.jax.quantization.saving.quant_config_to_json_object(quant_config: neural_compressor.jax.quantization.config.BaseConfig) dict[source]
Serialize a quant config to a JSON-compatible dict with class name.
- Parameters:
quant_config (BaseConfig) – The quantization config object to serialize.
- Returns:
A dict with ‘quantization_type’ and ‘config’ keys.
- Return type:
dict
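The shape of the returned dict can be illustrated with a minimal, self-contained sketch. `FakeConfig` and `to_json_object` below are hypothetical stand-ins, not the library's implementation; only the documented output keys (`'quantization_type'` and `'config'`) come from this page:

```python
class FakeConfig:
    """Illustrative stand-in for a BaseConfig subclass (not the real API)."""
    def __init__(self, weight_bits=8, activation_bits=8):
        self.weight_bits = weight_bits
        self.activation_bits = activation_bits

def to_json_object(cfg):
    # Mirror the documented shape: the config's class name under
    # 'quantization_type', its fields under 'config'.
    return {"quantization_type": type(cfg).__name__, "config": vars(cfg)}

obj = to_json_object(FakeConfig(weight_bits=4))
# obj == {"quantization_type": "FakeConfig",
#         "config": {"weight_bits": 4, "activation_bits": 8}}
```

Recording the class name alongside the plain field dict is what lets the inverse function pick the right config class at load time.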
- neural_compressor.jax.quantization.saving.quant_config_from_json_object(json_obj: dict) neural_compressor.jax.quantization.config.BaseConfig[source]
Deserialize a quant config from a JSON-compatible dict with class name.
- Parameters:
json_obj (dict) – A dict with ‘quantization_type’ and ‘config’ keys.
- Returns:
The instantiated quantization config object.
- Return type:
BaseConfig
- Raises:
ValueError – If the class name is unknown.
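The inverse mapping, including the documented ValueError for an unknown class name, can be sketched the same way. The registry mechanism and all names below are assumptions for illustration; only the input keys and the error condition are taken from this page:

```python
class FakeConfig:
    """Illustrative stand-in for a BaseConfig subclass (not the real API)."""
    def __init__(self, weight_bits=8):
        self.weight_bits = weight_bits

# Assumed lookup table from class name to config class.
CONFIG_REGISTRY = {"FakeConfig": FakeConfig}

def from_json_object(json_obj):
    # Resolve the class named under 'quantization_type', then rebuild it
    # from the 'config' fields; unknown names raise ValueError as documented.
    cls = CONFIG_REGISTRY.get(json_obj["quantization_type"])
    if cls is None:
        raise ValueError(f"Unknown quantization type: {json_obj['quantization_type']!r}")
    return cls(**json_obj["config"])

cfg = from_json_object({"quantization_type": "FakeConfig",
                        "config": {"weight_bits": 4}})
# cfg.weight_bits == 4
```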
- class neural_compressor.jax.quantization.saving.VersionManager[source]
Handle version metadata for serialized quantized models.
- class neural_compressor.jax.quantization.saving.SaveableLayerMixin[source]
Mixin for saving and loading quantized layer variables.
- class neural_compressor.jax.quantization.saving.KerasQuantizedModelBackboneWrapper(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Wrapper that preserves quantization config when saving Keras backbones.
- class neural_compressor.jax.quantization.saving.KerasQuantizedModelWrapperMixin(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Wrapper that preserves quantization config for Keras tasks.
- class neural_compressor.jax.quantization.saving.KerasQuantizedModelWrapper(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Generic quantized model wrapper for Keras models without specific backbone or task structure.
- class neural_compressor.jax.quantization.saving.KerasQuantizedGemmaWrapper(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Quantized wrapper for Gemma3CausalLM models.
- class neural_compressor.jax.quantization.saving.KerasQuantizedViTWrapper(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Quantized wrapper for ViTImageClassifier models.
- class neural_compressor.jax.quantization.saving.KerasQuantizedTokenizerWrapper(model, quant_config: neural_compressor.jax.quantization.config.BaseConfig | None = None)[source]
Quantized wrapper for Gemma3Tokenizer models.
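All of these wrappers share one idea: keep the quantization config next to the model so it survives a save/load round trip. A minimal, framework-free sketch of that pattern, assuming nothing beyond this page (the class and method names below are illustrative, not the library's API):

```python
class MinimalQuantizedWrapper:
    """Illustrative stand-in for the wrapper pattern: hold a model
    together with its optional quant config so both serialize together."""

    def __init__(self, model, quant_config=None):
        self.model = model
        self.quant_config = quant_config

    def get_config(self):
        # Emit the quant config alongside the model's own saved state so a
        # later load can restore the quantized layers. The nested dict shape
        # follows the serialization format documented on this page.
        if self.quant_config is None:
            return {"quant_config": None}
        return {"quant_config": {
            "quantization_type": type(self.quant_config).__name__,
            "config": vars(self.quant_config),
        }}
```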
- neural_compressor.jax.quantization.saving.prepare_deserialized_quantized_model(model: keras.Model, quant_config: neural_compressor.jax.quantization.config.BaseConfig) KerasQuantizedModelWrapperMixin | KerasQuantizedModelBackboneWrapper[source]
Transform a loaded quantized model.
It prepares the model for inference by preparing the quantized layers.
- Parameters:
model (keras.Model) – Loaded base keras model.
quant_config (BaseConfig) – Quantization configuration.
- Returns:
The transformed quantized model/backbone wrapper.
- Return type:
Union[KerasQuantizedModelWrapperMixin, KerasQuantizedModelBackboneWrapper]
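The post-load flow this function describes can be sketched generically: walk the model's layers, let each quantized layer re-prepare its inference-time variables, then return the model wrapped with its quant config. Every name below is a hypothetical stand-in; only the prepare-then-wrap sequence is taken from the description above:

```python
class StubQuantLayer:
    """Stand-in for a quantized layer (e.g. one using a saveable mixin)."""
    def __init__(self):
        self.prepared = False

    def prepare(self):
        # Recreate inference-time quantized variables after deserialization.
        self.prepared = True

class StubWrapper:
    """Stand-in for the returned model/backbone wrapper."""
    def __init__(self, model, quant_config):
        self.model = model
        self.quant_config = quant_config

def prepare_deserialized(layers, quant_config):
    # Prepare every layer that supports it, then attach the config.
    for layer in layers:
        if hasattr(layer, "prepare"):
            layer.prepare()
    return StubWrapper(layers, quant_config)
```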