neural_compressor.jax.quantization.layers_static
Static quantized layer implementations for JAX-backed Keras models.
Classes
MinMaxObserver | Observer that tracks running min/max values for calibration.
StaticQDQLayer | Layer that applies static quantize-dequantize to activations.
QStaticDenseMixin | Mixin that adds static quantization to dense-like layers.
QStaticDense | Statically quantized Dense layer.
QStaticEinsumDense | Statically quantized EinsumDense layer.
QStaticMultiHeadAttention | Statically quantized MultiHeadAttention layer.
QStaticCachedGemma3Attention | Statically quantized CachedGemma3Attention layer.
QStaticGemma3VisionAttention | Statically quantized Gemma3VisionAttention layer.
Statically quantized RotaryEmbedding layer.
Statically quantized ReversibleEmbedding layer.
Functions
register_static_quantized_layer(clso) | Register quantized layer class for an original layer class.
Module Contents
- neural_compressor.jax.quantization.layers_static.register_static_quantized_layer(clso)
Register quantized layer class for an original layer class.
- Parameters:
clso (type) – Original layer class to map to a quantized implementation.
- Returns:
Decorator that registers the quantized class.
- Return type:
Callable
- class neural_compressor.jax.quantization.layers_static.MinMaxObserver(*args, **kwargs)
Observer that tracks running min/max values for calibration.
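To illustrate what tracking a running min/max means during calibration, here is a plain-Python sketch. It assumes "running" means the extreme values over all batches seen so far (an exponential moving average is another common choice), and `RunningMinMax` is a hypothetical class that operates on lists of floats rather than on arrays as a real observer would.

```python
from typing import Iterable, Tuple


class RunningMinMax:
    """Hypothetical observer: keeps the smallest and largest value seen so far."""

    def __init__(self) -> None:
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def update(self, batch: Iterable[float]) -> None:
        # Fold one calibration batch into the running range.
        for x in batch:
            self.min_val = min(self.min_val, x)
            self.max_val = max(self.max_val, x)

    def range(self) -> Tuple[float, float]:
        return self.min_val, self.max_val


observer = RunningMinMax()
observer.update([0.5, -1.25, 3.0])  # first calibration batch
observer.update([2.0, -0.75])       # second batch does not widen the range
```

After calibration, the recorded range is what a static scheme would use to fix its quantization parameters ahead of inference.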
- class neural_compressor.jax.quantization.layers_static.StaticQDQLayer(name, activation_dtype, asymmetric=False)
Layer that applies static quantize-dequantize to activations.
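The sketch below shows the general quantize-dequantize (QDQ) idea that a flag like `asymmetric` typically selects between. It assumes an int8 target and a calibrated `(min, max)` range; `qparams` and `qdq` are hypothetical helpers on scalar floats, not functions from this module, and Python's `round` (round-half-to-even) stands in for whatever rounding the real layer uses.

```python
from typing import Tuple


def qparams(min_val: float, max_val: float, asymmetric: bool = False,
            qmin: int = -128, qmax: int = 127) -> Tuple[float, int]:
    """Derive (scale, zero_point) from a calibrated range (int8 assumed)."""
    if asymmetric:
        # Extend the range to include 0.0, then map it onto [qmin, qmax].
        min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
        scale = (max_val - min_val) / (qmax - qmin)
        if scale == 0.0:
            scale = 1.0  # degenerate range: avoid division by zero
        zero_point = round(qmin - min_val / scale)
    else:
        # Symmetric: the range is centred on zero, so zero_point is 0.
        scale = max(abs(min_val), abs(max_val)) / qmax
        if scale == 0.0:
            scale = 1.0
        zero_point = 0
    return scale, zero_point


def qdq(x: float, scale: float, zero_point: int,
        qmin: int = -128, qmax: int = 127) -> float:
    """Quantize to an integer, clamp to the int8 range, then dequantize."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))
    return (q - zero_point) * scale


sym_scale, sym_zp = qparams(-2.0, 2.0)
asym_scale, asym_zp = qparams(0.0, 2.55, asymmetric=True)
```

Values outside the calibrated range saturate at the clamp, which is why the quality of the calibration data matters for static schemes.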
- class neural_compressor.jax.quantization.layers_static.QStaticDenseMixin
Mixin that adds static quantization to dense-like layers.
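One common way for a mixin to add behaviour to several layer types is cooperative inheritance: the mixin overrides `call`, transforms the inputs, and delegates to the next class via `super()`. The toy example below sketches that pattern only; `QDQInputMixin`, `ToyDense`, and `QToyDense` are hypothetical, and nothing here is taken from how `QStaticDenseMixin` is actually implemented.

```python
class QDQInputMixin:
    """Hypothetical mixin: fake-quantizes inputs before the wrapped layer runs."""

    scale = 1.0  # assumed to be set from calibration statistics

    def call(self, inputs):
        # Symmetric int8 quantize-dequantize of each input value.
        qdq_inputs = [
            max(-128, min(127, round(x / self.scale))) * self.scale
            for x in inputs
        ]
        # Delegate to the next class in the MRO (the original layer).
        return super().call(qdq_inputs)


class ToyDense:
    """Stand-in for a dense-like layer with a single weight and bias."""

    def __init__(self, weight: float, bias: float) -> None:
        self.weight = weight
        self.bias = bias

    def call(self, inputs):
        return [self.weight * x + self.bias for x in inputs]


class QToyDense(QDQInputMixin, ToyDense):
    """The mixin is listed first so its call() wraps ToyDense.call()."""


layer = QToyDense(weight=2.0, bias=1.0)
layer.scale = 0.5
outputs = layer.call([0.3, 1.0, 100.0])
```

Because the quantization logic lives in the mixin, the same class can be combined with other dense-like layers without duplicating it.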
- class neural_compressor.jax.quantization.layers_static.QStaticDense
Statically quantized Dense layer.
- class neural_compressor.jax.quantization.layers_static.QStaticEinsumDense
Statically quantized EinsumDense layer.
- class neural_compressor.jax.quantization.layers_static.QStaticMultiHeadAttention
Statically quantized MultiHeadAttention layer.
- class neural_compressor.jax.quantization.layers_static.QStaticCachedGemma3Attention
Statically quantized CachedGemma3Attention layer.
- class neural_compressor.jax.quantization.layers_static.QStaticGemma3VisionAttention
Statically quantized Gemma3VisionAttention layer.