neural_compressor.adaptor.mxnet
Module Contents¶
Classes¶
MxNetAdaptor – The MXNet adaptor layer: performs MXNet quantization, calibration, and layer-tensor inspection.
MXNetQuery – Base class that defines the Query Interface.
- class neural_compressor.adaptor.mxnet.MxNetAdaptor(framework_specific_info)¶
Bases:
neural_compressor.adaptor.adaptor.Adaptor
The MXNet adaptor layer: performs MXNet quantization, calibration, and layer-tensor inspection.
- Parameters:
framework_specific_info (dict) – framework specific configuration for quantization.
- quantize(tune_cfg, nc_model, dataloader, q_func=None)¶
- Performs MXNet calibration and quantization in post-training
quantization mode.
- Parameters:
tune_cfg (dict) – quantization config.
nc_model (object) – neural_compressor fp32 model to be quantized.
dataloader (object) – calibration dataset.
q_func (optional) – training function for quantization-aware training mode; not yet implemented for MXNet.
- Returns:
quantized model
- Return type:
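The calibrate-then-quantize flow behind post-training quantization can be sketched in plain NumPy. This is a simplified illustration, not the adaptor's internal implementation; min/max calibration followed by an affine int8 mapping is just one common scheme:

```python
import numpy as np

def calibrate_minmax(batches):
    """Collect min/max statistics over calibration batches (naive sketch)."""
    lo, hi = float("inf"), float("-inf")
    for batch in batches:
        lo = min(lo, float(batch.min()))
        hi = max(hi, float(batch.max()))
    return lo, hi

def quantize_int8(x, lo, hi):
    """Affine-quantize a float array to int8 using the calibrated range."""
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = round(-lo / scale) - 128
    q = np.clip(np.round(x / scale) + zero_point, -128, 127)
    return q.astype(np.int8), scale, zero_point

# Toy calibration data standing in for the dataloader batches.
data = [np.linspace(-1.0, 1.0, 8), np.linspace(-0.5, 0.5, 8)]
lo, hi = calibrate_minmax(data)
q, scale, zp = quantize_int8(data[0], lo, hi)
```

The real adaptor additionally decides per-op whether and how to quantize based on tune_cfg; this sketch only shows the numeric core.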
- evaluate(nc_model, data_x, postprocess=None, metrics=None, measurer=None, iteration=-1, tensorboard=False, fp32_baseline=False)¶
Runs evaluation on the validation dataset.
- Parameters:
nc_model (object) – model to evaluate.
data_x (object) – data iterator/loader.
postprocess (object, optional) – processes the result from the model.
metrics (list) – list of evaluation metrics.
measurer (object, optional) – for precise benchmark measurement.
iteration (int, optional) – controls the number of mini-batch steps.
tensorboard (boolean, optional) – dump tensors for TensorBoard inspection.
fp32_baseline (boolean, optional) – only for the compare_label=False pipeline.
- Returns:
evaluate result.
- Return type:
acc
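The iteration-limited evaluation loop described above can be sketched as a generic accuracy loop. This is not the adaptor's actual code; model_fn and the batch format are illustrative stand-ins:

```python
import numpy as np

def evaluate_accuracy(model_fn, data_iter, iteration=-1):
    """Run model_fn over (input, label) pairs and return top-1 accuracy.

    iteration=-1 consumes the whole iterator, mirroring the API above.
    """
    correct, total = 0, 0
    for step, (x, y) in enumerate(data_iter):
        preds = model_fn(x).argmax(axis=-1)
        correct += int((preds == y).sum())
        total += len(y)
        if iteration != -1 and step + 1 >= iteration:
            break
    return correct / total

# A stand-in "model" that predicts the index of the largest input value.
model_fn = lambda x: x
batches = [(np.eye(4), np.arange(4))]
acc = evaluate_accuracy(model_fn, batches)
```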
- query_fw_capability(nc_model)¶
Query MXNet quantization capability at the model and op level for the given model.
- Parameters:
nc_model (object) – model to query.
- Returns:
modelwise and opwise config.
- Return type:
dict
- inspect_tensor(nc_model, data_x, op_list=[], iteration_list=[], inspect_type='activation', save_to_disk=False, save_path=None, quantization_cfg=None)¶
Used by the tune strategy class to dump tensor info.
- Parameters:
nc_model (object) – the model to calibrate.
data_x (object) – data iterator/loader.
op_list (list) – ops whose tensors to inspect.
iteration_list (list) – iterations at which to inspect.
- Returns:
tensor dicts for the inspected ops.
- Return type:
dict
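The kind of per-op tensor dump this method produces can be illustrated with a toy forward pass. The layer names and dict layout here are hypothetical; the real keys and schema come from the MXNet graph and the adaptor's internals:

```python
import numpy as np

def dump_activations(layers, batches, op_list):
    """Record each listed layer's output per iteration (illustrative only)."""
    dump = {name: [] for name in op_list}
    for x in batches:
        h = x
        for name, fn in layers:
            h = fn(h)
            if name in op_list:
                dump[name].append(h.copy())
    return dump

# Hypothetical op names standing in for real MXNet graph symbols.
layers = [("dense0", lambda a: a * 2.0), ("relu0", lambda a: np.maximum(a, 0))]
dump = dump_activations(layers, [np.array([-1.0, 2.0])], ["relu0"])
```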
- recover_tuned_model(nc_model, q_config)¶
Execute the recover process on the specified model.
- Parameters:
nc_model (object) – fp32 model.
q_config (dict) – recover configuration.
- Returns:
the quantized model
- Return type:
- abstract set_tensor(model, tensor_dict)¶
Used by the tune strategy class to set tensors back into the model.
- Parameters:
model (object) – The model to set tensors into; usually a quantized model.
tensor_dict (dict) – The tensor dict to set. Note that the numpy arrays contain float values; the adaptor layer is responsible for quantizing them to int8 or int32 before setting them into the quantized model if needed. The dict format is: {'weight0_name': numpy.array, 'bias0_name': numpy.array, ...}
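A tensor_dict of the shape described above, together with a naive symmetric float32-to-int8 conversion of the kind the adaptor would have to perform, might look like this. The quantization scheme is a sketch, not the adaptor's actual one, and the keys are placeholders rather than real graph symbols:

```python
import numpy as np

# tensor_dict as described above: float numpy arrays keyed by parameter name.
# "weight0_name" / "bias0_name" are placeholder keys from the docstring.
tensor_dict = {
    "weight0_name": np.array([0.5, -0.2, 1.0], dtype=np.float32),
    "bias0_name": np.array([0.1], dtype=np.float32),
}

def to_int8(arr):
    """Symmetric per-tensor quantization sketch: float32 -> int8."""
    max_abs = float(np.abs(arr).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(arr / scale), -128, 127).astype(np.int8)
    return q, scale

q_weight, w_scale = to_int8(tensor_dict["weight0_name"])
```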
- save(model, path)¶
Used by the tune strategy class to save the model.
- Parameters:
model (object) – The model to be saved.
path (string) – The path to save to.
- class neural_compressor.adaptor.mxnet.MXNetQuery(local_config_file)¶
Bases:
neural_compressor.adaptor.query.QueryBackendCapability
Base class that defines the Query Interface. Each adaptor layer should implement its own subclass of it for its specific backend.
- get_version()¶
Get the current backend’s version string.
- get_precisions()¶
Get the supported low precisions, e.g. ['int8', 'bf16'].
- get_op_types()¶
Get the op types for a specific backend per low precision, e.g. {'1.6.0': {'int8': ['Conv2D', 'fully_connected']}}.
- get_fuse_patterns()¶
Get the fusion patterns for specified op type for every specific precision
- get_quantization_capability()¶
Get the quantization capability of low-precision op types, e.g. granularity, scheme, etc.
- get_mixed_precision_combination()¶
Get the valid precision combinations based on hardware and the user's config, e.g. ['fp32', 'bf16', 'int8'].
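A minimal stand-in for such a query class can illustrate how these accessors fit together. The capability data below is invented for illustration and does not reflect the real MXNet capability tables shipped with the library:

```python
# Toy backend-capability query class mirroring the accessor methods above.
class ToyQuery:
    # Illustrative capability table keyed by backend version.
    _cfg = {
        "1.6.0": {
            "precisions": ["int8", "bf16"],
            "int8": {"op_types": ["Conv2D", "fully_connected"]},
        }
    }

    def __init__(self, version="1.6.0"):
        self.version = version

    def get_version(self):
        """Return the current backend's version string."""
        return self.version

    def get_precisions(self):
        """Return the supported low precisions for this version."""
        return self._cfg[self.version]["precisions"]

    def get_op_types(self):
        """Return op types per low precision, keyed by version."""
        int8_ops = self._cfg[self.version]["int8"]["op_types"]
        return {self.version: {"int8": int8_ops}}

q = ToyQuery()
```

In the real library, this data would be loaded from the local_config_file passed to MXNetQuery rather than hard-coded.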