neural_compressor.adaptor.mxnet

Module Contents

Classes

MxNetAdaptor

The MXNet adaptor layer, which performs MXNet quantization, calibration, and layer-tensor inspection.

MXNetQuery

Base class that defines Query Interface.

class neural_compressor.adaptor.mxnet.MxNetAdaptor(framework_specific_info)

Bases: neural_compressor.adaptor.adaptor.Adaptor

The MXNet adaptor layer, which performs MXNet quantization, calibration, and layer-tensor inspection.

Parameters:

framework_specific_info (dict) – framework specific configuration for quantization.

quantize(tune_cfg, nc_model, dataloader, q_func=None)

The function is used to do MXNet calibration and quantization in post-training quantization mode.

Parameters:
  • tune_cfg (dict) – quantization config.

  • nc_model (object) – neural_compressor fp32 model to be quantized.

  • dataloader (object) – calibration dataset.

  • q_func (optional) – training function for quantization-aware training mode; not yet implemented for MXNet.

Returns:

quantized model

Return type:

(MXNetModel)
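The post-training flow — calibrate ranges over the dataloader, then quantize with those ranges — can be sketched in plain Python. This is a simplified min/max-calibration illustration, not the neural_compressor implementation; all function names here are hypothetical.

```python
def calibrate_minmax(dataloader):
    """Collect the global min/max over the calibration batches."""
    lo, hi = float("inf"), float("-inf")
    for batch in dataloader:
        lo = min(lo, min(batch))
        hi = max(hi, max(batch))
    return lo, hi

def quantize_int8(values, lo, hi):
    """Affine-quantize floats to int8 using the calibrated range."""
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale) - 128
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]

# Calibration "dataloader": two mini-batches of activation values.
batches = [[-1.0, 0.5, 2.0], [0.0, 1.5, 3.0]]
lo, hi = calibrate_minmax(batches)
q = quantize_int8([-1.0, 0.0, 3.0], lo, hi)  # -> [-128, -64, 127]
```

The calibrated min maps to -128 and the max to 127; the real adaptor applies the same idea per tensor using MXNet's calibration machinery.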

evaluate(nc_model, data_x, postprocess=None, metrics=None, measurer=None, iteration=-1, tensorboard=False, fp32_baseline=False)

The function is used to run evaluation on the validation dataset.

Parameters:
  • nc_model (object) – model to evaluate.

  • data_x (object) – data iterator/loader.

  • postprocess (object, optional) – process the results from the model.

  • metrics (list) – list of evaluation metrics.

  • measurer (object, optional) – for precise benchmark measurement.

  • iteration (int, optional) – controls the number of mini-batch steps to evaluate.

  • tensorboard (boolean, optional) – whether to dump tensors for TensorBoard inspection.

  • fp32_baseline (boolean, optional) – only used in the compare_label=False pipeline.

Returns:

evaluation result.

Return type:

acc
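The evaluation contract above — iterate the loader, optionally postprocess, feed each metric, stop after `iteration` mini-batches — can be sketched as follows. This is an illustrative reimplementation under assumed semantics (iteration=-1 consumes the whole loader); the `Accuracy` metric and all names are hypothetical stand-ins.

```python
def evaluate(model, data_x, postprocess=None, metrics=None, iteration=-1):
    """Run `model` over `data_x`, updating each metric; stop after
    `iteration` mini-batches (-1 = consume the whole loader)."""
    for step, (inputs, labels) in enumerate(data_x):
        outputs = model(inputs)
        if postprocess:
            outputs, labels = postprocess((outputs, labels))
        for metric in metrics or []:
            metric.update(outputs, labels)
        if 0 <= iteration <= step + 1:
            break
    return [m.result() for m in (metrics or [])]

class Accuracy:
    """Toy top-1 accuracy metric with the update/result interface."""
    def __init__(self):
        self.hit = self.total = 0
    def update(self, preds, labels):
        self.hit += sum(p == l for p, l in zip(preds, labels))
        self.total += len(labels)
    def result(self):
        return self.hit / self.total

# Identity "model" over two mini-batches; 3 of 4 predictions match.
data = [([1, 2], [1, 0]), ([3, 4], [3, 4])]
acc, = evaluate(lambda x: x, data, metrics=[Accuracy()])  # -> 0.75
```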

query_fw_capability(nc_model)

Query MXNet quantization capability on the model/op level with the specific model.

Parameters:

nc_model (object) – model to query.

Returns:

model-wise and op-wise config.

Return type:

dict
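The returned dict typically carries model-wide defaults per op type plus per-op overrides. The shape below is a hypothetical illustration (the concrete keys come from the backend configuration, not from this doc), showing how a tuning strategy might consume it:

```python
# Hypothetical shape of the capability dict from query_fw_capability().
capability = {
    "optypewise": {                     # model-wise defaults per op type
        "Convolution": {"activation": {"dtype": ["int8", "fp32"]},
                        "weight": {"dtype": ["int8", "fp32"]}},
    },
    "opwise": {                         # per-op entries, keyed by (name, type)
        ("conv0", "Convolution"): {"activation": {"dtype": ["int8", "fp32"]}},
    },
}

# A tuning strategy could collect ops that support int8 activations:
int8_ops = [name for (name, _), cfg in capability["opwise"].items()
            if "int8" in cfg["activation"]["dtype"]]
```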

inspect_tensor(nc_model, data_x, op_list=[], iteration_list=[], inspect_type='activation', save_to_disk=False, save_path=None, quantization_cfg=None)

The function is used by the tune strategy class for dumping tensor info.

Parameters:
  • nc_model (object) – The model to run calibration on.

  • data_x (object) – Data iterator/loader.

  • op_list (list) – list of ops whose tensors to inspect.

  • iteration_list (list) – list of iterations at which to inspect.

Returns:

a dict containing the inspected tensors.

Return type:

dict
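A plausible layout for the returned dict — tensors grouped by inspect type, then iteration, then op — is sketched below. This structure is a hypothetical illustration for orientation, not the documented return format.

```python
# Hypothetical layout of the dict returned by inspect_tensor():
# one list entry per inspected iteration, each mapping op name to
# its named output tensors.
dump = {
    "activation": [
        {"conv0": {"conv0_output": [[0.1, 0.2]]}},   # iteration 1
        {"conv0": {"conv0_output": [[0.3, 0.4]]}},   # iteration 2
    ],
}
first_iter = dump["activation"][0]["conv0"]["conv0_output"]
```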

recover_tuned_model(nc_model, q_config)

Execute the recover process on the specified model.

Parameters:
  • nc_model (object) – fp32 model.

  • q_config (dict) – recover configuration.

Returns:

the quantized model

Return type:

MXNetModel

abstract set_tensor(model, tensor_dict)

The function is used by the tune strategy class for setting tensors back into the model.

Parameters:
  • model (object) – The model to set tensors on; usually a quantized model.

  • tensor_dict (dict) – The tensor dict to set. Note that the numpy arrays contain float values; the adaptor layer is responsible for quantizing them to int8 or int32 before setting them into the quantized model, if needed. The dict format is:
    {'weight0_name': numpy.array, 'bias0_name': numpy.array, ...}
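The note above — the adaptor quantizes float tensors before writing them into an already-quantized model — can be illustrated with plain lists standing in for numpy arrays. This is a simplified sketch (everything is clamped to int8 here; real adaptors typically use int32 for biases), and the `scales` argument is an assumption for illustration.

```python
def set_tensor(model_params, tensor_dict, scales):
    """Quantize float tensors to int8 before writing them into the
    quantized model's parameter table (simplified: per-tensor scale,
    no zero point, int8 for all tensors)."""
    for name, values in tensor_dict.items():
        s = scales[name]
        model_params[name] = [max(-128, min(127, round(v / s)))
                              for v in values]

params = {}
set_tensor(params,
           {"weight0_name": [0.5, -0.25], "bias0_name": [1.0]},
           scales={"weight0_name": 0.01, "bias0_name": 0.01})
# params -> {"weight0_name": [50, -25], "bias0_name": [100]}
```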

save(model, path)

The function is used by the tune strategy class for saving the model.

Parameters:
  • model (object) – The model to be saved.

  • path (string) – The path to save the model to.

class neural_compressor.adaptor.mxnet.MXNetQuery(local_config_file)

Bases: neural_compressor.adaptor.query.QueryBackendCapability

Base class that defines the Query Interface. Each adaptor layer should implement the inherited class for its specific backend.

get_version()

Get the current backend’s version string.

get_precisions()

Get the supported low precisions, e.g. ['int8', 'bf16'].

get_op_types()

Get the op types for a specific backend per low precision, e.g. {'1.6.0': {'int8': ['Conv2D', 'fully_connected']}}.
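The nested, version-keyed format shown above is consumed by indexing the backend version first, then the precision:

```python
# Format returned by get_op_types(), per the example above:
# outer key is the backend version, inner key is the precision.
op_types = {"1.6.0": {"int8": ["Conv2D", "fully_connected"]}}
int8_ops = op_types["1.6.0"]["int8"]  # -> ["Conv2D", "fully_connected"]
```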

get_fuse_patterns()

Get the fusion patterns for a specified op type for each precision.

get_quantization_capability()

Get the quantization capability of low-precision op types, e.g. granularity, scheme, etc.

get_mixed_precision_combination()

Get the valid precision combinations based on hardware and user config, e.g. ['fp32', 'bf16', 'int8'].