neural_compressor.adaptor.mxnet_utils.util

MXNet util module.

Module Contents

Classes

OpType

Enum op types.

DataLoaderWrap

DataLoader Wrap.

DataIterLoader

DataIterLoader.

CollectorBase

Collector Base class.

CalibCollector

Collect the calibration thresholds depending on the algorithm set.

TensorCollector

Tensor collector. Builds up the qtensor_to_tensor mapping.

NameCollector

Name collector.

CalibData

Calibration data class.

Functions

isiterable(→ bool)

Checks whether object is iterable.

ensure_list(x)

Ensures that object is a list.

check_mx_version(version)

Checks MXNet version.

combine_capabilities(current, new)

Combine capabilities.

make_nc_model(target, sym_model, ctx, input_desc)

Converts a symbolic model to a Neural Compressor model.

fuse(sym_model, ctx)

Fuse the supplied model.

get_framework_name(ctx)

Get the framework name based on the context and MXNet version.

prepare_model_data(nc_model, ctx, data_x)

Prepares sym_model and dataloader needed for quantization, calibration or running.

prepare_model(nc_model, ctx, input_desc)

Prepare model.

create_data_example(ctx, input_desc)

Create data example by mxnet input description and ctx.

prepare_dataloader(nc_model, ctx, data_x)

Prepare dataloader.

ndarray_to_device(ndarray, device)

Move an ndarray to the given device.

is_model_quantized(sym_model)

Checks whether the model is quantized.

query_quantizable_nodes(sym_model, ctx, dataloader)

Query quantizable nodes of the given model.

quantize_sym_model(sym_model, ctx, qconfig)

Quantizes the symbolic model according to the configuration.

run_forward(sym_model, ctx, dataloader, b_filter[, ...])

Run forward propagation on the model.

make_symbol_block(sym_model, ctx, input_desc)

Convert a symbol model to gluon SymbolBlock.

make_module(sym_model, ctx, input_desc)

Convert a symbol model to Module.

parse_tune_config(tune_cfg, quantizable_nodes)

Convert the strategy config to MXNet quantization config.

distribute_calib_tensors(calib_tensors, calib_cfg, ...)

Distributes the tensors for calibration, depending on the algorithm set in the configuration of their nodes.

calib_model(qsym_model, calib_data, calib_cfg)

Calibrate the quantized symbol model using data gathered by the collector.

amp_convert(sym_model, input_desc, amp_cfg)

Convert the model to support AMP (automatic mixed precision).

class neural_compressor.adaptor.mxnet_utils.util.OpType

Bases: enum.Enum

Enum op types.

neural_compressor.adaptor.mxnet_utils.util.isiterable(obj) bool

Checks whether object is iterable.

Parameters:

obj – object to check.

Returns:

True if object is iterable, else False.

Return type:

boolean
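
As an illustration, a minimal sketch of this kind of check (the actual implementation may differ) simply asks whether `iter()` accepts the object:

```python
def isiterable(obj) -> bool:
    """Return True if obj is iterable, else False."""
    try:
        iter(obj)  # raises TypeError for non-iterables
        return True
    except TypeError:
        return False

print(isiterable([1, 2, 3]))  # True
print(isiterable(42))         # False
```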

neural_compressor.adaptor.mxnet_utils.util.ensure_list(x)

Ensures that object is a list.

Parameters:

x – input.

Returns:

x if x is list, else [x].

Return type:

list
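
A plausible one-line sketch of this helper (illustrative, not the actual source):

```python
def ensure_list(x):
    """Return x unchanged if it is already a list, otherwise wrap it in one."""
    return x if isinstance(x, list) else [x]

print(ensure_list(1))       # [1]
print(ensure_list([1, 2]))  # [1, 2]
```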

neural_compressor.adaptor.mxnet_utils.util.check_mx_version(version)

Checks MXNet version.

Parameters:

version (str) – version to check.

Returns:

True if mx.__version__ >= version, else False.

Return type:

boolean
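
The comparison can be sketched without importing MXNet by comparing numeric version tuples; the real helper compares against `mx.__version__` and may parse versions differently:

```python
def _version_tuple(v: str):
    """Parse the leading numeric components of a version string, e.g. '1.9.1' -> (1, 9, 1)."""
    parts = []
    for piece in v.split("."):
        num = ""
        for ch in piece:
            if ch.isdigit():
                num += ch
            else:
                break  # stop at suffixes like 'rc1'
        if not num:
            break
        parts.append(int(num))
    return tuple(parts)

def check_version(installed: str, required: str) -> bool:
    """True if installed >= required, compared component-wise."""
    return _version_tuple(installed) >= _version_tuple(required)

print(check_version("1.9.1", "1.7.0"))  # True
print(check_version("1.6.0", "1.7.0"))  # False
```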

neural_compressor.adaptor.mxnet_utils.util.combine_capabilities(current, new)

Combine capabilities.

Parameters:
  • current (dict) – current capabilities.

  • new (dict) – new capabilities.

Returns:

contains all capabilities.

Return type:

dict
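
One plausible merge strategy, shown as a sketch (the actual merge semantics in the source may differ), is a recursive dict union in which entries from `new` win on non-dict conflicts:

```python
def combine_capabilities(current: dict, new: dict) -> dict:
    """Recursively merge `new` into a copy of `current`."""
    combined = dict(current)
    for key, value in new.items():
        if isinstance(combined.get(key), dict) and isinstance(value, dict):
            combined[key] = combine_capabilities(combined[key], value)
        else:
            combined[key] = value  # later entry wins on conflicts
    return combined

caps = combine_capabilities({"ops": {"conv": ["int8"]}}, {"ops": {"fc": ["int8"]}})
print(caps)  # {'ops': {'conv': ['int8'], 'fc': ['int8']}}
```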

neural_compressor.adaptor.mxnet_utils.util.make_nc_model(target, sym_model, ctx, input_desc)

Converts a symbolic model to a Neural Compressor model.

Parameters:
  • target (object) – target model type to return.

  • sym_model (tuple) – symbol model (symnet, args, auxs).

  • input_desc (list) – model input data description.

Returns:

converted neural_compressor model.

Return type:

NCModel

neural_compressor.adaptor.mxnet_utils.util.fuse(sym_model, ctx)

Fuse the supplied model.

Parameters:

sym_model (tuple) – symbol model (symnet, args, auxs).

Returns:

fused symbol model (symnet, args, auxs).

Return type:

tuple

neural_compressor.adaptor.mxnet_utils.util.get_framework_name(ctx)

Get the framework name based on the context and MXNet version.

Parameters:

ctx (object) – mxnet context object.

Returns:

framework name.

Return type:

str

neural_compressor.adaptor.mxnet_utils.util.prepare_model_data(nc_model, ctx, data_x)

Prepares sym_model and dataloader needed for quantization, calibration or running.

Parameters:
  • nc_model (object) – model to prepare.

  • data_x (object) – data iterator/loader to prepare.

Returns:

symbol model (symnet, args, auxs) and DataLoaderWrap.

Return type:

tuple

neural_compressor.adaptor.mxnet_utils.util.prepare_model(nc_model, ctx, input_desc)

Prepare model.

Parameters:
  • nc_model (object) – model to prepare.

  • ctx (object) – mxnet context object.

  • input_desc (list) – list of MXNet input data descriptions.

Returns:

mxnet model (symnet, args, auxs).

Return type:

object

neural_compressor.adaptor.mxnet_utils.util.create_data_example(ctx, input_desc)

Create data example by mxnet input description and ctx.

Parameters:
  • ctx (object) – mxnet context object.

  • input_desc (list) – list of MXNet input data descriptions.

Returns:

data example.

Return type:

list

neural_compressor.adaptor.mxnet_utils.util.prepare_dataloader(nc_model, ctx, data_x)

Prepare dataloader.

Parameters:
  • nc_model (object) – model to prepare.

  • ctx (object) – mxnet context object.

  • data_x (object) – mxnet io iterable object or dataloader object.

Returns:

dataloader.

Return type:

object

neural_compressor.adaptor.mxnet_utils.util.ndarray_to_device(ndarray, device)

Move an ndarray to the given device.

Parameters:
  • ndarray (ndarray) – ndarray to move.

  • device (object) – mxnet device object.

Returns:

ndarray on the device.

Return type:

ndarray

neural_compressor.adaptor.mxnet_utils.util.is_model_quantized(sym_model)

Checks whether the model is quantized.

Parameters:

sym_model (tuple) – symbol model (symnet, args, auxs).

Returns:

True if model is quantized, else False.

Return type:

boolean

neural_compressor.adaptor.mxnet_utils.util.query_quantizable_nodes(sym_model, ctx, dataloader)

Query quantizable nodes of the given model.

Parameters:

sym_model (tuple) – symbol model (symnet, args, auxs) to query.

Returns:

quantizable nodes of the given model. dict: tensor-to-node mapping.

Return type:

list

neural_compressor.adaptor.mxnet_utils.util.quantize_sym_model(sym_model, ctx, qconfig)

Quantizes the symbolic model according to the configuration.

Parameters:
  • sym_model (tuple) – symbol model (symnet, args, auxs).

  • qconfig (dict) – quantization configuration.

Returns:

Symbol model (symnet, args, auxs) and list of tensors for calibration.

Return type:

tuple

neural_compressor.adaptor.mxnet_utils.util.run_forward(sym_model, ctx, dataloader, b_filter, collector=None, pre_batch=None, post_batch=None)

Run forward propagation on the model.

Parameters:
  • sym_model (tuple) – symbol model (symnet, args, auxs).

  • dataloader (DataLoaderWrap) – data loader.

  • b_filter (generator) – filter on which batches to run inference on.

  • collector (object) – collects information during inference.

  • pre_batch – function to call prior to batch inference.

  • post_batch – function to call after batch inference.

Returns:

batch count.

Return type:

int
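
The `b_filter` argument is a generator of booleans, one per batch, selecting which batches to run. A hypothetical filter that keeps only the first `n` batches could look like:

```python
import itertools

def first_n_batches(n):
    """Yield True for the first n batches, then False forever."""
    yield from itertools.repeat(True, n)
    yield from itertools.repeat(False)

flags = first_n_batches(3)
print([next(flags) for _ in range(5)])  # [True, True, True, False, False]
```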

neural_compressor.adaptor.mxnet_utils.util.make_symbol_block(sym_model, ctx, input_desc)

Convert a symbol model to gluon SymbolBlock.

Parameters:
  • sym_model (tuple) – symbol model (symnet, args, auxs).

  • input_desc (list) – model input data description.

Returns:

SymbolBlock model.

Return type:

mx.gluon.SymbolBlock

neural_compressor.adaptor.mxnet_utils.util.make_module(sym_model, ctx, input_desc)

Convert a symbol model to Module.

Parameters:
  • sym_model (tuple) – symbol model (symnet, args, auxs).

  • input_desc (list) – model input data description.

Returns:

Module model.

Return type:

mx.module.Module

neural_compressor.adaptor.mxnet_utils.util.parse_tune_config(tune_cfg, quantizable_nodes)

Convert the strategy config to MXNet quantization config.

Parameters:
  • tune_cfg (dict) – tune config from neural_compressor strategy.

  • quantizable_nodes (list) – quantizable nodes in the model.

Returns:

quantization configuration. dict: calibration configuration.

Return type:

dict

neural_compressor.adaptor.mxnet_utils.util.distribute_calib_tensors(calib_tensors, calib_cfg, tensor_to_node)

Distributes the tensors for calibration, depending on the algorithm set in the configuration of their nodes.

Parameters:
  • calib_tensors – tensors to distribute.

  • calib_cfg (dict) – calibration configuration.

  • tensor_to_node (dict) – tensor to node mapping.

Returns:

kl tensors and minmax tensors.

Return type:

tuple
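
A hedged sketch of the distribution logic; the exact layout of `calib_cfg` (per-node `'algorithm'` entries) is an assumption for illustration:

```python
def distribute_calib_tensors(calib_tensors, calib_cfg, tensor_to_node):
    """Split tensors into (kl_tensors, minmax_tensors) by their node's algorithm."""
    kl, minmax = [], []
    for tensor in calib_tensors:
        node = tensor_to_node[tensor]  # map the tensor back to its node
        algo = calib_cfg.get(node, {}).get("algorithm", "minmax")
        (kl if algo == "kl" else minmax).append(tensor)
    return kl, minmax

cfg = {"conv0": {"algorithm": "kl"}, "fc0": {"algorithm": "minmax"}}
mapping = {"conv0_output": "conv0", "fc0_output": "fc0"}
print(distribute_calib_tensors(["conv0_output", "fc0_output"], cfg, mapping))
# (['conv0_output'], ['fc0_output'])
```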

neural_compressor.adaptor.mxnet_utils.util.calib_model(qsym_model, calib_data, calib_cfg)

Calibrate the quantized symbol model using data gathered by the collector.

Parameters:
  • qsym_model (tuple) – quantized symbol model (symnet, args, auxs).

  • calib_data (CalibData) – data needed for calibration (thresholds).

  • calib_cfg (dict) – calibration configuration.

Returns:

quantized calibrated symbol model (symnet, args, auxs).

Return type:

tuple

neural_compressor.adaptor.mxnet_utils.util.amp_convert(sym_model, input_desc, amp_cfg)

Convert the model to support AMP (automatic mixed precision).

class neural_compressor.adaptor.mxnet_utils.util.DataLoaderWrap(dataloader, input_desc)

DataLoader Wrap.

class neural_compressor.adaptor.mxnet_utils.util.DataIterLoader(data_iter)

DataIterLoader.

class neural_compressor.adaptor.mxnet_utils.util.CollectorBase

Collector Base class.

abstract collect_gluon(name, _, arr)

Collect using the Gluon API.

collect_module(name, arr)

Collect by module name.

pre_batch(m, b)

Function to call prior to batch inference.

post_batch(m, b, o)

Function to call after batch inference.

class neural_compressor.adaptor.mxnet_utils.util.CalibCollector(include_tensors_kl, include_tensors_minmax, num_bins=8001)

Bases: CollectorBase

Collect the calibration thresholds depending on the algorithm set.

collect_gluon(name, _, arr)

Collect using the Gluon API.

calc_kl_th_dict(quantized_dtype)

Calculate KL thresholds.

class neural_compressor.adaptor.mxnet_utils.util.TensorCollector(include_nodes, qtensor_to_tensor, tensor_to_node)

Bases: CollectorBase

Tensor collector. Builds up the qtensor_to_tensor mapping.

collect_gluon(name, _, arr)

Collect using the Gluon API.

pre_batch(m, b)

Preprocess.

class neural_compressor.adaptor.mxnet_utils.util.NameCollector

Bases: CollectorBase

Name collector.

collect_gluon(name, _, arr)

Collect using the Gluon API.

class neural_compressor.adaptor.mxnet_utils.util.CalibData(cache_kl={}, cache_minmax={}, tensors_kl=[], tensors_minmax=[])

Calibration data class.

property min_max_dict

Return min-max dict.

post_collect()

Return min-max dict for MXNet version >= 2.0.0.