neural_compressor.adaptor.adaptor

Module Contents

Classes

Adaptor

The base class of framework adaptor layer.

Functions

adaptor_registry(cls)

The class decorator used to register all Adaptor subclasses.

neural_compressor.adaptor.adaptor.adaptor_registry(cls)

The class decorator used to register all Adaptor subclasses.

Parameters:

cls (class) – The Adaptor subclass to register.
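As a hedged illustration of the decorator pattern described above, a minimal class registry could look like the sketch below. The `REGISTRY` dict, the lowercased-name key, and the `TensorFlowAdaptor` class are illustrative stand-ins, not the library's actual internals:

```python
# Minimal sketch of a class-registration decorator; illustrative only.
REGISTRY = {}

def adaptor_registry(cls):
    """Register an Adaptor subclass under its lowercased class name."""
    name = cls.__name__.lower()
    if name in REGISTRY:
        raise ValueError(f"Adaptor {cls.__name__} is already registered")
    REGISTRY[name] = cls
    return cls

@adaptor_registry
class TensorFlowAdaptor:
    def __init__(self, framework_specific_info):
        self.info = framework_specific_info

# A framework name can now be resolved to its adaptor class.
adaptor_cls = REGISTRY["tensorflowadaptor"]
```

Registering at import time via a decorator lets the tuning layer look adaptors up by name without hard-coding each framework.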

class neural_compressor.adaptor.adaptor.Adaptor(framework_specific_info)

Bases: object

The base class of framework adaptor layer.

abstract quantize(tune_cfg, model, dataloader, q_func=None)

The function is used to do calibration and quantization in post-training quantization.

Parameters:
  • tune_cfg (dict) – The chosen tuning configuration.

  • model (object) – The model to be calibrated and quantized.

  • dataloader (object) – The dataloader used to load calibration dataset.

  • q_func (optional) – The training function for quantization-aware training mode.
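To make the calibrate-then-quantize contract above concrete, here is a toy, framework-free sketch of what an implementation does in each mode. The list-of-floats "model", the single calibrated scale, and the int8 clamping are all illustrative assumptions, not the library's actual code:

```python
# Toy sketch of quantize(): delegate to q_func in QAT mode, otherwise
# calibrate on the dataloader and quantize weights to int8.
def quantize(tune_cfg, model, dataloader, q_func=None):
    if q_func is not None:
        # Quantization-aware training mode: hand control to the
        # user-supplied training function.
        return q_func(model)
    # Post-training mode: calibrate to find the dynamic range.
    max_abs = 0.0
    for batch in dataloader:
        max_abs = max(max_abs, max(abs(x) for x in batch))
    scale = max_abs / 127.0 if max_abs else 1.0
    # Quantize each float weight to a clamped int8 value.
    q_model = [max(-128, min(127, round(w / scale))) for w in model]
    return q_model, scale

q_model, scale = quantize({}, [0.5, -1.0], [[2.0, -3.0]])
```

A real adaptor derives per-tensor or per-channel scales from the chosen tune_cfg rather than one global scale, but the control flow is the same.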

abstract evaluate(model, dataloader, postprocess=None, metric=None, measurer=None, iteration=-1, tensorboard=False)

The function is used to run evaluation on the validation dataset.

Parameters:
  • model (object) – The model to evaluate.

  • dataloader (generator) – A generator that yields data and labels.

  • postprocess (object, optional) – Processes the result from the model. Defaults to None.

  • metric (object, optional) – The evaluation metric; depends on the model category. Defaults to None.

  • measurer (object, optional) – Used for precise benchmark measurement. Defaults to None.

  • iteration (int, optional) – Controls the number of mini-batch steps to run. Defaults to -1.

  • tensorboard (boolean, optional) – Whether to dump tensors for TensorBoard inspection. Defaults to False.

abstract query_fw_capability(model)

The function is used to return framework tuning capability.

Parameters:

model (object) – The model to query quantization tuning capability.

abstract query_fused_patterns(model)

The function is used to get the fused patterns supported by the framework.

Parameters:

model (object) – The model to query fused patterns from.

Returns:

[['conv', 'relu'], ['conv', 'relu', 'bn']]
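A tuning strategy could consume a pattern list of the shape shown above like this; the matching helper below is an illustrative assumption, not part of the API:

```python
# Fused patterns supported by a hypothetical framework, in the
# documented return shape.
patterns = [["conv", "relu"], ["conv", "relu", "bn"]]

def longest_fusion(ops, patterns):
    """Return the longest supported pattern that prefixes `ops`."""
    best = []
    for p in patterns:
        if ops[: len(p)] == p and len(p) > len(best):
            best = p
    return best

# ["conv", "relu", "bn"] wins over the shorter ["conv", "relu"] prefix.
best = longest_fusion(["conv", "relu", "bn", "pool"], patterns)
```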

abstract inspect_tensor(model, dataloader, op_list=[], iteration_list=[], inspect_type='activation', save_to_disk=False)

The function is used by the tune strategy class for dumping tensor info.

Parameters:
  • model (object) – The model to inspect.

  • dataloader (object) – The dataloader used to feed data into the model.

  • op_list (list) – The op names in the FP32 model to dump.

  • iteration_list (list) – The iteration list containing iterations to dump.

  • inspect_type (str) – Valid values are 'weight', 'activation', and 'all'.

  • save_to_disk (bool) – Whether to save dumped tensors to disk instead of keeping them in memory.

Returns:

A dict of numpy arrays:

{
    'weight': {
        'node0_name': {'weight0_name': numpy.array, 'bias0_name': numpy.array, ...},
        'node1_name': {'weight1_name': numpy.array, 'bias1_name': numpy.array, ...},
        ...
    },
    'activation': [
        # iter 0
        {
            'node0_name': {'output0_name': numpy.array, 'output1_name': numpy.array, ...},
            'node1_name': {'output0_name': numpy.array, 'output1_name': numpy.array, ...},
            ...
        },
        # iter 1
        ...
    ]
}
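Walking the returned structure could look like the sketch below; the node and tensor names are illustrative, and plain lists stand in for numpy arrays:

```python
# A literal dict mirroring the documented return shape of
# inspect_tensor(); names and values are made up for illustration.
dump = {
    "weight": {
        "node0_name": {"weight0_name": [0.1, 0.2], "bias0_name": [0.0]},
    },
    "activation": [
        # iter 0
        {"node0_name": {"output0_name": [1.0, 2.0]}},
    ],
}

# Weights are keyed node -> tensor name, with no iteration dimension.
weight_tensors = [
    (node, name)
    for node, tensors in dump["weight"].items()
    for name in tensors
]

# Activations add an outer list, one entry per dumped iteration.
activation_tensors = [
    (it, node, name)
    for it, nodes in enumerate(dump["activation"])
    for node, outputs in nodes.items()
    for name in outputs
]
```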

abstract set_tensor(model, tensor_dict)

The function is used by the tune strategy class to set tensors back into the model.

Parameters:
  • model (object) – The model to set tensor. Usually it is quantized model.

  • tensor_dict (dict) –

    The tensor dict to set. Note that the numpy arrays contain float values; the adaptor layer is responsible for quantizing them to int8 or int32 before setting them into the quantized model if needed. The dict format is:

    {
        'weight0_name': numpy.array, 'bias0_name': numpy.array, ...
    }
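A minimal sketch of the quantization responsibility noted above, assuming a toy model dict with per-tensor scales and plain lists in place of numpy arrays (a real adaptor works on framework models, and the name-based bias detection is an illustrative shortcut):

```python
# Toy set_tensor(): quantize incoming float tensors before writing
# them into the already-quantized model. All structures are made up.
def set_tensor(model, tensor_dict):
    for name, values in tensor_dict.items():
        scale = model["scales"][name]  # assumed per-tensor scale lookup
        if name.startswith("bias"):
            # Biases are commonly stored as int32 (no int8 clamp).
            q = [int(round(v / scale)) for v in values]
        else:
            # Weights as int8, clamped to the representable range.
            q = [max(-128, min(127, round(v / scale))) for v in values]
        model["tensors"][name] = q
    return model

model = {
    "scales": {"weight0_name": 0.1, "bias0_name": 0.01},
    "tensors": {},
}
set_tensor(model, {"weight0_name": [1.27, -0.5], "bias0_name": [0.02]})
```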

quantize_input(model)

Quantize the model so that it can accept quantized input.

Parameters:

model (object) – The model whose input is to be quantized.

Returns:

The model accepting quantized input, and scale (float): the scale for the dataloader to generate quantized input.

Return type:

model (object)

abstract save(model, path)

The function is used by the tune strategy class for saving the model.

Parameters:
  • model (object) – The model to be saved.

  • path (string) – The path to save the model to.

abstract convert(model, source, destination)

The function is used to convert a model from a source format to a destination format.

Parameters:
  • model (neural_compressor.model) – base model to be converted.

  • source (string) – The source model format.

  • destination (string) – The destination model format.