neural_compressor.adaptor.keras

Module Contents

Classes

KerasAdaptor

The Keras class of the framework adaptor layer.

KerasQuery

Base class that defines Query Interface.

class neural_compressor.adaptor.keras.KerasAdaptor(framework_specific_info)

Bases: neural_compressor.adaptor.adaptor.Adaptor

The Keras class of the framework adaptor layer.

quantize(tune_cfg, model, dataloader, q_func=None)

Execute the quantize process on the specified model.

Parameters:
  • tune_cfg (dict) – The chosen tuning configuration.

  • model (object) – The model to quantize.

  • dataloader (object) – The dataloader used to load the quantization dataset.

  • q_func (optional) – Training function for quantization-aware training mode.

evaluate(model, dataloader, postprocess=None, metrics=None, measurer=None, iteration=-1, tensorboard=False, fp32_baseline=False)

The function is used to run evaluation on the validation dataset.

Parameters:
  • model (object) – The model to evaluate.

  • dataloader (generator) – Generates the data and labels.

  • postprocess (object, optional) – Processes the result from the model.

  • metrics (object, optional) – Depends on model category. Defaults to None.

  • measurer (object, optional) – For precise benchmark measurement.

  • iteration (int, optional) – Controls the number of mini-batch steps.

  • tensorboard (boolean, optional) – For inspecting tensors with TensorBoard.

  • fp32_baseline (boolean, optional) – Only for the compare_label=False pipeline.
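As a rough illustration of how an iteration parameter like this can bound an evaluation loop, the sketch below caps the number of batches drawn from a dataloader. The helper name limited_batches and the reading of a negative value as "no limit" are assumptions made for this example; they are not taken from the KerasAdaptor source.

```python
from itertools import islice


def limited_batches(dataloader, iteration=-1):
    """Yield at most `iteration` batches from `dataloader`.

    Assumption for this sketch: a negative value means "use every batch".
    """
    if iteration < 0:
        yield from dataloader
    else:
        yield from islice(dataloader, iteration)


# Hypothetical (data, label) batches standing in for a real dataloader.
batches = [([1, 2], [0, 1]), ([3, 4], [1, 0]), ([5, 6], [0, 0])]

first_two = list(limited_batches(batches, iteration=2))
all_batches = list(limited_batches(batches))
```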

query_fw_capability(model)

The function is used to return framework tuning capability.

Parameters:

model (object) – The model to query quantization tuning capability.

get_optype_wise_ability(quantizable_op_details)

Get the op type wise capability by generating the union value of each op type.

Returns:

The key is the op type, while the value is the detailed configuration of activation and weight for this op type.

Return type:

[string dict]
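The "union value of each op type" can be pictured with a small sketch. The input layout used here, a dict keyed by (op_name, op_type) whose values hold 'activation' and 'weight' settings as lists of allowed values, is an assumption for illustration only, as is the function name optype_wise_union; the real structure of quantizable_op_details may differ.

```python
def optype_wise_union(quantizable_op_details):
    """Merge per-op capabilities into one entry per op type.

    Assumed (hypothetical) input layout:
        {(op_name, op_type): {"activation": {...}, "weight": {...}}}
    where each inner dict maps a setting name to a list of allowed values.
    """
    merged = {}
    for (_op_name, op_type), detail in quantizable_op_details.items():
        type_entry = merged.setdefault(op_type, {})
        for tensor_kind, settings in detail.items():
            kind_entry = type_entry.setdefault(tensor_kind, {})
            for key, values in settings.items():
                seen = kind_entry.setdefault(key, [])
                # Keep first-seen order while dropping duplicates.
                for value in values:
                    if value not in seen:
                        seen.append(value)
    return merged


# Made-up example data, not a real Keras capability listing.
example = {
    ("dense_1", "Dense"): {"activation": {"dtype": ["int8"]}, "weight": {"dtype": ["int8"]}},
    ("dense_2", "Dense"): {"activation": {"dtype": ["int8", "fp32"]}, "weight": {"dtype": ["int8"]}},
    ("conv_1", "Conv2D"): {"activation": {"dtype": ["int8"]}, "weight": {"dtype": ["int8"]}},
}
ability = optype_wise_union(example)
```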

inspect_tensor(model, dataloader, op_list=[], iteration_list=[], inspect_type='activation', save_to_disk=False)

The function is used by tune strategy class for dumping tensor info.

Parameters:
  • model (object) – The model to inspect.

  • dataloader (object) – The dataloader used to feed data into the model.

  • op_list (list) – The op names in the fp32 model to dump.

  • iteration_list (list) – The iteration list containing iterations to dump.

  • inspect_type (str) – Valid values are 'weight', 'activation', and 'all'.

  • save_to_disk (bool) – Save to disk or memory.

Returns:

Numpy Array Dict

    {
      'weight': {
        'node0_name': {'weight0_name': numpy.array, 'bias0_name': numpy.array, …},
        'node1_name': {'weight1_name': numpy.array, 'bias1_name': numpy.array, …},
        …
      },
      'activation': [
        # iter 0
        {
          'node0_name': {'output0_name': numpy.array, 'output1_name': numpy.array, …},
          'node1_name': {'output1_name': numpy.array, 'output1_name': numpy.array, …},
          …
        },
        # iter 1
        …
      ]
    }
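To make the nesting of the returned dict easier to follow, here is a minimal sketch that walks a result of that shape and lists the tensor names it contains. The helper summarize_inspection is hypothetical, and plain Python lists stand in for numpy arrays so the example has no dependencies.

```python
def summarize_inspection(result):
    """Return (weight tensor names, activation output names per iteration)."""
    weight_names = sorted(
        f"{node}/{tensor}"
        for node, tensors in result.get("weight", {}).items()
        for tensor in tensors
    )
    activation_names = [
        sorted(f"{node}/{out}" for node, outs in iteration.items() for out in outs)
        for iteration in result.get("activation", [])
    ]
    return weight_names, activation_names


# Made-up result following the documented layout (lists replace numpy arrays).
result = {
    "weight": {"node0_name": {"weight0_name": [[0.1, -0.2]], "bias0_name": [0.0]}},
    "activation": [
        {"node0_name": {"output0_name": [0.3, 0.7]}},  # iter 0
        {"node0_name": {"output0_name": [0.4, 0.6]}},  # iter 1
    ],
}
weights, activations = summarize_inspection(result)
```

The 'weight' entry is a single dict keyed by node, while 'activation' is a list with one such dict per dumped iteration, which is why the second return value is a list of lists.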

set_tensor(model, tensor_dict)

The function is used by tune strategy class for setting tensors back into the model.

Parameters:
  • model (object) – The model to set the tensors in. Usually it is the quantized model.

  • tensor_dict (dict) – The tensor dict to set. Note that the numpy arrays contain float values; the adaptor layer is responsible for quantizing them to int8 or int32 before setting them into the quantized model, if needed. The dict format is something like:

    {
      'weight0_name': numpy.array,
      'bias0_name': numpy.array,
      …
    }
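The float-to-int8 step mentioned above might look like the following sketch. It uses symmetric per-tensor quantization, which is one common scheme and is chosen here only for illustration; the scheme actually used by the adaptor is not stated in this document. The function name quantize_to_int8 is hypothetical, plain lists stand in for numpy arrays, and the int32 case is not covered.

```python
def quantize_to_int8(values, scale=None):
    """Symmetric per-tensor int8 quantization of a flat list of floats.

    If no scale is given, it is derived from the largest absolute value so
    that this value maps to 127 (an assumption of this sketch).
    """
    if scale is None:
        max_abs = max((abs(v) for v in values), default=0.0)
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


# Made-up float tensors in the documented tensor_dict layout.
tensor_dict = {"weight0_name": [0.4, -1.0, 0.25], "bias0_name": [0.0, 0.1]}
quantized = {name: quantize_to_int8(vals) for name, vals in tensor_dict.items()}
```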

quantize_input(model)

Quantize the model so that it can take quantized input.

Parameters:

model (object) – The model whose input is to be quantized.

Returns:

  • model (object) – The quantized input model.

  • scale (float) – The scale for the dataloader to generate quantized input.

save(model, path)

The function is used by tune strategy class for saving model.

Parameters:
  • model (object) – The model to save.

  • path (string) – The path to save the model to.

convert(model, source, destinatin)

The function is used to convert a source model format to another.

Parameters:
  • model (neural_compressor.model) – The base model to be converted.

  • source (string) – The source model format.

  • destination (string) – The destination model format.

class neural_compressor.adaptor.keras.KerasQuery(local_config_file=None)

Bases: neural_compressor.adaptor.query.QueryBackendCapability

Base class that defines Query Interface. Each adaptor layer should implement its own subclass for its specific backend.

get_version()

Get the current backend version information.

Returns:

The version string.

Return type:

[string]

get_precisions()

Get supported precisions for current backend.

Returns:

The names of the supported precisions.

Return type:

[string list]

get_op_types()

Get the supported op types by all precisions.

Returns:

A list of dictionaries in which the key is the precision and the value is the op types.

Return type:

[dictionary list]

get_quantization_capability()

Get the supported op types’ quantization capability.

Returns:

A list of dictionaries in which the key is the precision and the value is a dict that describes all op types' quantization capability.

Return type:

[dictionary list]

get_op_types_by_precision(precision)

Get op types per precision.

Parameters:

precision (string) – The precision name.

Returns:

A list of op types.

Return type:

[string list]
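The relationship between get_op_types() and get_op_types_by_precision() can be sketched as a lookup over the documented return shape: a list of dictionaries keyed by precision. The function below and its sample data are hypothetical and do not reflect the real capability of any Keras backend version or the way KerasQuery is implemented.

```python
def op_types_by_precision(op_types, precision):
    """Collect the op types listed under `precision` across all dictionaries."""
    found = []
    for entry in op_types:
        for op_type in entry.get(precision, []):
            if op_type not in found:
                found.append(op_type)
    return found


# Made-up data in the shape described for get_op_types().
op_types = [{"int8": ["Conv2D", "Dense"]}, {"fp32": ["*"]}]

int8_ops = op_types_by_precision(op_types, "int8")
unknown_ops = op_types_by_precision(op_types, "bf16")
```

Returning an empty list for an unknown precision is a design choice of this sketch; a real backend query could equally raise an error.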