neural_compressor.adaptor.torch_utils.model_wrapper

torch.nn.Module wrapper class definitions.

Module Contents

Classes

FakeAffineTensorQuantFunction

Fake version of affine quantization.

TEQLinearFakeQuant

Fake-quantization wrapper for a linear layer.

MulLinear

Linear-layer wrapper that applies a scale to the input.

class neural_compressor.adaptor.torch_utils.model_wrapper.FakeAffineTensorQuantFunction

Fake version of affine quantization.
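To make "fake" affine quantization concrete, the following is a minimal sketch, not the library's implementation. It assumes the usual meaning of the term: values are rounded onto an integer grid and immediately dequantized, so the tensor stays in floating point, and gradients are passed straight through the rounding step. The names fake_affine_quant and FakeAffineQuantSTE are hypothetical, as is the per-row asymmetric scheme chosen here.

```python
import torch


def fake_affine_quant(weight: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Quantize each row to an unsigned num_bits grid, then dequantize (assumed scheme)."""
    qmax = 2 ** num_bits - 1
    w_min = weight.amin(dim=1, keepdim=True)
    w_max = weight.amax(dim=1, keepdim=True)
    # Guard against a zero range, e.g. a constant row.
    scale = (w_max - w_min).clamp(min=1e-5) / qmax
    zero_point = torch.round(-w_min / scale)
    q = torch.clamp(torch.round(weight / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale


class FakeAffineQuantSTE(torch.autograd.Function):
    """Hypothetical stand-in: fake-quantize in forward, identity gradient in backward."""

    @staticmethod
    def forward(ctx, weight, num_bits):
        return fake_affine_quant(weight, num_bits)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: rounding is treated as identity for gradients.
        # The second return value corresponds to num_bits, which has no gradient.
        return grad_output, None


torch.manual_seed(0)
w = torch.randn(3, 8, requires_grad=True)
w_q = FakeAffineQuantSTE.apply(w, 4)
w_q.sum().backward()
```

In this sketch each row of w_q takes at most 2**num_bits distinct values, while w.grad is the unmodified upstream gradient, which is what allows a quantized forward pass to be used during training.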

class neural_compressor.adaptor.torch_utils.model_wrapper.TEQLinearFakeQuant(orig_layer, alpha=None, num_bits=4, group_size=-1, scheme='asym')

Fake-quantization wrapper for a linear layer.
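The sketch below illustrates how a wrapper with this signature could behave; it is an assumption-laden illustration, not the library class. It assumes that alpha is a per-input-channel scale applied as an equivalent transformation (the input is divided by alpha and the weight multiplied by it), that group_size=-1 means one quantization group per weight row, and that only the 'asym' scheme is needed. LinearFakeQuantSketch and fake_quant are hypothetical names.

```python
import torch
import torch.nn.functional as F


def fake_quant(weight, num_bits=4, group_size=-1):
    """Asymmetric quantize-dequantize; group_size=-1 is assumed to mean one group per row."""
    out_features, in_features = weight.shape
    if group_size == -1:
        group_size = in_features
    assert in_features % group_size == 0, "sketch requires in_features divisible by group_size"
    w = weight.reshape(-1, group_size)
    qmax = 2 ** num_bits - 1
    w_min = w.amin(dim=1, keepdim=True)
    w_max = w.amax(dim=1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / qmax
    zero_point = torch.round(-w_min / scale)
    q = torch.clamp(torch.round(w / scale) + zero_point, 0, qmax)
    w_dq = (q - zero_point) * scale
    # Straight-through estimator: forward value is w_dq, gradient flows as identity.
    w_ste = w + (w_dq - w).detach()
    return w_ste.reshape(out_features, in_features)


class LinearFakeQuantSketch(torch.nn.Module):
    """Hypothetical illustration of a fake-quantized linear wrapper."""

    def __init__(self, orig_layer, alpha=None, num_bits=4, group_size=-1):
        super().__init__()
        self.orig_layer = orig_layer
        self.alpha = alpha  # assumed: per-input-channel scale, shape (in_features,)
        self.num_bits = num_bits
        self.group_size = group_size

    def forward(self, x):
        weight = self.orig_layer.weight
        if self.alpha is not None:
            # Without quantization, (x / alpha) @ (W * alpha).T equals x @ W.T.
            x = x / self.alpha
            weight = weight * self.alpha.unsqueeze(0)
        weight_q = fake_quant(weight, self.num_bits, self.group_size)
        return F.linear(x, weight_q, self.orig_layer.bias)


torch.manual_seed(0)
layer = torch.nn.Linear(16, 4)
alpha = torch.nn.Parameter(torch.ones(16))
wrapped = LinearFakeQuantSketch(layer, alpha=alpha, num_bits=4, group_size=8)
x = torch.randn(2, 16)
y = wrapped(x)
y.sum().backward()
```

Because the scale is folded into both the input and the weight, changing alpha only affects the output through the quantization error, so under these assumptions alpha can be trained to reduce that error.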

class neural_compressor.adaptor.torch_utils.model_wrapper.MulLinear(module, input_scale=None)

Linear-layer wrapper that applies a scale to the input.
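As a rough illustration of the idea, a wrapper of this kind can multiply the input by a per-channel scale before calling the wrapped linear layer. The sketch below is not the library class: ScaledInputLinear is a hypothetical name, and treating a missing input_scale as identity scaling is an assumption made only for this example.

```python
import torch


class ScaledInputLinear(torch.nn.Module):
    """Hypothetical stand-in for a linear wrapper that scales its input."""

    def __init__(self, module, input_scale=None):
        super().__init__()
        if input_scale is None:
            # Assumption for this sketch: no scale means identity scaling.
            input_scale = torch.ones(module.in_features)
        # A buffer moves with the module across devices but is not trained.
        self.register_buffer("input_scale", input_scale)
        self.linear = module

    def forward(self, x):
        return self.linear(x * self.input_scale)


torch.manual_seed(0)
linear = torch.nn.Linear(4, 2)
scale = torch.tensor([0.5, 1.0, 2.0, 4.0])
wrapped = ScaledInputLinear(linear, input_scale=scale)
x = torch.randn(3, 4)
y = wrapped(x)

# Scaling the input channels is equivalent to scaling the weight columns.
folded = torch.nn.functional.linear(x, linear.weight * scale, linear.bias)
```

Here y and folded agree up to floating-point error. Keeping the scale as a separate multiplication is useful when the weight has already been quantized and the scale can no longer be folded into it exactly.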