neural_compressor.torch.algorithms.layer_wise.utils

Utilities for layer-wise quantization.

Module Contents

Functions

get_module(model, key)

Get a module from the model by its key name.

get_children(model)

Get all the children of the given model.

get_named_children(model[, pre])

Get the names and children of the given model.

dowload_hf_model(repo_id[, cache_dir, repo_type, revision])

Download a Hugging Face model from the HF Hub.

load_empty_model(pretrained_model_name_or_path[, cls])

Load an empty model.

get_super_module_by_name(model, module_name)

Get the parent module of the child module with the given name.

update_module(model, module_name, new_module)

Replace the named submodule of the model with a new module.

load_layer_wise_quantized_model(path)

Load a layer-wise quantized model.

load_tensor_from_shard(pretrained_model_name_or_path, ...)

Load a tensor from a sharded checkpoint.

load_tensor(path[, tensor_name, prefix])

Load a tensor from a bin file by the given tensor name.

neural_compressor.torch.algorithms.layer_wise.utils.get_module(model, key)[source]

Get a module from the model by its key name.

Parameters:
  • model (torch.nn.Module) – the original model

  • key (str) – name of the module to retrieve

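The lookup amounts to walking the dotted key one attribute at a time. A minimal sketch, using `SimpleNamespace` objects as a stand-in for a nested `torch.nn.Module` hierarchy (the library's own implementation may differ in details):

```python
from types import SimpleNamespace

def get_module(model, key):
    # Walk the dotted key, e.g. "encoder.layer0.linear", one attribute
    # at a time; submodules of a torch.nn.Module are reachable the same
    # way via getattr. Illustrative re-implementation, not the library's.
    module = model
    for name in key.split("."):
        module = getattr(module, name)
    return module

# Stand-in for a nested module hierarchy.
linear = SimpleNamespace(weight="W")
model = SimpleNamespace(encoder=SimpleNamespace(layer0=SimpleNamespace(linear=linear)))
assert get_module(model, "encoder.layer0.linear") is linear
```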
neural_compressor.torch.algorithms.layer_wise.utils.get_children(model)[source]

Get all the children of the given model.

neural_compressor.torch.algorithms.layer_wise.utils.get_named_children(model, pre=[])[source]

Get the names and children of the given model.

neural_compressor.torch.algorithms.layer_wise.utils.dowload_hf_model(repo_id, cache_dir=None, repo_type=None, revision=None)[source]

Download a Hugging Face model from the HF Hub.

neural_compressor.torch.algorithms.layer_wise.utils.load_empty_model(pretrained_model_name_or_path, cls=AutoModelForCausalLM, **kwargs)[source]

Load an empty model.

neural_compressor.torch.algorithms.layer_wise.utils.get_super_module_by_name(model, module_name)[source]

Get the parent module of the child module with the given name.

neural_compressor.torch.algorithms.layer_wise.utils.update_module(model, module_name, new_module)[source]

Replace the named submodule of the model with a new module.

neural_compressor.torch.algorithms.layer_wise.utils.load_layer_wise_quantized_model(path)[source]

Load a layer-wise quantized model.

neural_compressor.torch.algorithms.layer_wise.utils.load_tensor_from_shard(pretrained_model_name_or_path, tensor_name, prefix=None)[source]

Load a tensor from a sharded checkpoint.

neural_compressor.torch.algorithms.layer_wise.utils.load_tensor(path, tensor_name=None, prefix=None)[source]

Load a tensor from a bin file by the given tensor name.
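For an unsharded checkpoint the lookup is a single file read: deserialize the state dict and index it by the (optionally prefixed) tensor name. A sketch with `pickle` standing in for `torch.load` on a `.bin` file, so the example stays self-contained:

```python
import os
import pickle
import tempfile

def load_tensor(path, tensor_name=None, prefix=None):
    # Load the whole state dict when no name is given; otherwise return
    # the single entry, prepending `prefix` to the name if provided.
    with open(path, "rb") as f:
        state_dict = pickle.load(f)
    if tensor_name is None:
        return state_dict
    key = f"{prefix}.{tensor_name}" if prefix else tensor_name
    return state_dict[key]

# Round-trip a toy "state dict".
path = os.path.join(tempfile.mkdtemp(), "pytorch_model.bin")
with open(path, "wb") as f:
    pickle.dump({"model.fc.weight": [1.0, 2.0]}, f)
assert load_tensor(path, "fc.weight", prefix="model") == [1.0, 2.0]
```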