neural_compressor.torch.algorithms.layer_wise.utils

Utilities for layer-wise quantization.

Module Contents

Functions

get_module(model, key)

Get a module from the model by its key name.

get_children(model)

Get all the children of the given model.

get_named_children(model[, pre])

Get the names and children of the given model.

dowload_hf_model(repo_id[, cache_dir, repo_type, revision])

Download a Hugging Face model from the HF Hub.

load_empty_model(pretrained_model_name_or_path[, cls])

Load an empty model.

get_super_module_by_name(model, module_name)

Get the parent module of the child module with the given name.

update_module(model, module_name, new_module)

Replace the named submodule of the model with a new module.

load_layer_wise_quantized_model(path)

Load a layer-wise quantized model.

load_tensor_from_shard(pretrained_model_name_or_path, ...)

Load a tensor from a sharded checkpoint.

load_tensor(path[, tensor_name, prefix])

Load a tensor from a bin file by the given tensor name.

neural_compressor.torch.algorithms.layer_wise.utils.get_module(model, key)[source]

Get a module from the model by its key name.

Parameters:
  • model (torch.nn.Module) – the original model

  • key (str) – name of the module to retrieve

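The lookup amounts to walking the dotted key one attribute at a time. A minimal sketch, using `SimpleNamespace` objects as a stand-in for a nested `torch.nn.Module` hierarchy (the library's own implementation may differ in details):

```python
from types import SimpleNamespace

def get_module(model, key):
    # Walk the dotted key, e.g. "encoder.layer0.linear", one attribute
    # at a time; submodules of a torch.nn.Module are reachable the same
    # way via getattr. Illustrative re-implementation, not the library's.
    module = model
    for name in key.split("."):
        module = getattr(module, name)
    return module

# Stand-in for a nested module hierarchy.
linear = SimpleNamespace(weight="W")
model = SimpleNamespace(encoder=SimpleNamespace(layer0=SimpleNamespace(linear=linear)))
assert get_module(model, "encoder.layer0.linear") is linear
```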
neural_compressor.torch.algorithms.layer_wise.utils.get_children(model)[source]

Get all the children of the given model.

neural_compressor.torch.algorithms.layer_wise.utils.get_named_children(model, pre=[])[source]

Get the names and children of the given model.

neural_compressor.torch.algorithms.layer_wise.utils.dowload_hf_model(repo_id, cache_dir=None, repo_type=None, revision=None)[source]

Download a Hugging Face model from the HF Hub.

neural_compressor.torch.algorithms.layer_wise.utils.load_empty_model(pretrained_model_name_or_path, cls=AutoModelForCausalLM, **kwargs)[source]

Load an empty model.

neural_compressor.torch.algorithms.layer_wise.utils.get_super_module_by_name(model, module_name)[source]

Get the parent module of the child module with the given name.

neural_compressor.torch.algorithms.layer_wise.utils.update_module(model, module_name, new_module)[source]

Replace the named submodule of the model with a new module.

neural_compressor.torch.algorithms.layer_wise.utils.load_layer_wise_quantized_model(path)[source]

Load a layer-wise quantized model.

neural_compressor.torch.algorithms.layer_wise.utils.load_tensor_from_shard(pretrained_model_name_or_path, tensor_name, prefix=None)[source]

Load a tensor from a sharded checkpoint.

neural_compressor.torch.algorithms.layer_wise.utils.load_tensor(path, tensor_name=None, prefix=None)[source]

Load a tensor from a bin file by the given tensor name.
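For an unsharded checkpoint the lookup is a single file read: deserialize the state dict and index it by the (optionally prefixed) tensor name. A sketch with `pickle` standing in for `torch.load` on a `.bin` file, so the example stays self-contained:

```python
import os
import pickle
import tempfile

def load_tensor(path, tensor_name=None, prefix=None):
    # Load the whole state dict when no name is given; otherwise return
    # the single entry, prepending `prefix` to the name if provided.
    with open(path, "rb") as f:
        state_dict = pickle.load(f)
    if tensor_name is None:
        return state_dict
    key = f"{prefix}.{tensor_name}" if prefix else tensor_name
    return state_dict[key]

# Round-trip a toy "state dict".
path = os.path.join(tempfile.mkdtemp(), "pytorch_model.bin")
with open(path, "wb") as f:
    pickle.dump({"model.fc.weight": [1.0, 2.0]}, f)
assert load_tensor(path, "fc.weight", prefix="model") == [1.0, 2.0]
```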