neural_compressor.utils.load_huggingface¶
Huggingface Loader: provides access to Huggingface pretrained models.
Module Contents¶
Classes¶
OptimizedModel | The class provides a method from_pretrained to access Huggingface models.
Functions¶
save_for_huggingface_upstream | Save the model and tokenizer in the output directory.
- class neural_compressor.utils.load_huggingface.OptimizedModel(*args, **kwargs)¶
The class provides a method from_pretrained to access Huggingface models.
- classmethod from_pretrained(model_name_or_path: str, **kwargs) → torch.nn.Module¶
Instantiate a quantized PyTorch model from a given Intel Neural Compressor (INC) configuration file.
- Parameters:
  - model_name_or_path (str) – Repository name in the Hugging Face Hub or path to a local directory hosting the model.
  - cache_dir (str, optional) – Path to a directory in which a downloaded configuration should be cached if the standard cache should not be used.
  - force_download (bool, optional, defaults to False) – Whether or not to force (re-)downloading the configuration files and override the cached versions if they exist.
  - resume_download (bool, optional, defaults to False) – Whether or not to delete an incompletely received file. Attempts to resume the download if such a file exists.
  - revision (str, optional) – The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
- Returns:
Quantized model.
- Return type:
q_model
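A minimal usage sketch; the Hub repository name below is illustrative, and any INC-quantized repository or local directory with the quantized model files would work in its place:

    import torch
    from neural_compressor.utils.load_huggingface import OptimizedModel

    # Illustrative repository name; substitute your own INC-quantized
    # Hub repository or a local directory containing the quantized model.
    model = OptimizedModel.from_pretrained(
        "Intel/distilbert-base-uncased-finetuned-sst-2-english-int8-static"
    )
    assert isinstance(model, torch.nn.Module)
    model.eval()

The returned object is a plain torch.nn.Module, so it can be used for inference like any other PyTorch model.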
- neural_compressor.utils.load_huggingface.save_for_huggingface_upstream(model, tokenizer, output_dir)¶
Save the model and tokenizer in the output directory.
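A sketch of the intended workflow, assuming the neural_compressor 2.x post-training quantization API (PostTrainingQuantConfig / quantization.fit); the model name and output directory are placeholders:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    from neural_compressor import PostTrainingQuantConfig, quantization
    from neural_compressor.utils.load_huggingface import save_for_huggingface_upstream

    model_name = "distilbert-base-uncased-finetuned-sst-2-english"  # placeholder
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Dynamic post-training quantization needs no calibration dataloader.
    conf = PostTrainingQuantConfig(approach="dynamic")
    q_model = quantization.fit(model, conf)

    # Write the quantized weights, config, and tokenizer files into one
    # directory laid out for uploading to the Hugging Face Hub.
    save_for_huggingface_upstream(q_model, tokenizer, "./quantized_model")

The saved directory can later be reloaded with OptimizedModel.from_pretrained.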