tlt.models.text_generation.pytorch_hf_text_generation_model.PyTorchHFTextGenerationModel.train

PyTorchHFTextGenerationModel.train(dataset, output_dir: str, epochs: int = 1, initial_checkpoints=None, temperature=1.0, lora_rank=8, lora_alpha=32, lora_dropout=0.05, max_train_samples=None, do_eval: bool = True, device: str = 'cpu', ipex_optimize: bool = True, use_trainer: bool = True, force_download: bool = False, enable_auto_mixed_precision: Optional[bool] = None, **kwargs)

Trains the model using the specified text generation dataset.

Parameters
  • dataset (TextGenerationDataset) – The dataset to use for training. If a train subset has been defined, that subset will be used to fit the model. Otherwise, the entire non-partitioned dataset will be used.

  • output_dir (str) – A writable output directory where checkpoint files are written during training

  • epochs (int) – The number of training epochs [default: 1]

  • initial_checkpoints (str) – Path to checkpoint weights to load. If the path provided is a directory, the latest checkpoint will be used.

  • temperature (float) – The value used to modulate the next token probabilities [default: 1.0]

  • lora_rank (int) – LoRA rank parameter [default: 8]

  • lora_alpha (int) – LoRA alpha parameter [default: 32]

  • lora_dropout (float) – LoRA dropout parameter [default: 0.05]

  • max_train_samples (int or None) – Use this to truncate the training set to a maximum number of samples for quick testing [default: None]

  • do_eval (bool) – If do_eval is True and the dataset has a validation subset, the model will be evaluated at the end of each epoch. If the dataset does not have a validation subset, the test subset will be used.

  • device (str) – Device to train the model. Defaults to “cpu”

  • ipex_optimize (bool) – Optimize the model using Intel® Extension for PyTorch. Defaults to True

  • use_trainer (bool) – Placeholder argument; model training is done using the Hugging Face Trainer, and a native PyTorch training loop is not yet implemented.

  • force_download (bool) – Force the model to be downloaded with default parameters. Defaults to False.

  • enable_auto_mixed_precision (bool or None) – Enable auto mixed precision for training. Mixed precision uses both 16-bit and 32-bit floating point types to make training run faster and use less memory. It is recommended to enable auto mixed precision training when running on platforms that support bfloat16 (Intel third or fourth generation Xeon processors). If it is enabled on a platform that does not support bfloat16, it can be detrimental to the training performance. If enable_auto_mixed_precision is set to None, auto mixed precision will be automatically enabled when running with Intel fourth generation Xeon processors, and disabled for other platforms. Defaults to None.

Returns

Hugging Face TrainOutput object

Raises
  • TypeError – if the dataset specified is not a TextGenerationDataset

  • ValueError – if the given dataset has not been preprocessed yet
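
Example

A minimal usage sketch based on the signature above. It assumes that model is an already-loaded PyTorchHFTextGenerationModel and that dataset is a TextGenerationDataset that has been preprocessed for this model; how those objects are created is outside the scope of this method, so the surrounding setup is not shown.

    # Sketch only: `model` (PyTorchHFTextGenerationModel) and `dataset`
    # (a preprocessed TextGenerationDataset) are assumed to already exist;
    # otherwise train() raises TypeError or ValueError as documented above.
    train_output = model.train(
        dataset,
        output_dir="/tmp/output",          # writable directory for checkpoint files
        epochs=1,
        lora_rank=8,                       # LoRA hyperparameters (defaults shown)
        lora_alpha=32,
        lora_dropout=0.05,
        max_train_samples=100,             # truncate the training set for a quick test
        do_eval=True,
        device="cpu",
        enable_auto_mixed_precision=None,  # auto-select based on the platform
    )

    # The return value is a Hugging Face TrainOutput named tuple with fields
    # such as global_step, training_loss, and metrics.
    print(train_output.training_loss)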