cnn_dailymail

Module Contents

Classes

DataCollatorForSupervisedDataset

Collate examples for supervised fine-tuning.

Functions

preprocess(sources, targets, tokenizer) → Dict

Preprocess the data by tokenizing.

class cnn_dailymail.DataCollatorForSupervisedDataset(tokenizer: transformers.PreTrainedTokenizer)

Collate examples for supervised fine-tuning.
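
This page does not show the collator's body, so the following is only a minimal sketch of what a supervised fine-tuning collator commonly does: pad every example in a batch to the same length and build an attention mask. `CollatorSketch`, the `pad_token_id` argument, the `input_ids`/`labels`/`attention_mask` keys, and the `IGNORE_INDEX` value of -100 are all assumptions for illustration. The documented class takes a `transformers.PreTrainedTokenizer` instead and most likely returns tensors; plain lists are used here to keep the sketch dependency-free.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Assumption: padded label positions use -100, the default ignore_index
# of PyTorch's cross-entropy loss, so they do not contribute to the loss.
IGNORE_INDEX = -100


@dataclass
class CollatorSketch:
    """Hypothetical stand-in for DataCollatorForSupervisedDataset."""

    pad_token_id: int  # assumption: the real class reads this from its tokenizer

    def __call__(
        self, instances: Sequence[Dict[str, List[int]]]
    ) -> Dict[str, List[List[int]]]:
        # Pad to the longest example in this batch.
        max_len = max(len(inst["input_ids"]) for inst in instances)
        batch: Dict[str, List[List[int]]] = {
            "input_ids": [],
            "labels": [],
            "attention_mask": [],
        }
        for inst in instances:
            length = len(inst["input_ids"])
            n_pad = max_len - length
            batch["input_ids"].append(list(inst["input_ids"]) + [self.pad_token_id] * n_pad)
            batch["labels"].append(list(inst["labels"]) + [IGNORE_INDEX] * n_pad)
            # 1 marks real tokens, 0 marks padding.
            batch["attention_mask"].append([1] * length + [0] * n_pad)
        return batch
```

A collator of this shape would typically be passed as the `collate_fn` of a data loader, so that padding is decided per batch rather than for the whole dataset.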

cnn_dailymail.preprocess(sources: Sequence[str], targets: Sequence[str], tokenizer: transformers.PreTrainedTokenizer) → Dict

Preprocess the data by tokenizing the sources and targets strings with tokenizer and returning the result as a Dict.
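
The body of preprocess is not shown here, so the following is a hedged sketch of one common pattern for this signature, not the module's implementation. It assumes each source is concatenated with its target, and that the source (prompt) positions are masked out of the labels with -100 so only the target tokens are trained on. `ToyTokenizer`, `preprocess_sketch`, `IGNORE_INDEX`, and the returned `input_ids`/`labels` keys are hypothetical names; a whitespace tokenizer stands in for `transformers.PreTrainedTokenizer` so the example runs without extra dependencies.

```python
from typing import Dict, List, Sequence

# Assumption: label value that the loss function skips.
IGNORE_INDEX = -100


class ToyTokenizer:
    """Hypothetical whitespace tokenizer standing in for a PreTrainedTokenizer."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {"<pad>": 0}

    def encode(self, text: str) -> List[int]:
        ids = []
        for word in text.split():
            # Assign the next free id to each unseen word.
            if word not in self.vocab:
                self.vocab[word] = len(self.vocab)
            ids.append(self.vocab[word])
        return ids


def preprocess_sketch(
    sources: Sequence[str], targets: Sequence[str], tokenizer: ToyTokenizer
) -> Dict[str, List[List[int]]]:
    """Tokenize source/target pairs and mask the source part of the labels."""
    input_ids: List[List[int]] = []
    labels: List[List[int]] = []
    for source, target in zip(sources, targets):
        source_ids = tokenizer.encode(source)
        target_ids = tokenizer.encode(target)
        # Model input is the prompt followed by the expected output.
        input_ids.append(source_ids + target_ids)
        # Labels keep only the target tokens; prompt positions are ignored.
        labels.append([IGNORE_INDEX] * len(source_ids) + target_ids)
    return {"input_ids": input_ids, "labels": labels}


# Example: one article/summary pair.
batch = preprocess_sketch(["summarize: a b"], ["c"], ToyTokenizer())
```

In this sketch the variable-length lists in `batch` are left unpadded; padding to a common length is deferred to the collator.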