tlt.datasets.text_classification.tfds_text_classification_dataset.TFDSTextClassificationDataset
- class tlt.datasets.text_classification.tfds_text_classification_dataset.TFDSTextClassificationDataset(dataset_dir, dataset_name, split=['train'], shuffle_files=True, **kwargs)
A text classification dataset from the TensorFlow Datasets catalog.
- __init__(dataset_dir, dataset_name, split=['train'], shuffle_files=True, **kwargs)
Class constructor
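The constructor signature above can be exercised roughly as follows. This is a minimal sketch, not the library's documented usage: the helper names `build_dataset_kwargs` and `load_dataset`, the dataset name "imdb_reviews", and the directory "/tmp/data" are illustrative assumptions. Only the parameter names and defaults are taken from the signature.

```python
def build_dataset_kwargs(dataset_dir, dataset_name="imdb_reviews"):
    """Collect the constructor arguments using the documented defaults.

    "imdb_reviews" is only an example catalog name (an assumption).
    """
    return {
        "dataset_dir": dataset_dir,    # host directory for the dataset files
        "dataset_name": dataset_name,  # name of the dataset in the catalog
        "split": ["train"],            # default from the signature
        "shuffle_files": True,         # default from the signature
    }


def load_dataset(dataset_dir, dataset_name="imdb_reviews"):
    """Hypothetical wrapper that instantiates the class documented here."""
    # Imported lazily so build_dataset_kwargs works without the library installed.
    from tlt.datasets.text_classification.tfds_text_classification_dataset import (
        TFDSTextClassificationDataset,
    )
    return TFDSTextClassificationDataset(
        **build_dataset_kwargs(dataset_dir, dataset_name)
    )
```

Calling `load_dataset("/tmp/data")` would presumably download or reuse the dataset files under that directory, assuming the library and its TensorFlow dependencies are installed.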
Methods
__init__(dataset_dir, dataset_name[, split, ...])
Class constructor
get_batch([subset])
Get a single batch of samples and labels from the dataset.
get_inc_dataloaders(hub_name, max_seq_length)
get_str_label(numerical_value)
Returns the string label (class name) associated with the specified numerical value.
preprocess(batch_size)
Batch the dataset
shuffle_split([train_pct, val_pct, ...])
Randomly split the dataset into train, validation, and test subsets with a pseudo-random seed option.
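To illustrate the idea behind shuffle_split, here is a standalone sketch of a seeded random split over item indices. It is not the library's implementation: the function name `shuffle_split_indices`, the `test_pct` and `seed` parameter names, and all default values are assumptions, since the listing above elides everything after `train_pct` and `val_pct`.

```python
import random


def shuffle_split_indices(num_items, train_pct=0.75, val_pct=0.25,
                          test_pct=0.0, seed=None):
    """Return (train, validation, test) index lists for num_items items.

    Hypothetical helper; parameter defaults are assumptions.
    """
    total = train_pct + val_pct + test_pct
    if min(train_pct, val_pct, test_pct) < 0 or total > 1.0 + 1e-9:
        raise ValueError("percentages must be non-negative and sum to at most 1.0")

    indices = list(range(num_items))
    # A dedicated Random instance makes the shuffle reproducible for a given seed
    # without touching the global random state.
    random.Random(seed).shuffle(indices)

    train_end = int(num_items * train_pct)
    val_end = train_end + int(num_items * val_pct)
    test_end = val_end + int(num_items * test_pct)
    return (indices[:train_end],
            indices[train_end:val_end],
            indices[val_end:test_end])
```

Passing the same seed twice yields the same three subsets, which is the usual reason for exposing a seed option on a random split.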
Attributes
class_names
dataset
The framework dataset object
dataset_catalog
The string name of the dataset catalog (or None)
dataset_dir
Host directory containing the dataset files
dataset_name
Name of the dataset
info
test_subset
A subset of the dataset held out for final testing/evaluation
train_subset
A subset of the dataset used for training
validation_subset
A subset of the dataset used for validation/evaluation
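The relationship between the class_names attribute and the get_str_label method listed above can be pictured as an index lookup. The sketch below is a stand-in under that assumption, not the library's code: the function `label_to_str` and the example class names are hypothetical.

```python
def label_to_str(class_names, numerical_value):
    """Return the class name stored at position numerical_value.

    Hypothetical stand-in for a numeric-to-string label lookup.
    """
    if not 0 <= numerical_value < len(class_names):
        raise ValueError(
            f"label {numerical_value} is outside 0..{len(class_names) - 1}"
        )
    return class_names[numerical_value]


# Example class names for a binary sentiment dataset (an assumption).
example_class_names = ["negative", "positive"]
predicted = label_to_str(example_class_names, 1)
```

The explicit range check avoids Python's negative indexing silently mapping -1 to the last class.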