Datasets

This is a comprehensive list of public datasets used by this repository.

| Name (Link/Source) | Framework | Use Case |
|---|---|---|
| AG News (Hugging Face) | PyTorch | Text Classification |
| AG News (TFDS) | TensorFlow | Text Classification |
| Food101 (Torchvision) | PyTorch | Image Classification |
| Food101 (TFDS) | TensorFlow | Image Classification |
| SMS Spam Collection | PyTorch & TensorFlow | Text Classification |
| TF Flowers (TFDS) | PyTorch & TensorFlow | Image Classification |
| Cats vs. Dogs (TFDS) | TensorFlow | Image Classification |
| Country211 (Torchvision) | PyTorch | Image Classification |
| DTD (Torchvision) | PyTorch | Image Classification |
| FGVCAircraft (Torchvision) | PyTorch | Image Classification |
| RenderedSST2 (Torchvision) | PyTorch | Image Classification |
| Rock Paper Scissors (TFDS) | TensorFlow | Image Classification |
| Rotten_Tomatoes (Hugging Face) | PyTorch | Text Classification |
| TweetEval (Hugging Face) | PyTorch | Text Classification |
| CIFAR10 (Torchvision) | PyTorch | Image Classification |
| IMDB Reviews (Hugging Face) | PyTorch | Text Classification |
| IMDB Reviews (TFDS) | TensorFlow | Text Classification |
| GLUE/SST2 (TFDS) | TensorFlow | Text Classification |
| GLUE/COLA (TFDS) | TensorFlow | Text Classification |
| Colorectal Histology (TFDS) | TensorFlow | Image Classification |
| RESISC45 (TFDS) | TensorFlow | Image Classification |
| CDD-CESM | PyTorch & TensorFlow | Image & Text Classification |
| SQuAD | PyTorch & TensorFlow | Text Classification |
| MVTec | PyTorch | Anomaly Detection |
| Code Alpaca | PyTorch | Text Generation |
| Dolly-15k | PyTorch | Text Generation |
| finance-alpaca | Hugging Face | Text Generation |
| Medical Meadow | PyTorch | Text Generation |
| RedPajama-Data | PyTorch | Text Generation |