Transfer Learning for Image Classification using PyTorch and the Intel® Transfer Learning Tool API¶
This notebook uses the tlt
library to do transfer learning for image classification with a PyTorch pretrained model.
Intel® Gaudi® AI accelerator¶
To use HPU training and inference with Gaudi, follow these steps to install the required HPU drivers and software from the official Habana Docs.
Note: The HPU software installation requires that torch
is not already installed in your environment. Run pip uninstall torch
before following the Gaudi installation steps.
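If you want to confirm up front which device will be available, the following minimal sketch (not part of the tlt API) simply checks whether the Habana PyTorch bridge package can be imported and falls back to CPU otherwise:
[ ]:
# Minimal sketch (assumption: the Gaudi software stack installs the habana_frameworks package).
# This only checks importability; the tlt API performs its own device fallback.
import importlib.util

requested_device = "hpu" if importlib.util.find_spec("habana_frameworks") is not None else "cpu"
print("Requested device:", requested_device)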
1. Import dependencies and setup parameters¶
This notebook assumes that you have already followed the instructions to set up a PyTorch environment with all the dependencies required to run the notebook.
[ ]:
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import PIL.Image as Image
import torch, torchvision
import requests
from io import BytesIO
# tlt imports
from tlt.datasets import dataset_factory
from tlt.models import model_factory
from tlt.utils.file_utils import download_and_extract_tar_file, download_file
# Specify a directory for the dataset to be downloaded
dataset_dir = os.environ["DATASET_DIR"] if "DATASET_DIR" in os.environ else \
os.path.join(os.environ["HOME"], "dataset")
# Specify a directory for output
output_dir = os.environ["OUTPUT_DIR"] if "OUTPUT_DIR" in os.environ else \
os.path.join(os.environ["HOME"], "output")
print("Dataset directory:", dataset_dir)
print("Output directory:", output_dir)
2. Get the model¶
In this step, we call the model factory to list supported PyTorch image classification models. This is a list of pretrained models from Torchvision and PyTorch Hub that we tested with our API. Optionally, the verbose=True
argument can be added to the print_supported_models
function call to get more information about each model (such as the classification layer, image size, the original dataset, etc).
[ ]:
# See a list of available models
model_factory.print_supported_models(use_case='image_classification', framework='pytorch')
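As noted above, the verbose=True argument can optionally be added to print more details about each model:
[ ]:
# Optional: show extra details such as the classification layer, image size, and original dataset
model_factory.print_supported_models(use_case='image_classification', framework='pytorch',
                                     verbose=True)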
Next, use the model factory to get one of the models listed in the previous cell. The get_model
function returns a model object that will later be used for training.
[ ]:
# Set device="hpu" to use Gaudi. If no HPU hardware or installs are detected, device will default to "cpu"
model = model_factory.get_model(model_name='efficientnet_b1', framework='pytorch', device='cpu')
print("Model name:", model.model_name)
print("Framework:", model.framework)
print("Use case:", model.use_case)
print("Image size:", model.image_size)
3. Get the dataset¶
Option A: Use your own dataset¶
To use your own image dataset for transfer learning with the rest of this notebook, format your images as .jpg
files and save them in folders named after the classes that you want the model to predict. To provide a working example using the correct layout, we will download a flower species dataset. After downloading and extracting, you will have the following subdirectories in your dataset directory. Each species subfolder will contain numerous .jpg
files:
flower_photos
├── daisy
├── dandelion
├── roses
├── sunflowers
└── tulips
When using your own dataset, ensure that it is similarly organized with folders for each class. Change the custom_dataset_path
variable to point to your dataset folder.
[ ]:
# For demonstration purposes, we download a flowers dataset. To instead use your own dataset, set the
# custom_dataset_path to point to your dataset's directory and comment out the download_and_extract_tar_file line.
custom_dataset_path = os.path.join(dataset_dir, "flower_photos")
if not os.path.exists(custom_dataset_path):
    download_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
    download_and_extract_tar_file(download_url, dataset_dir)
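As an optional sanity check (not required by the API), you can list the class subfolders that were found under the dataset path:
[ ]:
# Each class that the model will predict should appear as its own subfolder
print(sorted(d for d in os.listdir(custom_dataset_path)
             if os.path.isdir(os.path.join(custom_dataset_path, d))))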
Call the dataset factory to load the dataset from the directory.
[ ]:
# Load the dataset from the custom dataset path
dataset = dataset_factory.load_dataset(dataset_dir=custom_dataset_path,
use_case='image_classification',
framework='pytorch')
print("Class names:", str(dataset.class_names))
Skip ahead to step 4. Prepare the dataset to continue using the custom dataset.
Option B: Use a dataset from PyTorch’s Torchvision Datasets catalog¶
To use a Torchvision dataset, specify the name of the dataset in the get_dataset
function. This example uses the CIFAR10
dataset from the Torchvision datasets for image classification, but you can choose from a variety of options. If the dataset is not found in the dataset directory it will be downloaded. Subsequent runs will reuse the already downloaded dataset.
These Torchvision datasets are currently supported in the API:
* CIFAR10
* Country211
* DTD
* Food101
* FGVCAircraft
* RenderedSST2
[ ]:
dataset = dataset_factory.get_dataset(dataset_dir=dataset_dir,
use_case='image_classification',
framework='pytorch',
dataset_name='CIFAR10',
dataset_catalog='torchvision')
print(dataset.info)
print("Class names:", str(dataset.class_names))
4. Prepare the dataset¶
Once you have your dataset from Option A or Option B above, use the following cells to split and preprocess the data. We split the dataset into training and validation subsets, resize the images to match the selected model, and then batch the images. Data augmentation can be applied by specifying the augmentations in the add_aug parameter. The supported augmentations are:
1. hflip - RandomHorizontalFlip
2. rotate - RandomRotate
[ ]:
# Split the dataset into training and validation subsets
dataset.shuffle_split(train_pct=.75, val_pct=.25)
[ ]:
# Preprocess the dataset with an image size that matches the model and a batch size of 32
batch_size = 32
dataset.preprocess(model.image_size, batch_size=batch_size, add_aug=['hflip','rotate'])
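As an optional check that preprocessing produced the expected tensors, you can pull one batch (the same call used in step 5 below) and inspect its shape:
[ ]:
# Image tensors should have shape (batch_size, 3, image_size, image_size)
check_images, check_labels = dataset.get_batch()
print("Batch shape:", check_images.shape)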
5. Predict using the original model¶
We get a single batch from our dataset, and use that to call predict on our model. Since we haven’t done any training on the model yet, it will give us predictions using the original ImageNet trained model.
[ ]:
# Get a single batch from the dataset
images, labels = dataset.get_batch()
labels = [dataset.class_names[id] for id in labels]
# Download the ImageNet labels and load them into a list
labels_file = "https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt"
labels_file_path = os.path.join(dataset_dir, os.path.basename(labels_file))
if not os.path.exists(labels_file_path):
    download_file(labels_file, dataset_dir)
with open(labels_file_path) as f:
    imagenet_labels = f.readlines()
imagenet_classes = [l.strip() for l in imagenet_labels]
# Predict using the original model
predictions = model.predict(images.to(model._device))
predictions = [imagenet_classes[id] for id in predictions]
[ ]:
# Display the images with the predicted ImageNet label
plt.figure(figsize=(18,14))
plt.subplots_adjust(hspace=0.5)
for n in range(min(batch_size, 30)):
    plt.subplot(6,5,n+1)
    inp = images[n]
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    correct_prediction = labels[n] == predictions[n]
    color = "darkgreen" if correct_prediction else "crimson"
    title = predictions[n].title() if correct_prediction else "{}\n({})".format(predictions[n], labels[n])
    plt.title(title, fontsize=14, color=color)
    plt.axis('off')
_ = plt.suptitle("ImageNet predictions", fontsize=20)
plt.show()
print("Correct predictions are shown in green")
print("Incorrect predictions are shown in red with the actual label in parenthesis")
6. Transfer Learning¶
This step calls the model’s train function with the dataset that was just prepared. The training function will get the base model and add on a dense layer based on the number of classes in the dataset. The model is then compiled and trained based on the number of epochs specified in the argument. With the do_eval parameter set to True by default, this step will also show how the model can be evaluated. The model’s evaluate function returns a list of metrics calculated from the dataset’s validation subset.
Arguments¶
dataset (ImageClassificationDataset, required): Dataset to use when training the model
output_dir (str): Path to a writeable directory for checkpoint files
epochs (int): Number of epochs to train the model (default: 1)
initial_checkpoints (str): Path to checkpoint weights to load. If the path provided is a directory, the latest checkpoint will be used.
early_stopping (bool): Enable early stopping if convergence is reached while training at the end of each epoch. (default: False)
lr_decay (bool): If lr_decay is True and do_eval is True, learning rate decay on the validation loss is applied at the end of each epoch.
extra_layers (list[int]): Optionally insert additional dense layers between the base model and output layer. This can help increase accuracy when fine-tuning a pretrained model. The input should be a list of integers representing the number and size of the layers, for example [1024, 512] will insert two dense layers, the first with 1024 neurons and the second with 512 neurons.
device (str): Enter "cpu" or "hpu" to specify which hardware device to run training on. If device="hpu" is specified, but no HPU hardware or installs are detected, CPU will be used. (default: "cpu")
Note: refer to the release documentation for an up-to-date list of train arguments and their current descriptions.
[ ]:
history = model.train(dataset, output_dir=output_dir, epochs=1, ipex_optimize=False)
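For reference, a run that exercises some of the optional arguments listed above might look like the sketch below. It is only an illustration (executing it would retrain the model), and argument support can vary by release, so check the release documentation noted earlier:
[ ]:
# Illustrative only: enable early stopping and insert two extra dense layers (1024 and 512 neurons)
history = model.train(dataset, output_dir=output_dir, epochs=2,
                      early_stopping=True, extra_layers=[1024, 512],
                      ipex_optimize=False)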
A complete model summary can be printed for all modules in case any need to be unfrozen:
[ ]:
model.list_layers(verbose=True)
Layers can be unfrozen by passing their string names, such as the following:
[ ]:
model.unfreeze_layer("features") # Unfreezes the features layers
model.list_layers(verbose=True)
7. Predict¶
Lastly, we predict using the same single batch that we used earlier with the ImageNet trained model to visualize the model’s predictions after training.
[ ]:
# Predict with a single batch
predictions = model.predict(images.to(model._device))
# Map the predicted ids to the class names
predictions = [dataset.class_names[id] for id in predictions]
# Display the results
plt.figure(figsize=(16,16))
plt.subplots_adjust(hspace=0.5)
for n in range(min(batch_size, 30)):
    plt.subplot(6,5,n+1)
    inp = images[n]
    inp = inp.numpy().transpose((1, 2, 0))
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    correct_prediction = labels[n] == predictions[n]
    color = "darkgreen" if correct_prediction else "crimson"
    title = predictions[n].title() if correct_prediction else "{}\n({})".format(predictions[n], labels[n])
    plt.title(title, fontsize=14, color=color)
    plt.axis('off')
_ = plt.suptitle("Model predictions", fontsize=16)
plt.show()
print("Correct predictions are shown in green")
print("Incorrect predictions are shown in red with the actual label in parenthesis")
Custom Single Image Prediction¶
We can also predict using a single image that wasn’t part of our original dataset. We download a flower image from the Open Images Dataset and then resize it to match our model.
[ ]:
# Download an image from the web and resize it to match our model
image_url = 'https://c8.staticflickr.com/8/7095/7210797228_c7fe51c3cb_z.jpg'
image_shape = (model.image_size, model.image_size)
daisy = Image.open(BytesIO(requests.get(image_url).content)).resize(image_shape)
daisy
Then, we call predict, passing the NumPy array for our image with an added dimension to represent the batch.
[ ]:
# Get the image as a np array and scale and normalize it
daisy = np.array(daisy)/255.0
daisy = (daisy - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
# Move the RGB channels to the front (np.moveaxis) and add a batch dimension first (np.newaxis)
daisy = torch.Tensor(np.moveaxis(daisy, -1, 0))[np.newaxis, ...]
# Predict and print the class name
result = model.predict(daisy)
print(dataset.class_names[result[0]])
8. Export¶
Next, we can call the model export
function to save the trained model. Each time the model is exported, a new numbered directory is created, which allows serving to pick up the latest model.
[ ]:
saved_model_dir = model.export(output_dir)
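The returned path points to the newly created numbered directory and can be printed to confirm where the model was written:
[ ]:
# Show the directory that the export call just created
print("Saved model directory:", saved_model_dir)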
9. Post-training quantization¶
In this section, the tlt
API uses Intel® Neural Compressor (INC) to benchmark and quantize the model to get optimal inference performance.
We use the Intel Neural Compressor to benchmark the full precision model to see how it performs, as our baseline.
Please note that benchmarking and quantization are only compatible with CPU models at this time, due to the IPEX backend.
Note that there is a known issue when running Intel Neural Compressor from a notebook: you may sometimes see the error zmq.error.ZMQError: Address already in use. If you see this error, rerun the cell.
[ ]:
results = model.benchmark(dataset=dataset)
Next, we use Intel Neural Compressor to automatically search for the optimal quantization recipe for low-precision model inference within the accuracy loss constraints defined in the config. Running post-training quantization may take several minutes, depending on your hardware and the exit policy (timeout and max trials).
You can customize the config by passing these parameters to get_inc_config():
* approach (str): The quantization approach (we recommend 'static' for image models and 'dynamic' for text models)
* accuracy_criterion_relative (float): Relative accuracy loss (default: 0.01, which is 1%)
* exit_policy_timeout (int): Tuning timeout in seconds (default: 0). Tuning finishes when the timeout or max_trials is reached. A timeout of 0 means that tuning stops when the accuracy criterion is met.
* exit_policy_max_trials (int): Maximum number of tuning trials (default: 50). Tuning finishes when the timeout or max_trials is reached.
[ ]:
from tlt.utils.inc_utils import get_inc_config
config = get_inc_config(approach='static',
accuracy_criterion_relative=0.01,
exit_policy_timeout=0,
exit_policy_max_trials=10)
[ ]:
inc_output_dir = os.path.join(output_dir, 'quantized_models', model.model_name,
os.path.basename(saved_model_dir))
model.quantize(inc_output_dir, dataset=dataset, config=config)
Let’s benchmark using the quantized model, so that we can compare the performance to the full precision model that was originally benchmarked.
[ ]:
quantized_results = model.benchmark(dataset=dataset, saved_model_dir=inc_output_dir)
Dataset Citations¶
@ONLINE {tfflowers,
author = "The TensorFlow Team",
title = "Flowers",
month = "jan",
year = "2019",
url = "http://download.tensorflow.org/example_images/flower_photos.tgz" }
@ONLINE {CIFAR10,
author = "Alex Krizhevsky",
title = "CIFAR-10",
year = "2009",
url = "http://www.cs.toronto.edu/~kriz/cifar.html" }
@article{openimages,
title={OpenImages: A public dataset for large-scale multi-label and multi-class image classification.},
author={Krasin, Ivan and Duerig, Tom and Alldrin, Neil and Veit, Andreas and Abu-El-Haija, Sami
and Belongie, Serge and Cai, David and Feng, Zheyun and Ferrari, Vittorio and Gomes, Victor
and Gupta, Abhinav and Narayanan, Dhyanesh and Sun, Chen and Chechik, Gal and Murphy, Kevin},
journal={Dataset available from https://github.com/openimages},
year={2016}
}