Accelerating Deep Learning Inference for Model Zoo Workloads on Intel CPU and GPU¶
Introduction¶
This example shows how to run Model Zoo workloads on Intel CPU and GPU with the optimizations from Intel® Extension for TensorFlow*, without any model code changes.
Prerequisites¶
For Intel CPU, refer to Intel CPU software installation. For Intel GPU, refer to Intel GPU software installation.
Execute¶
Prepare the Code¶
git clone https://github.com/IntelAI/models
cd models
git checkout v2.8.0
Sample Use Cases¶
| Model | Mode | Model Documentation |
|---|---|---|
| Inception V3 | Inference | FP32 INT8 |
| Inception V4 | Inference | FP32 INT8 |
| ResNet50 V1.5 | Inference | FP32 INT8 |
Performance Optimization¶
FP16/BF16 Inference Optimization
Refer to the FP32 model documentation above; the only extra step is to set one environment variable to enable the Advanced Auto Mixed Precision graph optimization before running inference.
export ITEX_AUTO_MIXED_PRECISION=1
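The variable can also be set from Python before TensorFlow is imported, so the extension picks it up when it loads. The snippet below is a minimal sketch of that approach and is not part of the Model Zoo scripts:

# Minimal sketch: enable Advanced Auto Mixed Precision from Python.
# Equivalent to `export ITEX_AUTO_MIXED_PRECISION=1`; set it before
# TensorFlow (and Intel® Extension for TensorFlow*) is imported.
import os
os.environ["ITEX_AUTO_MIXED_PRECISION"] = "1"

import tensorflow as tf  # imported after the variable is set; no model code changes needed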
INT8 Inference Optimization
To avoid unnecessary memory copies on GPU, we provide a tool that converts Const nodes to HostConst nodes in the INT8 pretrained models.
Take the ResNet50 v1.5 INT8 pb as an example:
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_8/resnet50v1_5_int8_pretrained_model.pb
python host_const.py -i <path to the frozen graph downloaded above>/resnet50v1_5_int8_pretrained_model.pb -b -o <path to save the converted frozen graph>/resnet50v1_5_int8_pretrained_model-hostconst.pb
After the conversion, use the new INT8 pb for INT8 inference.
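For illustration only, the converted frozen graph loads the same way as the original one. The following minimal sketch assumes the public ResNet50 v1.5 input/output node names ("input_tensor" and "softmax_tensor") and a 224x224x3 float input; verify these against your own graph, and prefer the Model Zoo quickstart scripts for real benchmarking:

# Minimal sketch (not part of the Model Zoo scripts): load the converted
# HostConst frozen graph and run one step on dummy data.
# "input_tensor"/"softmax_tensor" are assumed node names; check your graph.
import numpy as np
import tensorflow as tf

graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("resnet50v1_5_int8_pretrained_model-hostconst.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(graph_def, name="")

with tf.compat.v1.Session(graph=graph) as sess:
    dummy_images = np.random.rand(1, 224, 224, 3).astype(np.float32)
    probs = sess.run("softmax_tensor:0", feed_dict={"input_tensor:0": dummy_images})
    print(probs.shape)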
FAQ¶
During Inception V3 INT8 batch inference with real data, you might see the message “Running out of images from dataset”. This is a known issue of the Model Zoo script.
Solution:
Option 1: Use dummy data instead.
Option 2: To run inference with real data, open the int8_batch_inference.sh script with the command below and comment out its last line so that warmup_steps and steps are left unspecified.
cd models
vi ./quickstart/image_recognition/tensorflow/inceptionv3/inference/cpu/int8/int8_batch_inference.sh