Performance
===========

## Overview

This page shows the performance boost achieved with Intel® Extension for PyTorch\* on several popular topologies.

## Performance Numbers
| Hardware | Workload<sup>1</sup> | Precision | Throughput Inference<sup>2</sup> Batch Size | Throughput Inference<sup>2</sup> Boost Ratio | Realtime Inference<sup>3</sup> Batch Size | Realtime Inference<sup>3</sup> Boost Ratio | Model Type | Dataset | Misc. |
|---|---|---|---|---|---|---|---|---|---|
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | ResNet50 | Float32 | 80 | 1.39x | 1 | 1.35x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | SSD-ResNet34 | Float32 | 160 | 1.55x | 1 | 1.06x | Computer Vision | COCO | Input shape [3, 1200, 1200] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | ResNext 32x16d | Float32 | 80 | 1.08x | 1 | 1.08x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | Faster R-CNN ResNet50 FPN | Float32 | 80 | 1.71x | 1 | 1.07x | Computer Vision | COCO | Input shape [3, 1200, 1200] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | VGG-11 | Float32 | 160 | 1.20x | 1 | 1.13x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | ShuffleNetv2_x1.0 | Float32 | 160 | 1.32x | 1 | 1.20x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | MobileNet v2 | Float32 | 160 | 1.48x | 1 | 1.12x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | DLRM | Float32 | 80 | 1.11x | 1 | - | Recommendation | Terabyte | - |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | BERT-Large | Float32 | 80 | 1.14x | 1 | 1.02x | NLP | SQuAD | max_seq_len=384, Task: Question Answering |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | BERT-Base | Float32 | 160 | 1.10x | 1 | 1.33x | NLP | MRPC | max_seq_len=128, Task: Text Classification |
| Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz | BERT-Large | BFloat16 | 56 | 1.67x | 1 | 1.45x | NLP | SQuAD | max_seq_len=384, Task: Question Answering |
| Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz | BERT-Base | BFloat16 | 112 | 1.77x | 1 | 1.18x | NLP | MRPC | max_seq_len=128, Task: Text Classification |
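The boost ratios above reflect the speedup observed when the extension's optimizations are applied to each workload. As a rough illustration only (not the exact benchmarking scripts behind the table), a minimal sketch of enabling the extension for Float32 or BFloat16 inference might look like the following, using torchvision's ResNet50 and the table's input shape as stand-ins:

```python
# Minimal sketch of enabling Intel® Extension for PyTorch* for CPU inference.
# torchvision's ResNet50 is only a stand-in for the workloads in the table;
# the exact benchmarking scripts are not reproduced here.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(pretrained=True).eval()
data = torch.rand(1, 3, 224, 224)  # batch size 1 mimics the realtime-inference column

# Float32: apply the extension's operator and graph optimizations.
model = ipex.optimize(model, dtype=torch.float32)
with torch.no_grad():
    output = model(data)

# BFloat16 rows: request bfloat16 in ipex.optimize and run under CPU autocast.
model_bf16 = ipex.optimize(models.resnet50(pretrained=True).eval(), dtype=torch.bfloat16)
with torch.no_grad(), torch.cpu.amp.autocast():
    output = model_bf16(data)
```

For throughput-style measurements, the same pattern applies with the larger batch sizes listed in the table (e.g. 80 or 160) instead of batch size 1.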