Performance
===========

## Overview

This page shows the performance boost achieved with Intel® Extension for PyTorch\* on several popular topologies.

## Performance Numbers

| Hardware | Workload<sup>1</sup> | Precision | Throughput Inference<sup>2</sup>: Batch Size | Throughput Inference<sup>2</sup>: Boost Ratio | Realtime Inference<sup>3</sup>: Batch Size | Realtime Inference<sup>3</sup>: Boost Ratio | Model Type | Dataset | Misc. |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | ResNet50 | Float32 | 80 | 1.39x | 1 | 1.35x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| | SSD-ResNet34 | Float32 | 160 | 1.55x | 1 | 1.06x | Computer Vision | COCO | Input shape [3, 1200, 1200] |
| | ResNext 32x16d | Float32 | 80 | 1.08x | 1 | 1.08x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| | Faster R-CNN ResNet50 FPN | Float32 | 80 | 1.71x | 1 | 1.07x | Computer Vision | COCO | Input shape [3, 1200, 1200] |
| | VGG-11 | Float32 | 160 | 1.20x | 1 | 1.13x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| | ShuffleNetv2_x1.0 | Float32 | 160 | 1.32x | 1 | 1.20x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| | MobileNet v2 | Float32 | 160 | 1.48x | 1 | 1.12x | Computer Vision | ImageNet | Input shape [3, 224, 224] |
| | DLRM | Float32 | 80 | 1.11x | 1 | - | Recommendation | Terabyte | - |
| | BERT-Large | Float32 | 80 | 1.14x | 1 | 1.02x | NLP | Squad | max_seq_len=384; Task: Question Answering |
| | Bert-Base | Float32 | 160 | 1.10x | 1 | 1.33x | NLP | MRPC | max_seq_len=128; Task: Text Classification |
| Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz | BERT-Large | BFloat16 | 56 | 1.67x | 1 | 1.45x | NLP | Squad | max_seq_len=384; Task: Question Answering |
| | Bert-Base | BFloat16 | 112 | 1.77x | 1 | 1.18x | NLP | MRPC | max_seq_len=128; Task: Text Classification |

1. Model Zoo for Intel® Architecture
2. Throughput inference runs with single instance per socket.
3. Realtime inference runs with multiple instances, 4 cores per instance.

*Note:* Performance numbers with stock PyTorch are measured with its most performant configuration.

## Configuration

### Software Version

| Software | Version |
| :-: | :-: |
| PyTorch | [v1.10.1](https://pytorch.org/get-started/locally/) |
| Intel® Extension for PyTorch\* | [v1.10.100](https://github.com/intel/intel-extension-for-pytorch/releases) |

### Hardware Configuration

| | 3rd Generation Intel® Xeon® Scalable Processors | Products formerly Cooper Lake |
| :-: | :-: | :-: |
| CPU | Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz | Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz |
| Number of nodes | 1 | 1 |
| Number of sockets | 2 | 2 |
| Cores/Socket | 40 | 28 |
| Threads/Core | 2 | 2 |
| uCode | 0xd0002a0 | 0x700001c |
| Hyper-Threading | ON | ON |
| TurboBoost | ON | ON |
| BIOS version | 04.12.02 | WLYDCRB1.SYS.0016.P29.2006080250 |
| Number of DDR Memory slots | 16 | 12 |
| Capacity of DDR memory per slot | 16GB | 64GB |
| DDR frequency | 3200 | 3200 |
| Total Memory/Node (DDR+DCPMM) | 256GB | 768GB |
| Host OS | CentOS Linux release 8.4.2105 | Ubuntu 18.04.4 LTS |
| Host Kernel | 4.18.0-305.10.2.el8\_4.x86\_64 | 4.15.0-76-generic |
| Docker OS | Ubuntu 18.04.5 LTS | Ubuntu 18.04.5 LTS |
| [Spectre-Meltdown Mitigation](https://github.com/speed47/spectre-meltdown-checker) | Mitigated | Mitigated |
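
### Instance Layout Sketch

The footnotes to the performance table describe two instance layouts: throughput inference uses a single instance per socket, and realtime inference uses multiple instances with 4 cores each. The following is a minimal sketch of what those layouts imply for the two machines listed in the hardware configuration. It is an illustration only, not the benchmark scripts. It assumes that instances are bound to physical cores only (hyper-thread siblings are not counted) and that no instance spans two sockets; neither assumption is stated on this page. The names `Machine`, `throughput_layout`, and `realtime_layout` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Socket and core counts taken from the hardware configuration table."""
    name: str
    sockets: int
    cores_per_socket: int


def throughput_layout(machine):
    """Return (instances, cores per instance) for one instance per socket.

    Assumption: each instance uses every physical core of its socket.
    """
    return machine.sockets, machine.cores_per_socket


def realtime_layout(machine, cores_per_instance=4):
    """Return (instances, cores per instance) for fixed-size instances.

    Assumption: instances are packed per socket and never span sockets.
    """
    per_socket = machine.cores_per_socket // cores_per_instance
    return machine.sockets * per_socket, cores_per_instance


machines = [
    Machine("Intel(R) Xeon(R) Platinum 8380", sockets=2, cores_per_socket=40),
    Machine("Intel(R) Xeon(R) Platinum 8380H", sockets=2, cores_per_socket=28),
]

for m in machines:
    t_instances, t_cores = throughput_layout(m)
    r_instances, r_cores = realtime_layout(m)
    print(f"{m.name}: throughput = {t_instances} x {t_cores} cores, "
          f"realtime = {r_instances} x {r_cores} cores")
```

Under these assumptions, the 8380 machine runs 2 throughput instances of 40 cores or 20 realtime instances of 4 cores, and the 8380H machine runs 2 throughput instances of 28 cores or 14 realtime instances of 4 cores.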