Performance

Overview

This page shows the performance boost that Intel® Extension for PyTorch* delivers over stock PyTorch on several popular topologies.

Performance Numbers

Hardware	Workload¹	Precision	Throughput Inference² Batch Size	Throughput Inference² Boost Ratio	Realtime Inference³ Batch Size	Realtime Inference³ Boost Ratio	Model Type	Dataset	Misc.
Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz	ResNet50	Float32	80	1.39x	1	1.35x	Computer Vision	ImageNet	Input shape [3, 224, 224]
	SSD-ResNet34	Float32	160	1.55x	1	1.06x	Computer Vision	COCO	Input shape [3, 1200, 1200]
	ResNeXt 32x16d	Float32	80	1.08x	1	1.08x	Computer Vision	ImageNet	Input shape [3, 224, 224]
	Faster R-CNN ResNet50 FPN	Float32	80	1.71x	1	1.07x	Computer Vision	COCO	Input shape [3, 1200, 1200]
	VGG-11	Float32	160	1.20x	1	1.13x	Computer Vision	ImageNet	Input shape [3, 224, 224]
	ShuffleNetv2_x1.0	Float32	160	1.32x	1	1.20x	Computer Vision	ImageNet	Input shape [3, 224, 224]
	MobileNet v2	Float32	160	1.48x	1	1.12x	Computer Vision	ImageNet	Input shape [3, 224, 224]
	DLRM	Float32	80	1.11x	1	-	Recommendation	Terabyte	-
	BERT-Large	Float32	80	1.14x	1	1.02x	NLP	SQuAD	max_seq_len=384; Task: Question Answering
	BERT-Base	Float32	160	1.10x	1	1.33x	NLP	MRPC	max_seq_len=128; Task: Text Classification
Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz	BERT-Large	BFloat16	56	1.67x	1	1.45x	NLP	SQuAD	max_seq_len=384; Task: Question Answering
Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz	BERT-Base	BFloat16	112	1.77x	1	1.18x	NLP	MRPC	max_seq_len=128; Task: Text Classification

¹ Model Zoo for Intel® Architecture
² Throughput inference runs with a single instance per socket.
³ Realtime inference runs with multiple instances, 4 cores per instance.

Note: Performance numbers with stock PyTorch are measured with its most performant configuration.
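
The page does not spell out how a boost ratio is derived. A minimal sketch, assuming the ratio is the extension's result divided by the stock PyTorch baseline for throughput, and the inverse for latency (where lower is better). The function names and the sample measurements are hypothetical and are not taken from the table above.

```python
def throughput_boost(stock_samples_per_sec: float, ext_samples_per_sec: float) -> float:
    """Boost ratio for throughput runs: higher throughput is better."""
    if stock_samples_per_sec <= 0:
        raise ValueError("baseline throughput must be positive")
    return ext_samples_per_sec / stock_samples_per_sec


def latency_boost(stock_latency_ms: float, ext_latency_ms: float) -> float:
    """Boost ratio for latency runs: lower latency is better, so invert."""
    if ext_latency_ms <= 0:
        raise ValueError("latency must be positive")
    return stock_latency_ms / ext_latency_ms


def format_ratio(ratio: float) -> str:
    """Render a ratio in the same style as the table, e.g. '1.50x'."""
    return f"{ratio:.2f}x"


# Hypothetical measurements, chosen only to illustrate the arithmetic.
print(format_ratio(throughput_boost(200.0, 300.0)))
print(format_ratio(latency_boost(10.0, 8.0)))
```

Under this assumed definition, a ratio above 1.00x means the extension outperformed the stock baseline.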

Configuration

Software Version

Software	Version
PyTorch	v1.10.1
Intel® Extension for PyTorch*	v1.10.100
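
The two versions in the table share the same major.minor series (1.10). A small sketch of checking that pairing before reproducing the measurements, assuming a matching major.minor series is what is required; the helper names are hypothetical.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.10.1' or '1.10.100'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(pytorch_version: str, extension_version: str) -> bool:
    """True when both versions belong to the same major.minor series."""
    return major_minor(pytorch_version) == major_minor(extension_version)


# Versions from the Software Version table.
print(versions_match("v1.10.1", "v1.10.100"))
```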

Hardware Configuration

	3rd Generation Intel® Xeon® Scalable Processors	Products formerly Cooper Lake
CPU	Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz	Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz
Number of nodes	1	1
Number of sockets	2	2
Cores/Socket	40	28
Threads/Core	2	2
uCode	0xd0002a0	0x700001c
Hyper-Threading	ON	ON
TurboBoost	ON	ON
BIOS version	04.12.02	WLYDCRB1.SYS.0016.P29.2006080250
Number of DDR Memory slots	16	12
Capacity of DDR memory per slot	16GB	64GB
DDR frequency	3200 MT/s	3200 MT/s
Total Memory/Node (DDR+DCPMM)	256GB	768GB
Host OS	CentOS Linux release 8.4.2105	Ubuntu 18.04.4 LTS
Host Kernel	4.18.0-305.10.2.el8_4.x86_64	4.15.0-76-generic
Docker OS	Ubuntu 18.04.5 LTS	Ubuntu 18.04.5 LTS
Spectre-Meltdown Mitigation	Mitigated	Mitigated
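
Footnotes 2 and 3 describe how instances are laid out: one instance per socket for throughput runs, and 4 cores per instance for realtime runs. The sketch below derives the resulting core ranges from the Sockets and Cores/Socket rows above. It rests on several assumptions the page does not state: physical cores are numbered contiguously per socket, all physical cores are used, an instance never spans sockets, and each socket is one NUMA node. The helper names are hypothetical, and the `numactl` prefix is one possible way to pin an instance, not necessarily what was used for these measurements.

```python
from typing import List, Tuple


def instance_core_ranges(
    sockets: int, cores_per_socket: int, cores_per_instance: int
) -> List[Tuple[int, int, int]]:
    """Return (socket, first_core, last_core) for every instance.

    Assumes socket s owns physical cores
    s * cores_per_socket .. (s + 1) * cores_per_socket - 1.
    """
    if cores_per_socket % cores_per_instance != 0:
        raise ValueError("cores_per_socket must be divisible by cores_per_instance")
    layout = []
    for socket in range(sockets):
        base = socket * cores_per_socket
        for first in range(base, base + cores_per_socket, cores_per_instance):
            layout.append((socket, first, first + cores_per_instance - 1))
    return layout


def numactl_prefix(socket: int, first_core: int, last_core: int) -> str:
    """Build a numactl prefix binding an instance to its cores and local memory."""
    return f"numactl --physcpubind={first_core}-{last_core} --membind={socket}"


# Platinum 8380: 2 sockets, 40 cores per socket.
throughput_layout = instance_core_ranges(2, 40, 40)  # one instance per socket
realtime_layout = instance_core_ranges(2, 40, 4)     # 4 cores per instance

print(len(throughput_layout), len(realtime_layout))
print(numactl_prefix(*realtime_layout[0]))
```

Under these assumptions the 8380 system runs 2 throughput instances or 20 realtime instances, and the 8380H system (28 cores per socket) runs 2 or 14.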