Performance

Overview

This page shows the performance boost from Intel® Extension for PyTorch* on several popular topologies.

Performance Data for Intel® AI Data Center Products

Find the latest performance data for Intel® Data Center Max 1550 GPU, including detailed hardware and software configurations.

LLM Performance

We benchmarked GPT-J 6B, LLaMA2 7B and 13B, OPT 6.7B, and Bloom-7B with the test input token length set to 1024. The data type is FP16 for all models.
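
As a rough sanity check on the FP16 setting, the weight footprint of each model can be estimated at 2 bytes per parameter. The sketch below uses the nominal parameter counts implied by the model names (an assumption; the exact counts differ slightly) and ignores activations and the KV cache. `fp16_weight_gib` is a hypothetical helper name, not part of any library.

```python
# Nominal parameter counts read off the model names (assumption: the
# true counts differ slightly from these round figures).
NOMINAL_PARAMS = {
    "GPT-J 6B": 6.0e9,
    "LLaMA2 7B": 7.0e9,
    "LLaMA2 13B": 13.0e9,
    "OPT 6.7B": 6.7e9,
    "Bloom-7B": 7.0e9,
}

BYTES_PER_FP16 = 2  # FP16 stores each value in 16 bits


def fp16_weight_gib(num_params: float) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_FP16 / 2**30


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        print(f"{name}: ~{fp16_weight_gib(params):.1f} GiB of FP16 weights")
```

These figures cover weights only, so the memory actually used during a 1024-token run will be higher.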

[Figures: benchmark results for Single Tile, Single Card, Two Card, and Four Card]

Configuration

Software Version

Software                           Version
PyTorch                            v2.1
Intel® Extension for PyTorch*      v2.1.10+xpu
Intel® oneAPI Base Toolkit         2024.0
Torch-CCL                          2.1.100
GPU Driver                         736.25
Transformers                       v4.31.0
DeepSpeed                          commit 4fc181b0
Intel® Extension for DeepSpeed*    commit ec33277
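
To reproduce the setup, it can help to compare the Python packages in the current environment against the versions listed above. The following is a minimal sketch using only the standard library. The distribution names in `EXPECTED` are assumptions about how each package is published, and `installed_version` and `compare` are hypothetical helper names. Components pinned by commit or installed outside pip (DeepSpeed, oneAPI, the GPU driver) are not checked.

```python
from importlib import metadata

# Expected versions from the table above. The distribution names on the
# left are assumptions about how each package is published.
EXPECTED = {
    "torch": "2.1",
    "intel-extension-for-pytorch": "2.1.10+xpu",
    "oneccl-bind-pt": "2.1.100",
    "transformers": "4.31.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare(expected, lookup=installed_version):
    """Map each package name to (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # Accept an exact match, or a longer string that extends the
        # expected one with a further ".x" component or a "+local" tag.
        ok = found is not None and (
            found == want or found.startswith((want + ".", want + "+"))
        )
        report[name] = (want, found, ok)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in compare(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} [{status}]")
```

The `lookup` parameter exists only so the comparison logic can be exercised without the packages being installed.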

Hardware Configuration

CPU Configuration:

CPU                                Intel(R) Xeon(R) Platinum 8480+ CPU
Number of nodes                    1
Number of sockets                  2
Cores/Socket                       56
Threads/Core                       2
uCode                              0x2b0004b1
Hyper-Threading                    ON
TurboBoost                         ON
BIOS version                       SE5C7411.86B.9525.D25.2304190630
Number of DDR Memory slots         16
Capacity of DDR memory per slot    64GB
DDR frequency                      4800
Total Memory/Node (DDR+DCPMM)      1024GB
Host OS                            Ubuntu 22.04.3 LTS
Host Kernel                        5.17.0-1020-oem
Spectre-Meltdown Mitigation        Mitigated
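
The derived totals for this host follow directly from the per-socket and per-slot figures above. A short worked calculation, using only values from the CPU configuration (variable names are illustrative):

```python
# Values copied from the CPU configuration above.
sockets = 2
cores_per_socket = 56
threads_per_core = 2
ddr_slots = 16
gb_per_slot = 64

physical_cores = sockets * cores_per_socket        # 2 * 56
logical_cpus = physical_cores * threads_per_core   # with Hyper-Threading ON
total_memory_gb = ddr_slots * gb_per_slot          # 16 * 64

print(f"Physical cores: {physical_cores}")
print(f"Logical CPUs:   {logical_cpus}")
print(f"Total memory:   {total_memory_gb} GB")
```

The memory total agrees with the 1024GB listed for Total Memory/Node.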

Single tile of 4X PVC OAM Configuration:

GPU          Intel(R) Data Center Max 1550 GPU
IFWI         PVC.PS.B4.P.Si.2023.WW42.3_25MHzi_Quad_DAMeni_OAM600W_IFRv2332i_PSCnull_IFWI.bin
ECC          ON
AMC SW       AMC FW 6.2
Precision    FP16