Performance
Overview
This page shows the performance boost delivered by Intel® Extension for PyTorch* on several popular topologies.
Performance Data for Intel® AI Data Center Products
Find the latest performance data for the Intel® Data Center Max 1550 GPU, including detailed hardware and software configurations.
LLM Performance v2.1.10
We benchmarked GPT-J 6B, LLaMA2 7B and 13B, OPT 6.7B, and Bloom-7B with the test input token length set to 1024. The data type is FP16 for all models.
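As a rough illustration of what FP16 implies for these models, the sketch below estimates the size of the weights alone at 2 bytes per parameter. The parameter counts are nominal values read off the model names, not measured figures, and the estimate ignores activations, the KV cache for the 1024-token input, and any runtime overhead.

```python
# Nominal parameter counts in billions, taken from the model names above.
# These are approximations (assumption), not exact checkpoint sizes.
NOMINAL_PARAMS_B = {
    "GPT-J 6B": 6.0,
    "LLaMA2 7B": 7.0,
    "LLaMA2 13B": 13.0,
    "OPT 6.7B": 6.7,
    "Bloom-7B": 7.0,
}

BYTES_PER_FP16 = 2  # one FP16 value occupies 16 bits


def fp16_weight_gib(params_billion: float) -> float:
    """Estimate the FP16 weight footprint in GiB for a given parameter count."""
    return params_billion * 1e9 * BYTES_PER_FP16 / 2**30


for name, params in NOMINAL_PARAMS_B.items():
    print(f"{name}: ~{fp16_weight_gib(params):.1f} GiB of FP16 weights")
```

The helper name `fp16_weight_gib` is hypothetical and exists only for this illustration.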
Configuration
Software Version
Software | Version |
---|---|
PyTorch | v2.1 |
Intel® Extension for PyTorch* | v2.1.10+xpu |
Intel® oneAPI Base Toolkit | 2024.0 |
Torch-CCL | 2.1.100 |
GPU Driver | 736.25 |
Transformers | v4.31.0 |
DeepSpeed | commit 4fc181b0 |
Intel® Extension for DeepSpeed* | commit ec33277 |
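To reproduce a comparable environment, it can help to compare installed package versions against the table above. The following is a minimal sketch using only the Python standard library. The distribution names (`torch`, `intel_extension_for_pytorch`, `oneccl_bind_pt`, `transformers`) are assumptions about how each package is published, and `check_versions` is a hypothetical helper, not part of any Intel tool. Components pinned by commit (DeepSpeed, Intel® Extension for DeepSpeed*), the oneAPI toolkit, and the GPU driver are not covered.

```python
from importlib import metadata

# Expected versions from the Software Version table.
# Keys are assumed distribution names; adjust them to match your installation.
EXPECTED = {
    "torch": "2.1",
    "intel_extension_for_pytorch": "2.1.10+xpu",
    "oneccl_bind_pt": "2.1.100",
    "transformers": "4.31.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Return {name: (expected, found)} for every package that does not match.

    Matching is a simple prefix comparison, so "2.1" also accepts "2.1.0a0".
    It would also accept "2.10", which is a known limitation of this sketch.
    """
    mismatches = {}
    for name, want in expected.items():
        found = lookup(name)
        if found is None or not found.startswith(want):
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {found or 'not installed'}")
```

Passing a custom `lookup` callable makes the comparison testable without the packages installed.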
Hardware Configuration
CPU Configuration:
CPU | Intel(R) Xeon(R) Platinum 8480+ CPU |
---|---|
Number of nodes | 1 |
Number of sockets | 2 |
Cores/Socket | 56 |
Threads/Core | 2 |
uCode | 0x2b0004b1 |
Hyper-Threading | ON |
TurboBoost | ON |
BIOS version | SE5C7411.86B.9525.D25.2304190630 |
Number of DDR Memory slots | 16 |
Capacity of DDR memory per slot | 64GB |
DDR frequency | 4800 MT/s |
Total Memory/Node (DDR+DCPMM) | 1024GB |
Host OS | Ubuntu 22.04.3 LTS |
Host Kernel | 5.17.0-1020-oem |
Spectre-Meltdown Mitigation | Mitigated |
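The derived totals in the CPU configuration can be cross-checked with simple arithmetic. The sketch below recomputes the core, thread, and memory totals from the per-socket and per-slot values in the table. It assumes that all 16 DDR slots are populated with 64GB modules, which is consistent with the 1024GB total listed.

```python
# Values copied from the CPU Configuration table.
sockets = 2
cores_per_socket = 56
threads_per_core = 2   # Hyper-Threading is ON
ddr_slots = 16
gb_per_slot = 64

physical_cores = sockets * cores_per_socket        # 112
logical_cpus = physical_cores * threads_per_core   # 224
total_memory_gb = ddr_slots * gb_per_slot          # 1024

print(f"Physical cores:  {physical_cores}")
print(f"Logical CPUs:    {logical_cpus}")
print(f"Total memory:    {total_memory_gb}GB")
```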
Single tile of 4X PVC OAM Configuration:
GPU | Intel(R) Data Center Max 1550 GPU |
---|---|
IFWI | PVC.PS.B4.P.Si.2023.WW42.3_25MHzi_Quad_DAMeni_OAM600W_IFRv2332i_PSCnull_IFWI.bin |
ECC | ON |
AMC SW | AMC FW 6.2 |
Precision | FP16 |