How to Run DLSA Multi Instance Fine-Tuning with IPEX (FP32, BF16)

Install MPI library:

Install MPI from here

MPI is included in the Intel oneAPI Toolkit. Installing it through your system's package manager is recommended.

Note: Perform this step on every worker node.

To run:

source /opt/intel/oneapi/mpi/latest/env/vars.sh
cd profiling-transformers

Note:

np: number of processes, the total number of processes to launch across the whole cluster.

ppn: processes per node, the number of processes to launch on each worker node.

For example, to run on 2 nodes with 1 process per node, use -np 2 -ppn 1.

To run on 4 nodes with 2 processes per node, use -np 8 -ppn 2.
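
The two values are related by np = nodes × ppn. As a minimal sketch of that arithmetic (total_procs, NODES, and PPN are hypothetical names used only for illustration; they are not part of the repository scripts):

```shell
# Hypothetical helper (not part of profiling-transformers):
# derive the -np value from the node count and the processes per node.
total_procs() {
    local nodes="$1" ppn="$2"
    echo $(( nodes * ppn ))
}

NODES=4   # number of worker nodes (example value)
PPN=2     # processes per node (example value)
NP="$(total_procs "$NODES" "$PPN")"

# Print the flags to pass to fine-tuning/run_dist.sh
echo "-np ${NP} -ppn ${PPN}"
```

With the example values above this prints -np 8 -ppn 2, matching the 4-node case described earlier.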

Running a single process on a single node

bash fine-tuning/run_dist.sh -np 1 -ppn 1 bash fine-tuning/run_ipex_native.sh

Running multiple instances on a single node

# Run 2 instances on a single node
bash fine-tuning/run_dist.sh -np 2 -ppn 2 bash fine-tuning/run_ipex_native.sh

Running with IPEX BF16

Before running BF16 fine-tuning, verify that your server supports BF16. (Only Cooper Lake and Sapphire Rapids CPUs support BF16.)
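
One possible way to do this check on Linux is to look for the avx512_bf16 or amx_bf16 flags in /proc/cpuinfo. The sketch below assumes a Linux host; has_bf16 is a hypothetical helper, not part of the repository:

```shell
# Hypothetical helper: succeeds if a space-separated CPU flags string
# contains avx512_bf16 or amx_bf16 (flag names as reported by the Linux kernel).
has_bf16() {
    case " $1 " in
        *" avx512_bf16 "*|*" amx_bf16 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the flags of the first CPU entry (empty if /proc/cpuinfo is unavailable).
CPU_FLAGS="$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2 || true)"

if has_bf16 "$CPU_FLAGS"; then
    echo "BF16 supported"
else
    echo "BF16 not supported; run FP32 instead"
fi
```

In a multi-node run, this check would need to pass on every worker node, since each node runs its own fine-tuning processes.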

Add --bf16_ipex_ft 1 to the end of the command:

bash fine-tuning/run_dist.sh -np 2 -ppn 2 bash fine-tuning/run_ipex_native.sh --bf16_ipex_ft 1