# How to Run DLSA Inference Pipeline with HF Transformers (FP32, BF16)
## Support Matrix
| Category | Script |
| --- | --- |
| CPU Single Instance | `cpu_single_instance.sh` |
| CPU Multi Instances | `cpu_multi_instance.sh` |
Note: Please use a fine-tuned model to get correct accuracy. Just change `MODEL_NAME_OR_PATH` in the script before running it. By default, `MODEL_NAME_OR_PATH` is `bert-large-uncased`, which is downloaded from the Hugging Face Hub.
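As a minimal sketch of what that edit might look like, assuming the script defines the model as a plain shell variable (the fine-tuned path below is hypothetical and stands in for your own checkpoint directory):

```bash
# Inside inference/cpu_single_instance.sh (or cpu_multi_instance.sh):
# MODEL_NAME_OR_PATH="bert-large-uncased"          # default, pulled from the Hugging Face Hub
MODEL_NAME_OR_PATH="/path/to/your/fine-tuned-bert" # hypothetical fine-tuned checkpoint
```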
## Running on CPU
### Single instance
```bash
./inference/cpu_single_instance.sh
```
By default, it launches one instance to run inference with the SST-2 dataset at FP32 precision. You can change the configuration in the file or pass parameters when running the script.
Below is the help message, printed by `./inference/cpu_single_instance.sh -h`:
```
Usage: ./inference/cpu_single_instance.sh [OPTIONS]
OPTION includes:
   -l | --log_name     - the log name of this round
   -d | --dataset      - [imdb|sst2] whether to use the imdb or sst2 dataset
   -b | --batch_size   - batch size per instance
   -s | --sequence_len - max sequence length
   --bf16              - whether to use HF BF16 inference
   --use_ipex          - whether to use IPEX
   -h | --help         - displays this message
```
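For example, a single-instance run that combines several of the options above (the values are illustrative, not tuned recommendations):

```bash
# SST-2 dataset, batch size 32, max sequence length 128,
# BF16 inference with IPEX enabled, logs tagged "sst2_bf16".
./inference/cpu_single_instance.sh -d sst2 -b 32 -s 128 --bf16 --use_ipex -l sst2_bf16
```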
### Multi-instance
```bash
./inference/cpu_multi_instance.sh
```
By default, it launches two instances (one instance per socket) to run inference with the SST-2 dataset at FP32 precision. You can change the configuration in the file or pass parameters when running the script.
Below is the help message, printed by `./inference/cpu_multi_instance.sh -h`:
```
Usage: ./inference/cpu_multi_instance.sh [OPTIONS]
OPTION includes:
   -l | --log_name              - the log name of this round
   -d | --dataset               - [imdb|sst2] whether to use the imdb or sst2 dataset
   -n | --num_of_ins_per_socket - number of instances per socket
   -b | --batch_size            - batch size per instance
   -s | --sequence_len          - max sequence length
   --bf16                       - whether to use HF BF16 inference
   --use_ipex                   - whether to use IPEX
   -h | --help                  - displays this message
```
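As with the single-instance script, here is an illustrative multi-instance run (values are examples only; tune them for your machine):

```bash
# imdb dataset, 2 instances per socket, batch size 16 per instance,
# max sequence length 256, default FP32 precision, logs tagged "imdb_fp32".
./inference/cpu_multi_instance.sh -d imdb -n 2 -b 16 -s 256 -l imdb_fp32
```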