Benchmark
---------

1. Introduction
2. Get Started with Benchmark API
3. Examples
    3.1. Stock PyTorch Model
    3.2. IPEX Model
    3.3. Benchmark Output
Introduction
Benchmark is used to measure model performance under objective settings. It is inherited from the Intel® Neural Compressor Benchmark.
Get Started with Benchmark API
The class `BenchmarkConfig` allows users to adjust the following parameters to measure model performance under objective settings:

- `backend` (str, optional): the backend used for benchmarking. Defaults to "torch".
- `warmup` (int, optional): number of iterations to skip when collecting latency. Defaults to 5.
- `iteration` (int, optional): total number of iterations when collecting latency. Defaults to 20.
- `cores_per_instance` (int, optional): number of cores used by one instance. Defaults to 4.
- `num_of_instance` (int, optional): number of instances. Defaults to -1.
- `torchscript` (bool, optional): enable to JIT-trace the model before benchmarking. Defaults to False.
- `generate` (bool, optional): enable to use the model's generate method when benchmarking. Defaults to False.
Note: Benchmark can automatically run with multiple instances through the `cores_per_instance` and `num_of_instance` settings (CPU only). Please make sure `cores_per_instance * num_of_instance` does not exceed the number of physical CPU cores.
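As a quick sanity check for that constraint, the product can be compared against the machine's core count before constructing the config. This is a minimal sketch, not part of the library's API; note that `os.cpu_count()` reports logical cores, so the physical-core limit is stricter than this check.

```python
import os

cores_per_instance = 4
num_of_instance = 2
total_requested = cores_per_instance * num_of_instance

# os.cpu_count() reports logical cores; with SMT enabled the physical
# count may be half of this, so a stricter check would use that instead.
fits = total_requested <= os.cpu_count()
print(total_requested, fits)
```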
Examples
Example inputs or a dataloader are required for benchmarking.
```python
import transformers
from datasets import load_dataset

def get_example_inputs(model_name, dataset_name='sst2'):
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
    dataset = load_dataset(dataset_name, split='validation')
    # lambada stores raw text under 'text'; GLUE-style datasets such as sst2 use 'sentence'
    text = dataset[0]['text'] if dataset_name == 'lambada' else dataset[0]['sentence']
    example_inputs = tokenizer(text, padding='max_length', max_length=195, return_tensors='pt')
    return example_inputs
```
Stock PyTorch Model
```python
from intel_extension_for_transformers.transformers import BenchmarkConfig
from intel_extension_for_transformers.transformers.benchmark import benchmark

config = BenchmarkConfig(
    batch_size=16,
    cores_per_instance=4,
    num_of_instance=-1,
)
example_inputs = get_example_inputs(model_name_or_path)
benchmark(model_name_or_path, config, example_inputs=example_inputs)
```
IPEX Model
```python
from intel_extension_for_transformers.transformers import BenchmarkConfig
from intel_extension_for_transformers.transformers.benchmark import benchmark

config = BenchmarkConfig(
    backend='ipex',
    batch_size=16,
    cores_per_instance=4,
    num_of_instance=-1,
)
example_inputs = get_example_inputs(model_name_or_path)
benchmark(model_name_or_path, config, example_inputs=example_inputs)
```
Benchmark Output
```
**********************************************
|****Multiple Instance Benchmark Summary*****|
+---------------------------------+----------+
|              Items              |  Result  |
+---------------------------------+----------+
| Latency average [second/sample] |  0.003   |
| Throughput sum [samples/second] | 5071.933 |
+---------------------------------+----------+
```
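The two rows are reported differently: latency is averaged per sample within one instance, while throughput is summed across all instances. A rough back-of-the-envelope check on the numbers above (approximate, since the table rounds its values, and the implied instance count is an interpretation, not something the tool reports):

```python
latency_avg = 0.003        # second/sample, from the summary table
throughput_sum = 5071.933  # samples/second, from the summary table

# per-instance throughput implied by the average latency
per_instance_throughput = 1.0 / latency_avg
# implied number of concurrently running instances
implied_instances = throughput_sum / per_instance_throughput
print(round(per_instance_throughput), round(implied_instances))
```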