Optimization Orchestration
============

1. [Introduction](#introduction)

    1.1. [One-shot](#one-shot)

2. [Orchestration Support Matrix](#orchestration-support-matrix)

3. [Get Started with Orchestration API](#get-started-with-orchestration-api)

4. [Examples](#examples)

## Introduction

Orchestration is the combination of multiple optimization techniques applied simultaneously (one-shot). Intel Neural Compressor supports arbitrary meaningful combinations of its supported optimization methods under one-shot, such as pruning during quantization-aware training.

### One-shot

Because quantization-aware training, pruning, and distillation all rely on the training process, they can be combined in a single (one-shot) training run. Such combinations often yield better performance and accuracy than a single compression technique, and are usually about as efficient as applying just one. Three such combinations are shown below.

- Pruning during quantization-aware training
- Distillation with pattern lock pruning
- Distillation with pattern lock pruning and quantization-aware training

## Orchestration Support Matrix

| Orchestration | Combinations | Supported |
| :--- | :--- | :---: |
| One-shot | Pruning + Quantization Aware Training | ✔ |
| One-shot | Distillation + Quantization Aware Training | ✔ |
| One-shot | Distillation + Pruning | ✔ |
| One-shot | Distillation + Pruning + Quantization Aware Training | ✔ |

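
To make the "pattern lock" idea behind the distillation and pruning combinations more concrete, here is a minimal, library-independent sketch. It assumes that pattern lock pruning means keeping already-pruned (zero) weights at zero while the remaining weights keep training. The helper names `make_mask`, `sgd_step`, and `apply_mask` are hypothetical and are not part of Intel Neural Compressor.

```python
def make_mask(weights):
    """Record the sparsity pattern: True where a weight is non-zero."""
    return [w != 0.0 for w in weights]


def sgd_step(weights, grads, lr):
    """Plain gradient-descent update on a flat list of weights."""
    return [w - lr * g for w, g in zip(weights, grads)]


def apply_mask(weights, mask):
    """Force pruned positions back to zero so the pattern stays locked."""
    return [w if keep else 0.0 for w, keep in zip(weights, mask)]


# A toy, already-pruned weight vector (positions 1 and 3 are pruned).
weights = [0.5, 0.0, -0.3, 0.0]
mask = make_mask(weights)

# Gradients are non-zero everywhere, including at pruned positions.
grads = [0.1, 0.2, -0.1, 0.4]

# Update, then re-apply the locked pattern.
weights = apply_mask(sgd_step(weights, grads, lr=0.1), mask)
```

After the step, the pruned positions are still zero while the other weights have moved, which is why such pruning can share one training loop with distillation or quantization-aware training.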
## Get Started with Orchestration API

Neural Compressor defines a `Scheduler` class that automatically pipelines model optimizations in a one-shot way. The user instantiates each model optimization component (quantization, pruning, distillation) separately and then appends these objects to the scheduler's pipeline; the scheduler API executes them one by one.

The following example applies distillation and pruning in a one-shot way:

```python
from neural_compressor.training import prepare_compression
from neural_compressor.config import DistillationConfig, KnowledgeDistillationLossConfig, WeightPruningConfig

# `model`, `teacher_model`, `dataloader`, `optimizer`, and `epochs` are assumed
# to be defined by the user before this point.
distillation_criterion = KnowledgeDistillationLossConfig()
# The first argument of DistillationConfig is the teacher model.
d_conf = DistillationConfig(teacher_model, distillation_criterion)
p_conf = WeightPruningConfig()
compression_manager = prepare_compression(model=model, confs=[d_conf, p_conf])

compression_manager.callbacks.on_train_begin()
model = compression_manager.model  # train the model wrapped by the manager

for epoch in range(epochs):
    compression_manager.callbacks.on_epoch_begin(epoch)
    for i, batch in enumerate(dataloader):
        compression_manager.callbacks.on_step_begin(i)
        # ...... (remaining per-step setup elided in the original)
        output = model(batch)
        loss = ...  # compute the task loss here (elided in the original)
        # Let the compression components adjust the loss (e.g. add the distillation term).
        loss = compression_manager.callbacks.on_after_compute_loss(batch, output, loss)
        loss.backward()
        compression_manager.callbacks.on_before_optimizer_step()
        optimizer.step()
        compression_manager.callbacks.on_step_end()
    compression_manager.callbacks.on_epoch_end()
compression_manager.callbacks.on_train_end()

model.save('./path/to/save')
```

## Examples

[Orchestration Examples](../../examples/README.html#orchestration)
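
As a toy illustration of how a single training loop can drive several optimization components one by one, the sketch below fans each training hook out to every component appended to a pipeline. It is not the library's implementation: `ToyComponent` and `ToyScheduler` are hypothetical names, and the fixed `extra_loss` value merely stands in for a distillation term.

```python
class ToyComponent:
    """Hypothetical stand-in for one compression component."""

    def __init__(self, name, log, extra_loss=0.0):
        self.name = name
        self.log = log
        self.extra_loss = extra_loss  # stand-in for e.g. a distillation term

    def on_step_begin(self, step):
        self.log.append((self.name, "on_step_begin"))

    def on_after_compute_loss(self, loss):
        self.log.append((self.name, "on_after_compute_loss"))
        return loss + self.extra_loss

    def on_step_end(self):
        self.log.append((self.name, "on_step_end"))


class ToyScheduler:
    """Forwards each training hook to every appended component, in order."""

    def __init__(self):
        self.pipeline = []

    def append(self, component):
        self.pipeline.append(component)

    def on_step_begin(self, step):
        for component in self.pipeline:
            component.on_step_begin(step)

    def on_after_compute_loss(self, loss):
        # Each component may adjust the loss before passing it on.
        for component in self.pipeline:
            loss = component.on_after_compute_loss(loss)
        return loss

    def on_step_end(self):
        for component in self.pipeline:
            component.on_step_end()


log = []
scheduler = ToyScheduler()
scheduler.append(ToyComponent("distillation", log, extra_loss=0.5))
scheduler.append(ToyComponent("pruning", log))

# One step of a single (one-shot) training loop drives both components.
scheduler.on_step_begin(0)
task_loss = 1.0  # placeholder for the model's own loss
total_loss = scheduler.on_after_compute_loss(task_loss)
scheduler.on_step_end()
```

Here `log` records that, for every hook, the distillation component runs before the pruning component, matching the order in which they were appended.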