Getting Started
===============

## Installation

The Intel® Neural Compressor library is released as part of the [Intel® oneAPI AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html) (AI Kit). The AI Kit provides a consolidated package of Intel's latest deep learning and machine learning optimizations all in one place for ease of development. Along with Neural Compressor, the AI Kit includes Intel-optimized versions of deep learning frameworks (such as TensorFlow and PyTorch) and high-performing Python libraries that streamline end-to-end data science and AI workflows on Intel architectures.

### Linux Installation

You can install just the library from binary or source, or you can get the Intel-optimized framework together with the library by installing the Intel® oneAPI AI Analytics Toolkit.

#### Install from binary

```Shell
# install from pip
pip install neural-compressor

# install from conda
conda install neural-compressor -c conda-forge -c intel
```

#### Install from source

```Shell
git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -r requirements.txt
python setup.py install
```

#### Install from AI Kit

The AI Kit, which includes the library, is distributed through many common channels, including Intel's website, YUM, APT, Anaconda, and more. Select and [download](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit/download.html) the AI Kit distribution package that's best suited for you and follow the [Get Started Guide](https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html) for post-installation instructions.

|[Download AI Kit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit/) |[AI Kit Get Started Guide](https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html) |
|---|---|

### Windows Installation

**Prerequisites**

The following prerequisites and requirements must be satisfied for a successful installation:

- Python version: 3.6, 3.7, 3.8, or 3.9
- Download and install [Anaconda](https://anaconda.org/).
- Create a virtual environment named nc in Anaconda:

```shell
# Here we install python 3.7 as an example. You can also choose python 3.6, 3.8, or 3.9.
conda create -n nc python=3.7
conda activate nc
```

#### Install from binary

```Shell
# install from pip
pip install neural-compressor

# install from conda
conda install neural-compressor -c conda-forge -c intel
```

#### Install from source

```shell
git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -r requirements.txt
python setup.py install
```

## Examples

[Examples](examples_readme.md) are provided to demonstrate the usage of Intel® Neural Compressor in different frameworks: TensorFlow, PyTorch, MXNet, and ONNX Runtime. Hello World examples are also available, and a minimal usage sketch is shown below, after the Developer Documentation section.

## Developer Documentation

View the Neural Compressor [Documentation](doclist.rst) for getting started, deep dive, and advanced resources to help you use and develop Neural Compressor.
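
As a quick orientation, the following is a minimal sketch of accuracy-driven post-training quantization with Neural Compressor. It assumes the experimental, YAML-driven `Quantization` API and an existing `conf.yaml` that describes the framework, calibration dataloader, and accuracy criterion; the model path is an illustrative placeholder, and method names may vary between releases, so treat the Hello World examples as the authoritative reference.

```python
# Minimal sketch (illustrative, not taken from the examples above):
# accuracy-driven post-training quantization driven by a YAML config.
# conf.yaml is assumed to define the framework, calibration dataloader,
# and accuracy criterion; model.pb is a placeholder model path.
from neural_compressor.experimental import Quantization, common

quantizer = Quantization("./conf.yaml")        # load tuning/quantization settings
quantizer.model = common.Model("./model.pb")   # wrap the FP32 model (e.g. a TensorFlow frozen graph)
quantized_model = quantizer.fit()              # calibrate, quantize, and tune against the accuracy goal
quantized_model.save("./quantized_model")      # save the resulting INT8 model
```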

## System Requirements

Intel® Neural Compressor supports systems based on [Intel 64 architecture or compatible processors](https://en.wikipedia.org/wiki/X86-64) and is specially optimized for the following CPUs:

* Intel Xeon Scalable processors (formerly Skylake, Cascade Lake, Cooper Lake, and Ice Lake)
* future Intel Xeon Scalable processors (code name Sapphire Rapids)

Intel® Neural Compressor requires that you install the Intel-optimized version of the DL framework you use: TensorFlow, PyTorch, MXNet, or ONNX Runtime (for example, the Intel-optimized TensorFlow build is published on PyPI as `intel-tensorflow`).

Note: For some TensorFlow versions, Intel® Neural Compressor supports both the Intel-optimized and the official framework builds. Refer to [Supported Frameworks](../README.md#Supported-Frameworks) for specifics.

### Validated Hardware/Software Environment

| Platform | OS | Python | Framework | Version |
|---|---|---|---|---|
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | TensorFlow | 2.5.0, 2.4.0, 2.3.0, 2.2.0, 2.1.0, 1.15.0 UP1, 1.15.0 UP2, 1.15.0 UP3, 1.15.2 |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | PyTorch | 1.5.0+cpu, 1.6.0+cpu, 1.8.0+cpu |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | IPEX | |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | MXNet | 1.7.0, 1.6.0 |
| Cascade Lake, Cooper Lake, Skylake, Ice Lake | CentOS 8.3, Ubuntu 18.04 | 3.6, 3.7, 3.8, 3.9 | ONNX Runtime | 1.6.0, 1.7.0, 1.8.0 |

### Validated Models

The following tables list quantization accuracy and performance speedup (reported as the realtime latency ratio FP32/INT8) for models validated with Intel® Neural Compressor.

| Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Accuracy Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
|---|---|---|---|---|---|---|---|
| tensorflow | 2.4.0 | resnet50v1.5 | ImageNet | 76.70% | 76.50% | 0.26% | 3.23x |
| tensorflow | 2.4.0 | Resnet101 | ImageNet | 77.20% | 76.40% | 1.05% | 2.42x |
| tensorflow | 2.4.0 | inception_v1 | ImageNet | 70.10% | 69.70% | 0.57% | 1.88x |
| tensorflow | 2.4.0 | inception_v2 | ImageNet | 74.10% | 74.00% | 0.14% | 1.96x |
| tensorflow | 2.4.0 | inception_v3 | ImageNet | 77.20% | 76.70% | 0.65% | 2.36x |
| tensorflow | 2.4.0 | inception_v4 | ImageNet | 80.00% | 80.30% | -0.37% | 2.59x |
| tensorflow | 2.4.0 | inception_resnet_v2 | ImageNet | 80.10% | 80.40% | -0.37% | 1.97x |
| tensorflow | 2.4.0 | Mobilenetv1 | ImageNet | 71.10% | 71.00% | 0.14% | 2.88x |
| tensorflow | 2.4.0 | ssd_resnet50_v1 | Coco | 37.90% | 38.00% | -0.26% | 2.97x |
| tensorflow | 2.4.0 | mask_rcnn_inception_v2 | Coco | 28.90% | 29.10% | -0.69% | 2.66x |
| tensorflow | 2.4.0 | vgg16 | ImageNet | 72.50% | 70.90% | 2.26% | 3.75x |
| tensorflow | 2.4.0 | vgg19 | ImageNet | 72.40% | 71.00% | 1.97% | 3.79x |

| Framework | Version | Model | Dataset | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Accuracy Ratio [(INT8-FP32)/FP32] | Realtime Latency Ratio [FP32/INT8] |
|---|---|---|---|---|---|---|---|
| pytorch | 1.5.0+cpu | resnet50 | ImageNet | 75.96% | 76.13% | -0.23% | 2.63x |
| pytorch | 1.5.0+cpu | resnext101_32x8d | ImageNet | 79.12% | 79.31% | -0.24% | 2.61x |
| pytorch | 1.6.0a0+24aac32 | bert_base_mrpc | MRPC | 88.90% | 88.73% | 0.19% | 1.98x |
| pytorch | 1.6.0a0+24aac32 | bert_base_cola | COLA | 59.06% | 58.84% | 0.37% | 2.19x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sts-b | STS-B | 88.40% | 89.27% | -0.97% | 2.28x |
| pytorch | 1.6.0a0+24aac32 | bert_base_sst-2 | SST-2 | 91.51% | 91.86% | -0.37% | 2.30x |
| pytorch | 1.6.0a0+24aac32 | bert_base_rte | RTE | 69.31% | 69.68% | -0.52% | 2.15x |
| pytorch | 1.6.0a0+24aac32 | bert_large_mrpc | MRPC | 87.45% | 88.33% | -0.99% | 2.73x |
| pytorch | 1.6.0a0+24aac32 | bert_large_squad | SQUAD | 92.85% | 93.05% | -0.21% | 2.01x |
| pytorch | 1.6.0a0+24aac32 | bert_large_qnli | QNLI | 91.20% | 91.82% | -0.68% | 2.69x |