Intel® Extension for TensorFlow* for C++
This guide shows how to build an Intel® Extension for TensorFlow* CC library from source and how to work with tensorflow_cc to build bindings for C/C++ languages on Ubuntu.
Requirements
Hardware Requirements
Verified Hardware Platforms:
Intel® CPU (Xeon, Core)
Intel® Arc™ Graphics (experimental)
Common Requirements
Install Bazel
To build Intel® Extension for TensorFlow*, install Bazel 5.3.0. Refer to install Bazel.
Here are the recommended commands:
$ wget https://github.com/bazelbuild/bazel/releases/download/5.3.0/bazel-5.3.0-installer-linux-x86_64.sh
$ bash bazel-5.3.0-installer-linux-x86_64.sh --user
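The `--user` installer places the `bazel` binary under `$HOME/bin`, which may not be on your PATH yet. A minimal sketch to make it visible in the current shell session (the install location is the installer's documented `--user` default):

```shell
# The bazel installer's --user mode installs to $HOME/bin; add that
# directory to PATH for the current shell session.
export PATH="$PATH:$HOME/bin"
```

Add the same line to your `~/.bashrc` to make it persistent.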
Check Bazel is installed successfully and is version 5.3.0:
$ bazel --version
Download Source Code
$ git clone https://github.com/intel/intel-extension-for-tensorflow.git intel-extension-for-tensorflow
$ cd intel-extension-for-tensorflow/
Create a Conda Environment
Install Conda.
Create Virtual Running Environment
$ conda create -n itex_build python=3.10
$ conda activate itex_build
Note: Python versions 3.8 through 3.11 are supported.
Install TensorFlow
Install TensorFlow 2.14.0, and refer to Install TensorFlow for details.
$ pip install tensorflow==2.14.0
Check TensorFlow was installed successfully and is version 2.14.0:
$ python -c "import tensorflow as tf;print(tf.__version__)"
Extra Requirements for XPU/GPU Build Only
Install Intel GPU Driver
Install the Intel GPU Driver on the build server; it is needed to build with GPU support and AOT (Ahead-of-Time compilation).
Refer to Install Intel GPU driver for details.
Note:
Make sure to install developer runtime packages before building Intel® Extension for TensorFlow*.
AOT (Ahead-of-time compilation)
AOT is a compile option that reduces the initialization time of GPU kernels at startup by generating the binary code for a specified hardware platform during the build. AOT makes the installation package larger but improves startup performance.
Without AOT, Intel® Extension for TensorFlow* is translated to binary code for the local hardware platform during startup, which can prolong startup time to several minutes or more when using a GPU.
For more information, refer to Use AOT for Integrated Graphics (Intel GPU).
Install oneAPI Base Toolkit
We recommend you install the oneAPI base toolkit using sudo (or as the root user) to the system directory /opt/intel/oneapi.
The following commands assume the oneAPI base toolkit is installed in /opt/intel/oneapi. If you installed it in another folder, update the oneAPI path as appropriate.
Refer to Install oneAPI Base Toolkit Packages
The oneAPI base toolkit provides compiler and libraries needed by Intel® Extension for TensorFlow*.
Enable oneAPI components:
$ source /opt/intel/oneapi/compiler/latest/env/vars.sh
$ source /opt/intel/oneapi/mkl/latest/env/vars.sh
Build Intel® Extension for TensorFlow* CC library
Configure
Configure For CPU
Configure the system build by running the ./configure command at the root of your cloned Intel® Extension for TensorFlow* source tree.
$ ./configure
Choose n to build for CPU only. Refer to Configure Example.
Configure For GPU
Configure the system build by running the ./configure command at the root of your cloned Intel® Extension for TensorFlow* source tree. This script prompts you for the location of Intel® Extension for TensorFlow* dependencies and asks for additional build configuration options (for example, the path to the DPC++ compiler).
$ ./configure
Choose Y for Intel GPU support. Refer to Configure Example.
Specify the location of the compiler (DPC++).
The default is /opt/intel/oneapi/compiler/latest/linux/, which is the default installation path. Press Enter to confirm the default location. If it's different, confirm the compiler (DPC++) installation path and enter the correct path.
Specify the Ahead-of-Time (AOT) compilation platforms.
The default is '', which means no AOT.
Enter one or more device type strings for the target hardware platforms, such as ats-m150 or acm-g11. Here is the list of GPUs we've verified:
GPU | device type |
---|---|
Intel® Data Center GPU Flex Series 170 | ats-m150 |
Intel® Data Center GPU Flex Series 140 | ats-m75 |
Intel® Data Center GPU Max Series | pvc |
Intel® Arc™ A730M | acm-g10 |
Intel® Arc™ A380 | acm-g11 |
To learn how to get the device type, refer to Use AOT for Integrated Graphics (Intel GPU) or create an issue to ask for support.
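When targeting more than one platform, the configure prompt accepts a comma-separated list of device type strings. A small sketch of composing that answer from the table above (the variable name is just for illustration):

```shell
# Combine device types from the verified-GPU table into the single
# comma-separated string that ./configure expects for AOT targets.
AOT_DEVICES="ats-m150,acm-g11"
echo "$AOT_DEVICES"
```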
Choose whether to build with oneMKL support.
We recommend choosing y.
The default is /opt/intel/oneapi/mkl/latest, which is the default installation path. Press Enter to confirm the default location. If it's wrong, confirm the oneMKL installation path and enter the correct path.
Build Source Code
For GPU support
$ bazel build -c opt --config=gpu //itex:libitex_gpu_cc.so
CC library location: <Path to intel-extension-for-tensorflow>/bazel-bin/itex/libitex_gpu_cc.so
NOTE: libitex_gpu_cc.so depends on libitex_gpu_xetla.so, so libitex_gpu_xetla.so should be copied to the same directory as libitex_gpu_cc.so:
$ cd <Path to intel-extension-for-tensorflow>
$ cp bazel-out/k8-opt-ST-*/bin/itex/core/kernels/gpu/libitex_gpu_xetla.so bazel-bin/itex/
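A quick sanity check before linking can save a confusing runtime failure. This sketch (a hypothetical helper, not part of the project) verifies that the libraries ended up side by side:

```shell
# check_colocated DIR LIB...: report any library missing from DIR.
# Returns non-zero if any required file is absent.
check_colocated() {
  dir="$1"; shift
  missing=0
  for lib in "$@"; do
    if [ ! -f "$dir/$lib" ]; then
      echo "missing: $dir/$lib" >&2
      missing=1
    fi
  done
  return $missing
}

# Usage (paths follow the bazel output layout shown above):
# check_colocated bazel-bin/itex libitex_gpu_cc.so libitex_gpu_xetla.so
```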
For CPU support
$ bazel build -c opt --config=cpu //itex:libitex_cpu_cc.so
To build with the threadpool, add the build option --define=build_with_threadpool=true and set the environment variable ITEX_OMP_THREADPOOL=0:
$ bazel build -c opt --config=cpu --define=build_with_threadpool=true //itex:libitex_cpu_cc.so
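At run time, the threadpool build expects the matching environment variable to be set in the process environment before the binary launches (an assumption based on the environment variable named above):

```shell
# Match the --define=build_with_threadpool=true build at run time by
# exporting the corresponding environment variable before launching.
export ITEX_OMP_THREADPOOL=0
```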
CC library location: <Path to intel-extension-for-tensorflow>/bazel-bin/itex/libitex_cpu_cc.so
NOTE: libitex_cpu_cc.so depends on libiomp5.so, so libiomp5.so should be copied to the same directory as libitex_cpu_cc.so:
$ cd <Path to intel-extension-for-tensorflow>
$ cp bazel-out/k8-opt-ST-*/bin/external/llvm_openmp/libiomp5.so bazel-bin/itex/
Prepare Tensorflow* CC library and header files
Option 1: Extract from Tensorflow* python package (Recommended)
a. Download Tensorflow* 2.14.0 python package
$ wget https://files.pythonhosted.org/packages/09/63/25e76075081ea98ec48f23929cefee58be0b42212e38074a9ec5c19e838c/tensorflow-2.14.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
b. Unzip Tensorflow* python package
$ unzip tensorflow-2.14.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl -d tensorflow_src
c. Create symbolic link
$ cd ./tensorflow_src/tensorflow/
$ ln -s libtensorflow_cc.so.2 libtensorflow_cc.so
$ ln -s libtensorflow_framework.so.2 libtensorflow_framework.so
libtensorflow_cc.so location: <Path to tensorflow_src>/tensorflow/libtensorflow_cc.so
libtensorflow_framework.so location: <Path to tensorflow_src>/tensorflow/libtensorflow_framework.so
Tensorflow header file location: <Path to tensorflow_src>/tensorflow/include
Option 2: Build from TensorFlow* source code
a. Prepare TensorFlow* source code
$ git clone https://github.com/tensorflow/tensorflow.git
$ cd tensorflow
$ git checkout origin/r2.14 -b r2.14
b. Build libtensorflow_cc.so
$ ./configure
$ bazel build --jobs 96 --config=opt //tensorflow:libtensorflow_cc.so
$ ls ./bazel-bin/tensorflow/libtensorflow_cc.so
libtensorflow_cc.so location: <Path to tensorflow>/bazel-bin/tensorflow/libtensorflow_cc.so
c. Create symbolic link for libtensorflow_framework.so
$ cd ./bazel-bin/tensorflow/
$ ln -s libtensorflow_framework.so.2 libtensorflow_framework.so
libtensorflow_framework.so location: <Path to tensorflow>/bazel-bin/tensorflow/libtensorflow_framework.so
d. Build Tensorflow header files
$ bazel build --config=opt tensorflow:install_headers
$ ls ./bazel-bin/tensorflow/include
Tensorflow header file location: <Path to tensorflow>/bazel-bin/tensorflow/include
Integrate the CC library
Linker
Configure the linker environmental variables with Intel® Extension for TensorFlow* CC library (libitex_gpu_cc.so or libitex_cpu_cc.so) path:
$ export LIBRARY_PATH=$LIBRARY_PATH:<Path to intel-extension-for-tensorflow>/bazel-bin/itex/
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<Path to intel-extension-for-tensorflow>/bazel-bin/itex/
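Appending the same directory every time a setup script is sourced can bloat these variables. A tiny hypothetical helper (not part of the project) that adds a path only if it is not already present:

```shell
# append_ld_path DIR: append DIR to LD_LIBRARY_PATH unless already listed.
append_ld_path() {
  case ":$LD_LIBRARY_PATH:" in
    *":$1:"*) ;;  # already present, do nothing
    *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1"
       export LD_LIBRARY_PATH ;;
  esac
}

# Usage:
# append_ld_path "<Path to intel-extension-for-tensorflow>/bazel-bin/itex"
```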
Load
TensorFlow* provides the C API TF_LoadPluggableDeviceLibrary to support pluggable device libraries.
To load the Intel® Extension for TensorFlow* CC library, modify the original C++ code as follows:
a. Add the header file "tensorflow/c/c_api_experimental.h".
#include "tensorflow/c/c_api_experimental.h"
b. Load libitex_gpu_cc.so or libitex_cpu_cc.so with TF_LoadPluggableDeviceLibrary.
TF_Status* status = TF_NewStatus();
TF_LoadPluggableDeviceLibrary(<lib_path>, status);
Example
A simple original example using the TensorFlow* C++ API:
// example.cc
#include "tensorflow/cc/client/client_session.h"
#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/tensor.h"
int main() {
using namespace tensorflow;
using namespace tensorflow::ops;
Scope root = Scope::NewRootScope();
auto X = Variable(root, {5, 2}, DataType::DT_FLOAT);
auto assign_x = Assign(root, X, RandomNormal(root, {5, 2}, DataType::DT_FLOAT));
auto Y = Variable(root, {2, 3}, DataType::DT_FLOAT);
auto assign_y = Assign(root, Y, RandomNormal(root, {2, 3}, DataType::DT_FLOAT));
auto Z = Const(root, 2.f, {5, 3});
auto V = MatMul(root, assign_x, assign_y);
auto VZ = Add(root, V, Z);
std::vector<Tensor> outputs;
ClientSession session(root);
// Run and fetch VZ
TF_CHECK_OK(session.Run({VZ}, &outputs));
LOG(INFO) << "Output:\n" << outputs[0].matrix<float>();
return 0;
}
The updated example with Intel® Extension for TensorFlow* enabled
// example.cc
#include "tensorflow/cc/client/client_session.h"
#include "tensorflow/cc/ops/standard_ops.h"
#include "tensorflow/core/framework/tensor.h"
+ #include "tensorflow/c/c_api_experimental.h"
int main() {
using namespace tensorflow;
using namespace tensorflow::ops;
+ TF_Status* status = TF_NewStatus();
+ string xpu_lib_path = "libitex_gpu_cc.so";
+ TF_LoadPluggableDeviceLibrary(xpu_lib_path.c_str(), status);
+ TF_Code code = TF_GetCode(status);
+ if ( code == TF_OK ) {
+ LOG(INFO) << "intel-extension-for-tensorflow load successfully!";
+ } else {
+ string status_msg(TF_Message(status));
+ LOG(WARNING) << "Could not load intel-extension-for-tensorflow, please check! " << status_msg;
+ }
Scope root = Scope::NewRootScope();
auto X = Variable(root, {5, 2}, DataType::DT_FLOAT);
auto assign_x = Assign(root, X, RandomNormal(root, {5, 2}, DataType::DT_FLOAT));
auto Y = Variable(root, {2, 3}, DataType::DT_FLOAT);
auto assign_y = Assign(root, Y, RandomNormal(root, {2, 3}, DataType::DT_FLOAT));
auto Z = Const(root, 2.f, {5, 3});
auto V = MatMul(root, assign_x, assign_y);
auto VZ = Add(root, V, Z);
std::vector<Tensor> outputs;
ClientSession session(root);
// Run and fetch VZ
TF_CHECK_OK(session.Run({VZ}, &outputs));
LOG(INFO) << "Output:\n" << outputs[0].matrix<float>();
return 0;
}
Build and run
Place a Makefile in the same directory as example.cc with the following contents:
Replace <TF_INCLUDE_PATH> with the local TensorFlow* header file path, e.g. <Path to tensorflow_src>/tensorflow/include
Replace <TFCC_PATH> with the local TensorFlow* CC library path, e.g. <Path to tensorflow_src>/tensorflow/
// Makefile
target = example_test
cc = g++
TF_INCLUDE_PATH = <TF_INCLUDE_PATH>
TFCC_PATH = <TFCC_PATH>
include = -I $(TF_INCLUDE_PATH)
lib = -L $(TFCC_PATH) -ltensorflow_framework -ltensorflow_cc
flag = -Wl,-rpath=$(TFCC_PATH) -std=c++17
source = ./example.cc
$(target): $(source)
$(cc) $(source) -o $(target) $(include) $(lib) $(flag)
clean:
rm $(target)
run:
./$(target)
Go to the directory containing example.cc and the Makefile, then build and run the example.
$ make
$ ./example_test
NOTE: For GPU support, set up the oneAPI environment variables before running the example.
$ source /opt/intel/oneapi/compiler/latest/env/vars.sh
$ source /opt/intel/oneapi/mkl/latest/env/vars.sh