Emulating Compiler Behavior
When CBI processes a file, it tries to obey all of the arguments that it can see in the compilation database. Unfortunately, compilers often have behaviors that are not reflected on the command line (such as their default include paths, or compiler version macros).
If we believe (or already know!) that these behaviors will impact the divergence calculation for a code base, we can use a configuration file to instruct CBI to append additional options when emulating certain compilers.
Attention
If you encounter a situation that is not supported by CBI and which cannot be described by our existing configuration files, please open an issue.
Motivating Example
The foo.cpp files in our sample code base include a specialization that we have ignored so far, which selects a line based on the value of the __GNUC__ preprocessor macro:
// Copyright (c) 2024 Intel Corporation
// SPDX-License-Identifier: 0BSD
#include <cstdio>

void foo() {
#if __GNUC__ >= 13
    printf("Using a feature that is only available in GCC 13 and later.\n");
#else
    printf("Running the rest of foo() on the CPU.\n");
#endif
}
This macro is defined automatically by all GNU compilers and is set based on the compiler’s major version. For example, gcc version 13.0.0 would set __GNUC__ to 13. Checking the values of macros like this one can be useful when specializing code paths to work around bugs in specific compilers, or to make use of functionality that is only available in newer compiler versions.
Let’s take another look at the compilation database entry for this file:
[
  {
    "directory": "/home/username/src/build-cpu",
    "command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/main.cpp.o -c /home/username/src/main.cpp",
    "file": "/home/username/src/main.cpp"
  },
  {
    "directory": "/home/username/src/build-cpu",
    "command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/third-party/library.cpp.o -c /home/username/src/third-party/library.cpp",
    "file": "/home/username/src/third-party/library.cpp"
  },
  {
    "directory": "/home/username/src/build-cpu",
    "command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/cpu/foo.cpp.o -c /home/username/src/cpu/foo.cpp",
    "file": "/home/username/src/cpu/foo.cpp"
  }
]
CBI can see that the compiler used for foo.cpp is called /usr/bin/c++, but there is not enough information to decide what the value of __GNUC__ should be.
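To make the missing information concrete, the following is a minimal Python sketch that inspects the foo.cpp entry from the compilation database above. It is not how CBI itself parses the database; the variable names are illustrative, and the check only looks for the joined -DNAME form of a macro definition.

```python
import json
import shlex
from pathlib import PurePosixPath

# The foo.cpp entry, copied from the compilation database shown above.
entry = json.loads("""
{
  "directory": "/home/username/src/build-cpu",
  "command": "/usr/bin/c++ -o CMakeFiles/tutorial.dir/cpu/foo.cpp.o -c /home/username/src/cpu/foo.cpp",
  "file": "/home/username/src/cpu/foo.cpp"
}
""")

# Split the command the way a POSIX shell would.
argv = shlex.split(entry["command"])

# The only hint about the compiler is the name of its executable.
compiler = PurePosixPath(argv[0]).name

# Collect macro definitions passed explicitly on the command line
# (illustrative: only the joined "-DNAME[=VALUE]" form is recognized).
defines = [arg for arg in argv[1:] if arg.startswith("-D")]

print("compiler:", compiler)
print("explicit defines:", defines)
```

The command line names the compiler and the input and output files, but it carries no -D options, so nothing in the entry says which version of GCC (if any) c++ refers to.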
Defining Behaviors
codebasin searches for a file called .cbi/config, and uses the information found in that file to determine implicit compiler behavior. Each compiler definition is a TOML table, of the form shown below:
[compiler.name]
options = [
"option",
"option"
]
In our example, we would like to define __GNUC__ for the c++ compiler, so we can add the following compiler definition:
[compiler."c++"]
options = [
"-D__GNUC__=13",
]
Important
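The quotes around “c++” are necessary because the + character is not permitted in an unquoted (bare) TOML key. Compiler names that contain only ASCII letters, digits, underscores, and dashes (for example, gcc) do not need quotes.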
With the __GNUC__ macro set, the two lines of code that were previously considered “unused” are assigned to platforms, and the output of codebasin becomes:
-----------------------
Platform Set LOC % LOC
-----------------------
       {cpu}   8  29.63
       {gpu}   8  29.63
  {cpu, gpu}  11  40.74
-----------------------
Code Divergence: 0.59
Unused Code (%): 0.00
Total SLOC: 27