This document describes tools for debugging Simics itself, as well as user-developed modules and software that run with Simics.
Because that scope is wide and the use cases are numerous, the purpose of this document is to introduce the basics and provide a quick start.
The tools described are the state-assertion module,
which comes with the Simics installation,
the two third-party tools GDB and Valgrind,
and the internal Simics memory
tracking system.
A good way to pre-empt troublesome debugging sessions, which often coincide with critical deadlines, is efficient testing. Simics Model Builder provides support for this; see the Model Builder User's Guide.
This chapter describes how to use state-assertion, which is a module that comes with Simics.
In short, state-assertion is used by running the same configuration in two different Simics sessions and comparing the state of the two Simics sessions at specified intervals. It is the attributes of the verified objects that are compared and any difference will instantly cause an alert to the user.
In particular, state-assertion is used to verify checkpointability, which is a key feature in Simics and most useful while debugging Simics, user modules, or target software.
Furthermore, state-assertion can be used to find out where the execution begins to differ after changes in target software or changes in user models.
Simics can be run under state-assertion in two ways: save the evolution of the state to a file and run the second Simics session using that file, or run a second Simics session that receives the states over a network connection.
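The file-based workflow can be pictured with a small Python model. This is only an illustrative sketch and not how state-assertion is implemented: the toy step function, the record and replay helpers, and the gzip-compressed JSON-lines file layout are all assumptions made for this example.

```python
import gzip
import json
import os
import tempfile


def step(state, bug_at=None):
    """Advance a toy 'CPU' by one step; optionally inject a divergence."""
    state["pc"] += 4
    state["rax"] = (state["rax"] * 3 + 1) & 0xFFFF
    if bug_at is not None and state["pc"] == bug_at:
        state["rax"] ^= 0x1  # simulated change in a model or in software
    return state


def record(path, total_steps, interval):
    """First session: save the state every `interval` steps."""
    state = {"pc": 0, "rax": 1}
    with gzip.open(path, "wt") as f:
        for n in range(1, total_steps + 1):
            step(state)
            if n % interval == 0:
                f.write(json.dumps({"step": n, "state": state}) + "\n")


def replay(path, bug_at=None):
    """Second session: rerun and compare against the saved states.

    Returns the step count of the first mismatching snapshot, or None.
    """
    state = {"pc": 0, "rax": 1}
    n = 0
    with gzip.open(path, "rt") as f:
        for line in f:
            saved = json.loads(line)
            while n < saved["step"]:
                step(state, bug_at)
                n += 1
            if state != saved["state"]:
                return saved["step"]
    return None


path = os.path.join(tempfile.mkdtemp(), "test.sa.gz")
record(path, total_steps=100, interval=10)
same = replay(path)                     # identical run: no difference
diverged = replay(path, bug_at=4 * 37)  # run that goes wrong at step 37
print(same, diverged)
```

The divergence injected at step 37 is only noticed at the next saved snapshot (step 40), which is why the interval determines how precisely a difference can be located.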
A short overview for running Simics under state-assertion:
Start Simics, load the configuration, either run the simulation to some point prior to the interesting area to verify and take a checkpoint, or start state-assertion right away.
simics> load-module state-assertion
simics> state-assertion-create-file compression = gz file = /tmp/test.sa
Creating file '/tmp/test.sa' with compression 'gz'
[state-assertion] File created successfully.
sa0 created. You probably want to add some objects or memory space now with 'add' and 'add-mem-lis', then run 'start' to begin the assertion process.
simics> sa0.add obj = "board.mb.cpu0.core[0][0]" steps = 1000000
[state-assertion] Added board.mb.cpu0.core[0][0] (x86QSP1) with save type 1(every 1000000 steps) - version 0.0
simics> sa0.start
[state-assertion] Started
simics> c
running> stop
simics> sa0.stop
[state-assertion] Stopped
simics> quit
simics> load-module state-assertion
simics> state-assertion-open-file file = /tmp/test.sa compression = gz
Opening file '/tmp/test.sa' with compression 'gz'
[state-assertion] File opened successfully.
sa0 opened. You should run 'start' to begin the assertion process.
simics> sa0.start
[state-assertion] Added board.mb.cpu0.core[0][0] (x86QSP1) with save type 1(every 1000000 steps) - version 0.0
[state-assertion] Started
simics> c
[state-assertion::assert] object: 0 - board.mb.cpu0.core[0][0], timestamp 636000000
Name    Assert value          Current value
rax     0x0000000040010100    0x00000000df349d68 <-- diff
rbx     0x00000000ddc65398    0x00000000ddc50398 <-- diff
...
Difference found while asserting
Normally you want to start with a large interval. Once a difference is found, restart state-assertion from the most recent point known to be correct, this time with steps = 1, to pinpoint where the states first differ. Once that is done, re-run the last one or two steps with maximum log level to find out what happens during that step, and debug the execution normally.
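The coarse-then-fine strategy can be sketched in Python. The reference and candidate functions below are hypothetical stand-ins for the two Simics sessions, and the step at which the candidate diverges (636) is an arbitrary value chosen for illustration.

```python
def reference(n):
    """State of the known-good run after n steps (toy model)."""
    return (n * 7) & 0xFF


def candidate(n, first_bad=636):
    """State of the run under test; it diverges from step `first_bad` on."""
    return reference(n) if n < first_bad else reference(n) ^ 0x80


def first_mismatch(start, end, interval):
    """Compare the runs every `interval` steps in the range (start, end].

    Returns (last_good, first_bad_checkpoint), or None if no difference.
    """
    last_good = start
    for n in range(start + interval, end + 1, interval):
        if reference(n) != candidate(n):
            return last_good, n
        last_good = n
    return None


# Pass 1: a large interval brackets the problem cheaply.
coarse = first_mismatch(0, 1000, interval=100)
# Pass 2: restart from the last known-good point with an interval of 1.
fine = first_mismatch(coarse[0], coarse[1], interval=1)
print(coarse, fine)
```

The first pass only narrows the problem to a window of 100 steps; the second pass, restricted to that window, identifies the exact step.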
Rather than saving to a file, it is possible to run two Simics instances connected over the network, one acting as a "producer" and one as a "consumer":
simics2> load-module state-assertion
simics2> state-assertion-receive port = 4711
[state-assertion] Waiting for connection...
simics1> load-module state-assertion
simics1> state-assertion-connect port = 4711
[state-assertion] File created successfully.
sa0 connected. You probably want to add some objects or memory space now with 'add' and 'add-mem-lis', then run 'start' to begin the assertion process.
simics2> sa0.start
Notice that the prompt will not return until the first interval of steps has been executed in the producing Simics instance.
simics1> sa0.add obj = "board.mb.cpu0.core[0][0]" steps = 1000000
[state-assertion] Added board.mb.cpu0.core[0][0] (x86QSP1) with save type 1(every 1000000 steps) - version 0.0
simics1> sa0.start
simics1> c
simics2> c
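The producer/consumer arrangement can be modelled in a few lines of Python. This is a sketch under stated assumptions: the JSON-lines message format and the produce and consume helpers are invented for this example and do not describe the protocol that state-assertion actually uses, and a local socket pair stands in for the network connection.

```python
import json
import socket


def produce(sock, states):
    """Producer side: send one JSON line per saved state, then close."""
    with sock, sock.makefile("w") as out:
        for step, regs in states:
            out.write(json.dumps({"step": step, "regs": regs}) + "\n")


def consume(sock, local_states):
    """Consumer side: block on each incoming state and compare it.

    Returns the first step at which the local state differs, or None.
    """
    with sock, sock.makefile("r") as inp:
        for line, (step, regs) in zip(inp, local_states):
            received = json.loads(line)
            if received["step"] != step or received["regs"] != regs:
                return step
    return None


producer_end, consumer_end = socket.socketpair()
produced = [(1000, {"rax": 1}), (2000, {"rax": 2}), (3000, {"rax": 3})]
observed = [(1000, {"rax": 1}), (2000, {"rax": 9}), (3000, {"rax": 3})]

produce(producer_end, produced)  # small payload, fits in the socket buffer
first_diff = consume(consumer_end, observed)
print(first_diff)
```

As in the real setup, the consumer cannot make progress until the producer has delivered the state for the next comparison point.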
Even though errors found while using state-assertion may seem trivial or unimportant, it is always better to address them as soon as possible. Should you later have to contact Simics Support, it may be necessary to provide a useful checkpoint to enable the support engineers to reproduce the issue.
This chapter describes how to debug errors in DML models in the Simics
environment.
The programming language DML is designed to make it easy to develop
device models. The DML compiler, called dmlc, translates
code written in DML into C source code. However, users who want to
debug device models written in DML do not need to look into the details
of the generated C code as it is possible to directly debug the DML code.
Debugging devices written in DML is very similar to debugging C code.
In this document the GDB shipped with the Simics GDB (#1031)
package is used.
GDB (see http://www.gnu.org/software/gdb/)
is an open source, general purpose debugger that allows you to follow
the execution of a program that runs "inside" GDB, inspect variables,
and many other possibilities described in GDB's on-line manual.
Simics is compiled with a modern GCC version which emits DWARF
version 4 debugging information. The shipped GDB is currently an extension
of GDB 12.1 that handles such information and has DML knowledge.
With the GDB shipped in Simics GDB (#1031) you can:
A short guide for running Simics under GDB:
In the following sections, we will show you how to debug Simics models (written in DML) from Simics CLI.
More information on how to build the modules can be found in the Model Builder User's Guide (the chapter on Build Environment), and the DML 1.2 Reference Manual (the chapter on Running DMLC).
This section prepares a simple synthetic example that causes a segmentation fault and, as a result, crashes Simics. In the following section, we will use the example to show a debugging process.
joe@computer:~$ <simics-installation>/bin/project-setup proj
joe@computer:~: cd proj
joe@computer:~/proj: bin/project-setup --copy-module=simple-broken-device-gdb
This copies a sample device to the folder modules/simple-broken-device-gdb/.
The device is defined by the simple-broken-device-gdb.dml file:
method two(int *val) {
    local int bar = *val;
    log info: "method \"two\" called %d", bar;
}
method one() {
    local int *foo = NULL;
    log info: "method \"one\" called";
    two(foo);
}
attribute int_attr is int64_attr "An integer attribute" {
    method set(attr_value_t val) throws {
        default(val);
        after_set();
    }
    method after_set() throws {
        log info: "attribute int_attr updated";
        one();
    }
}
bank regs {
    register reg size 4 @ 1000 {
        param init_val = 54;
        method set(uint64 val) {
            default(val);
        }
    }
    register reg_array[i < 4] size 4 @ 2000 + i * 4 {
        param init_val = 78 + i;
    }
    group reg_group[i < 4] {
        register reg_array[j < 4] size 4 @ 3000 + i * 20 + j * 4 {
            param init_val = 78 + i * 10 + j;
        }
    }
}
Notice how function "one" will call function "two" with a NULL
pointer, which will definitely cause a crash at the line
local int bar = *val.
joe@computer:~/proj: make
=== Building module "simple-broken-device-gdb" ===
...
joe@computer:~/proj: ./simics
simics> load-module simple-broken-device-gdb
simics> @SIM_create_object("simple_broken_device_gdb", "trbl")
<the simple_broken_device_gdb 'trbl'>
simics> trbl->int_attr = 4711
[trbl info] attribute int_attr updated
[trbl info] method "one" called
Segmentation fault (SIGSEGV) in main thread
#0 0x00007fd04065c448 (...proj/linux64/lib/simple-broken-device-gdb.so + 0x1448)
#1 0x00007fd05588ad67 (...simics/linux64/bin/libsimics-common.so + 0x117d67)
...
The error caused a SIGSEGV and the stack trace points to our own module
as the current frame. With tools such as nm and objdump
it may be possible to pin-point the line in the source code, but with
GDB we may monitor the execution and find a suspect.
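As a complement to the tools named above, the module path and offset in a stack-trace line can be extracted and handed to addr2line, another GNU binutils program (it is not mentioned in the text above and is used here only as an illustration). The sketch below assumes the trace format shown in the session above; the parse_frame and addr2line_command helpers are hypothetical names. The elided path from the trace is kept as-is and must be replaced with the real module path, and useful output requires a module built with debugging information.

```python
import re
import shlex

FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x[0-9a-fA-F]+\s+"
    r"\((?P<module>.+?)\s\+\s(?P<offset>0x[0-9a-fA-F]+)\)"
)


def parse_frame(line):
    """Split a stack-trace line into (frame index, module path, offset)."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return int(match["index"]), match["module"], int(match["offset"], 16)


def addr2line_command(module, offset):
    """Build an addr2line invocation for an offset inside a module."""
    return shlex.join(["addr2line", "-f", "-C", "-e", module, hex(offset)])


line = ("#0 0x00007fd04065c448 "
        "(...proj/linux64/lib/simple-broken-device-gdb.so + 0x1448)")
frame = parse_frame(line)
command = addr2line_command(frame[1], frame[2])
print(frame)
print(command)
```

The offset after the plus sign is relative to the start of the shared object, which is the form of address addr2line expects for a shared library.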
In this section, we will show the detailed debugging process. Note that
the project makefile compiles optimized modules by default, hence we need
to recompile the module with the proper compilation options for a better
debugging experience. The GDB shipped with Simics has been extended with
DML knowledge. Use bin/gdb to run it.
joe@computer:~/proj: make clean
Clean of all modules
joe@computer:~/proj: make D=1
=== Building module "simple-broken-device-gdb" ===
...
joe@computer:~/proj: ./simics
simics> load-module simple-broken-device-gdb
simics> @SIM_create_object("simple_broken_device_gdb", "trbl")
<the simple_broken_device_gdb 'trbl'>
simics> pid
12345
joe@computer:~/proj: ./bin/gdb --pid 12345
Function names can be referenced the same way as in DML:
(gdb) break one
Breakpoint 1 at 0x76ced68: file ...proj/modules/simple-broken-device-gdb/
simple-broken-device-gdb.dml, line 34.
dev.symbol or this.symbol,
depending on the context.
It is not possible to break on a method that has a
shared implementation (defined in a template).
This means that you are not able to, for example, break on the default
implementations of get or set of registers.
A work-around for this is to provide a non-shared implementation
of the method you wish to break on in the source file, by simply
overriding that method and calling the default implementation:
register reg size 4 @ 1000 {
    param init_val = 54;
    method set(uint64 val) {
        default(val);
    }
}
As the overriding method implementation is not shared, it is
possible to break on it.
(gdb) c
Continuing.
Then set trbl->int_attr at CLI:
simics> trbl->int_attr = 4711
[trbl info] attribute int_attr updated
In GDB we read
Breakpoint 1, one (_dev=0x9eb4ee0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:34
34 local int *foo = NULL;
...
36 two(foo);
(gdb) print foo
$1 = (int *) 0x0
Not good, a NULL pointer. Another step and we face the SIGSEGV:
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
0x076cedf3 in two (_dev=0x9eb4ee0, val=0x0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:29
29 local int bar = *val;
The crash happened at line 29 in the code of our module.
Print a backtrace, then climb up the stack frames and inspect the variables in each frame:
(gdb) backtrace
#0 0x076cedf3 in two (_dev=0x9eb4ee0, val=0x0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:29
#1 0x076cedbc in one (_dev=0x9eb4ee0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:36
#2 0x076cec97 in int_attr.after_set (_dev=0x9eb4ee0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:46
...
(gdb) p val
$2 = (int *) 0x0
(gdb) up
#1 0x076cedbc in one (_dev=0x9eb4ee0)
at ...proj/modules/simple-broken-device-gdb/simple-broken-device-gdb.dml:36
36 two(foo);
(gdb) list
31 }
32
33 method one() {
34 local int *foo = NULL;
35 log info: "method \"one\" called";
36 two(foo);
37 }
38
39 attribute int_attr is int64_attr "An integer attribute" {
40 method set(attr_value_t val) throws {
(gdb) p foo
$3 = (int *) 0x0
This is a trivial example and ends here. Next steps for a normal debug session
would be to figure out why foo was assigned a NULL value.
Valgrind (see http://valgrind.org) is an open-source
tool for memory debugging, memory leak detection, and profiling Linux programs.
A short guide for running Simics under Valgrind:
Build the valgrind-support Simics module.
Instead of ./simics, run ./bin/valgrind-simics.
If you downloaded the tarball from http://valgrind.org, run the customary
./configure && make && make install
If you have installed the pre-built package that comes with your Linux distribution, make sure you also install the development package, as you will need it when compiling the Simics valgrind-support module.
As Valgrind instruments the code that executes under its supervision,
we need to restrict Valgrind to not instrument the JIT code generated by Simics
(the JIT code produced by Simics does not tolerate being changed).
This can be achieved by loading the valgrind-support module into
Simics. The binary interface for telling Valgrind that it should ignore certain
regions of memory changes between different versions of Valgrind. This is
why you should compile your own version of valgrind-support for the
Valgrind version which you are using.
project$ bin/project-setup --copy-module=valgrind-support
project$ make valgrind-support
Building the module requires the valgrind.h
header that comes with the Valgrind tarball or the Valgrind development
package.
To start Simics under Valgrind you should use the
./bin/valgrind-simics script.
This script sets up the needed environment similar to
./simics but
instead starts Valgrind wrapping the Simics binary.
The script automatically passes the following arguments to Valgrind:
--tool=memcheck --suppressions=$HOSTSDIR/scripts/simics-valgrind.supp --soname-synonyms=somalloc=NONE
These options tell Valgrind to use the memcheck tool for detecting
memory errors, to suppress false positives in Simics' use of embedded Python, and to handle overloaded new/delete in C++ code.
Run Simics with Valgrind loading a target script:
project$ bin/valgrind-simics targets/qsp-x86/firststeps.simics
To override the default arguments to Valgrind, it is possible to set the
VALGRIND_OPTIONS environment variable before starting
valgrind-simics.
project$ env VALGRIND_OPTIONS="--num-callers=20 --suppressions=<simics-installation>/scripts/simics-valgrind.supp --tool=memcheck" bin/valgrind-simics
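When the option string is assembled by a script, each option has to end up as a separate whitespace-delimited word. The following Python sketch shows one way to build the environment; the valgrind_env helper is a hypothetical name, and the suppression path still contains the installation placeholder, which must be replaced with a real path.

```python
import os


def valgrind_env(options, base_env=None):
    """Return a copy of the environment with VALGRIND_OPTIONS set.

    Joining with a single space keeps each option a separate word.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["VALGRIND_OPTIONS"] = " ".join(options)
    return env


options = [
    "--num-callers=20",
    # Placeholder path: substitute your real Simics installation directory.
    "--suppressions=<simics-installation>/scripts/simics-valgrind.supp",
    "--tool=memcheck",
]
env = valgrind_env(options, base_env={})
print(env["VALGRIND_OPTIONS"])

# To actually launch from a project directory (not executed here):
# import subprocess
# subprocess.run(["bin/valgrind-simics"], env=valgrind_env(options))
```

Keeping the options in a list makes it harder to accidentally fuse two of them into one word.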
Simics comes with an example device with multiple errors in it. The device can be used to get acquainted with Valgrind. Copy it to your project with the following command:
project$ bin/project-setup --copy-module=simple-broken-device-valgrind
This is what it looks like:
dml 1.2;
device simple_broken_device_valgrind;
// short, one-line description
parameter desc = "sample broken device for Valgrind example";
// long description
parameter documentation = "This is a very broken device for use with the Valgrind debugging example";
import "io-memory.dml";
extern void *malloc(size_t);
extern int free(void *);
extern char *strcpy(char *, const char *);
method init {
    // Memory allocated by "new" expression will be initialized to 0's
    // automatically
    $too_few = new uint8[100];
}
data uint8 *too_few;
bank regs {
    parameter function = 0;
    parameter register_size = 1;
    // Accesses between 0-99 are okay
    // Accesses above 99 are outside of malloc:ed memory
    register u[0x100] @ 0x0000 + $i {
        method read() -> (value) {
            log info: "read from u[%d]", $i;
            value = $too_few[$i];
        }
    }
    // Will use uninitialized malloc memory
    register m @ 0x0100 {
        method read() -> (value) {
            log info: "read from m";
            local char *s = malloc(10);
            log info, 1: "String=%s", s;
            value = 0;
            free(s);
        }
    }
    // Accessing released memory
    register r @ 0x0200 {
        method read() -> (value) {
            log info: "read from r";
            local char *s = new char[10];
            strcpy(s, "foo");
            delete s;
            value = s[0];
        }
    }
}
Here is a session using this device. Some of the output from Valgrind has been omitted from the example to focus on the important information.
Start with a minimal system containing just our broken device and a memory-space. Map the device into the memory space and try to access it.
simics> @SIM_create_object("memory-space", "phys_mem")
<the memory-space 'phys_mem'>
simics> @SIM_create_object("simple_broken_device_valgrind", "broken")
<the simple_broken_device_valgrind 'broken'>
simics> phys_mem.add-map broken 0x0 0x300
Mapped 'broken' in 'phys_mem' at address 0x0.
simics> phys_mem.read 0 size = 1 # Should be okay
[broken info] read from u[0]
0x0000
simics> phys_mem.read 49 size = 1 # Should be okay
[broken info] read from u[49]
0x0000
We can see that accesses to the u[0] and u[49]
registers are OK. No complaints from Valgrind.
Now let's try to access outside of the malloc'd region:
simics> phys_mem.read 100 size = 1 # Outside of malloc:ed region
[broken info] read from u[100]
==7335==
==7335== Invalid read of size 1
==7335==    at 0x116199AD: ::_DML_M_reg__u__read_access(void) (simple-broken-device-valgrind.dml:56)
==7335==    by 0x116196CD: _DML_M_reg__read_access (dml-builtins.dml:276)
==7335==    by 0x116190E1: _DML_M_reg__access (dml-builtins.dml:258)
==7335==    by 0x11618CBC: _DML_M_io_memory__operation (io-memory.dml:30)
==7335==    by 0x116184FD: _DML_IFACE_io_memory__operation (io-memory.dml:20)
==7335==    by 0x4A96120: VT_io_operation (device.c:54)
==7335==    by 0x4ACA907: memory_space_map_access (memory-space.c:762)
==7335==    by 0x4ACACEF: memory_space_access (memory-space.c:834)
==7335==    by 0x4ACB09A: memory_space_access_simple_inq (memory-space.c:928)
==7335==    by 0x4ACB2CD: memory_space_read (memory-space.c:994)
==7335==    by 0x4B17DB6: py_code_MPT13conf_object_tPT13conf_object_tT18physical_address_tKintKintRT12attr_value_t (py-wrappers.c:21392)
==7335==    by 0x50339BA: PyEval_EvalFrameEx (ceval.c:3564)
==7335==  Address 0xBA1365C is 0 bytes after a block of size 100 alloc'd
==7335==    at 0x4904B4E: malloc (vg_replace_malloc.c:149)
==7335==    by 0x4E40B32: lowlevel_malloc (simmalloc.c:238)
==7335==    by 0x4E41139: mm_malloc (simmalloc.c:650)
==7335==    by 0x11617EB3: simple_broken_device_valgrind_new_instance (simple-broken-device-valgrind.dml:40)
==7335==    by 0x4A7605B: make_new_instance (configuration.c:1067)
==7335==    by 0x4A7CC4E: SIM_create_object (configuration.c:3585)
==7335==    by 0x4B3F052: py_code_SIM_create_object (py-wrappers.c:43139)
==7335==    by 0x50339BA: PyEval_EvalFrameEx (ceval.c:3564)
==7335==    by 0x5035262: PyEval_EvalCodeEx (ceval.c:2831)
==7335==    by 0x5035511: PyEval_EvalCode (ceval.c:494)
==7335==    by 0x5056008: PyRun_StringFlags (pythonrun.c:1271)
==7335==    by 0x502C2C6: builtin_eval (bltinmodule.c:599)
0x0000
Above, the u[100] register is accessed, but the side effect
of reading the register is to read from a separately allocated region, which is only
100 bytes large.
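The arithmetic behind the message "0 bytes after a block of size 100" can be checked with a short Python model. It is only a sketch of the index calculation, using the sizes from the device source; the classify helper is a name made up for this example.

```python
BLOCK_SIZE = 100        # bytes allocated by `new uint8[100]`
REGISTER_COUNT = 0x100  # number of elements in the register array u


def classify(index):
    """Describe where register u[index] reads relative to the block."""
    if index < BLOCK_SIZE:
        return "inside"
    return f"{index - BLOCK_SIZE} bytes after a block of size {BLOCK_SIZE}"


out_of_bounds = [i for i in range(REGISTER_COUNT) if i >= BLOCK_SIZE]
print(classify(49))
print(classify(100))
print(len(out_of_bounds))
```

Of the 256 registers in the array, only the first 100 read inside the allocated block; the remaining 156 all read past its end.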
Now try to access uninitialized data:
simics> phys_mem.read 0x100 size = 1 # Uninitialized data
[broken info] read from m
==7335==
==7335== Conditional jump or move depends on uninitialised value(s)
==7335==    at 0x4E43307: __vtprintf (vtprintf.c:606)
==7335==    by 0x4E4603E: vtvsnprintf (vtprintf.c:861)
==7335==    by 0x4E4684C: sb_vaddfmt (strbuf.c:123)
==7335==    by 0x4E46A98: sb_vfmt (strbuf.c:163)
==7335==    by 0x4ABED34: VT_log_message_fmt_va (log.c:726)
==7335==    by 0x4ABEE72: VT_log_message_fmt (log.c:738)
==7335==    by 0x11619B36: _DML_M_reg__m__read_access__1 (simple-broken-device-valgrind.dml:64)
==7335==    by 0x11619734: _DML_M_reg__read_access (dml-builtins.dml:276)
==7335==    by 0x116190E1: _DML_M_reg__access (dml-builtins.dml:258)
==7335==    by 0x11618CBC: _DML_M_io_memory__operation (io-memory.dml:30)
==7335==    by 0x116184FD: _DML_IFACE_io_memory__operation (io-memory.dml:20)
==7335==    by 0x4A96120: VT_io_operation (device.c:54)
[broken info] String=
0x0000
Above, register m was accessed, but it uses a malloc'd region which has
not been initialized.
Now try to access free'd data:
simics> phys_mem.read 0x200 size = 1 # Accessing free:d data
[broken info] read from r
==7335==
==7335== Invalid read of size 1
==7335==    at 0x11619CA5: _DML_M_reg__r__read_access__2 (simple-broken-device-valgrind.dml:74)
==7335==    by 0x11619774: _DML_M_reg__read_access (dml-builtins.dml:276)
==7335==    by 0x116190E1: _DML_M_reg__access (dml-builtins.dml:258)
==7335==    by 0x11618CBC: _DML_M_io_memory__operation (io-memory.dml:30)
==7335==    by 0x116184FD: _DML_IFACE_io_memory__operation (io-memory.dml:20)
==7335==    by 0x4A96120: VT_io_operation (device.c:54)
==7335==    by 0x4ACA907: memory_space_map_access (memory-space.c:762)
==7335==    by 0x4ACACEF: memory_space_access (memory-space.c:834)
==7335==    by 0x4ACB09A: memory_space_access_simple_inq (memory-space.c:928)
==7335==    by 0x4ACB2CD: memory_space_read (memory-space.c:994)
==7335==    by 0x4B17DB6: py_code_MPT13conf_object_tPT13conf_object_tT18physical_address_tKintKintRT12attr_value_t (py-wrappers.c:21392)
==7335==    by 0x50339BA: PyEval_EvalFrameEx (ceval.c:3564)
==7335==  Address 0x8F8C3A8 is 0 bytes inside a block of size 10 free'd
==7335==    at 0x49057C8: free (vg_replace_malloc.c:233)
==7335==    by 0x4E409EF: lowlevel_free (simmalloc.c:246)
==7335==    by 0x4E41B24: mm_free (simmalloc.c:853)
==7335==    by 0x11619CA0: _DML_M_reg__r__read_access__2 (simple-broken-device-valgrind.dml:74)
==7335==    by 0x11619774: _DML_M_reg__read_access (dml-builtins.dml:276)
==7335==    by 0x116190E1: _DML_M_reg__access (dml-builtins.dml:258)
==7335==    by 0x11618CBC: _DML_M_io_memory__operation (io-memory.dml:30)
==7335==    by 0x116184FD: _DML_IFACE_io_memory__operation (io-memory.dml:20)
==7335==    by 0x4A96120: VT_io_operation (device.c:54)
==7335==    by 0x4ACA907: memory_space_map_access (memory-space.c:762)
==7335==    by 0x4ACACEF: memory_space_access (memory-space.c:834)
==7335==    by 0x4ACB09A: memory_space_access_simple_inq (memory-space.c:928)
0x0066
Register r was accessed, but it uses a memory area which has already been free'd.
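Valgrind reports like the ones in this session are long, and the frame of interest is usually the first one inside your own module. The following Python sketch condenses a log to one line per error. It is an assumption-laden helper, not part of Simics or Valgrind: the summarize function is a hypothetical name, and it only understands the report shapes shown in this chapter (an error heading, "at"/"by" frames, and an optional "Address" line).

```python
import re

PREFIX_RE = re.compile(r"^==\d+==")
FRAME_RE = re.compile(
    r"^(?:at|by) 0x[0-9A-Fa-f]+: .*\((?P<loc>[^()]+:\d+)\)$"
)


def summarize(log_text, source_name):
    """Return (message, location) pairs, one per Valgrind error.

    `location` is the first frame of the error's own stack that lies in
    `source_name`, or None if there is no such frame.
    """
    results = []
    searching = False
    for raw in log_text.splitlines():
        if not PREFIX_RE.match(raw):
            continue  # ordinary Simics output mixed into the log
        line = PREFIX_RE.sub("", raw).strip()
        if not line:
            continue
        frame = FRAME_RE.match(line)
        if frame is None:
            if line.startswith("Address "):
                searching = False  # later frames describe the memory block
            else:
                results.append([line, None])  # a new error heading
                searching = True
        elif searching and frame["loc"].startswith(source_name + ":"):
            results[-1][1] = frame["loc"]
            searching = False
    return [tuple(entry) for entry in results]


SAMPLE = """\
[broken info] read from u[100]
==7335==
==7335== Invalid read of size 1
==7335==    at 0x116199AD: ::_DML_M_reg__u__read_access(void) (simple-broken-device-valgrind.dml:56)
==7335==    by 0x116196CD: _DML_M_reg__read_access (dml-builtins.dml:276)
==7335==  Address 0xBA1365C is 0 bytes after a block of size 100 alloc'd
==7335==    at 0x4904B4E: malloc (vg_replace_malloc.c:149)
==7335==    by 0x11617EB3: simple_broken_device_valgrind_new_instance (simple-broken-device-valgrind.dml:40)
[broken info] read from m
==7335==
==7335== Conditional jump or move depends on uninitialised value(s)
==7335==    at 0x4E43307: __vtprintf (vtprintf.c:606)
==7335==    by 0x11619B36: _DML_M_reg__m__read_access__1 (simple-broken-device-valgrind.dml:64)
"""

summary = summarize(SAMPLE, "simple-broken-device-valgrind.dml")
for message, location in summary:
    print(f"{location}: {message}")
```

Frames listed after an "Address" line belong to the allocation or release of the memory block, so the sketch deliberately ignores them when looking for the faulting source line.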
Simics contains a very simple system for tracking heap allocations made from C and DML. It is provided "as-is" and may disappear or change without prior notice, as it depends on internal implementation aspects.
Allocations are only tracked when made using one of the macros in the
MM_MALLOC family. Some calls to malloc may also be
tracked if they were made in code including the header
simics/util/alloc.h, but this should not be relied upon.
Any allocation made by other means will not be visible in this system. In particular, this includes anything allocated by Python. Some of the internal allocations made by the simulator itself may also be hidden from view.
The memory tracking makes Simics run more slowly and use more memory,
so it has to be enabled explicitly by setting the environment variable
SIMICS_MEMORY_TRACKING to 1 before starting the simulator.
The mm-list-types, mm-list-sites and mm-list-allocations commands display allocation statistics. In order to access them, they need to be enabled by running
simics> enable-unsupported-feature "malloc-debug"
on the Simics command line. The commands all have an optional parameter to limit the number of results listed, see help for details. The commands mainly differ in the way their output is aggregated and sorted. The columns in the response are:
MM_MALLOC.
Reallocations (MM_REALLOC and wrapped realloc)
count as a new allocation followed by freeing the old block.
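The counting rule for reallocations can be illustrated with a toy Python model. The AllocationCounter class is purely hypothetical and has nothing to do with how the Simics tracker is implemented; it only demonstrates how the counters move under the stated rule.

```python
class AllocationCounter:
    """Toy bookkeeping for one allocation site (not the Simics tracker)."""

    def __init__(self):
        self.allocs = 0
        self.frees = 0
        self.live = {}  # handle -> size in bytes
        self._next_handle = 0

    def malloc(self, size):
        self.allocs += 1
        handle = self._next_handle
        self._next_handle += 1
        self.live[handle] = size
        return handle

    def free(self, handle):
        self.frees += 1
        del self.live[handle]

    def realloc(self, handle, size):
        # Counted as a new allocation followed by freeing the old block.
        new_handle = self.malloc(size)
        self.free(handle)
        return new_handle


counter = AllocationCounter()
a = counter.malloc(16)
b = counter.realloc(a, 64)
live_after_realloc = sum(counter.live.values())
counter.free(b)
print(counter.allocs, counter.frees, live_after_realloc)
```

One allocation, one reallocation, and one free therefore show up as two allocations and two frees, with 64 bytes live between the reallocation and the final free.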