This chapter covers various topics related to Simics performance and what can be done to measure and improve it. It discusses the general performance features provided by Simics. For ways to scale the simulation even further see chapter 16.
Simics is a fast simulator utilizing various techniques such as run-time code generation to optimize performance. In some cases Simics can execute code faster than the target system being simulated, while it can also be considerably slower in other cases.
There are four major execution modes Simics uses to execute target instructions: hypersimulation, VMP, JIT and interpreted mode.
Hypersimulation means that Simics detects repetitive work performed by the target code and applies the effects of the code without actually having to run it. In the simplest case this is an idle loop, but it can also be applied to more complex cases such as spin-locks and device polling. This is the fastest execution mode.
VMP, which is a part of Simics's x86 models, utilizes the virtualization capabilities of modern processors to run target instructions directly. This typically results in high simulation performance, but the host and target need to have the same instruction set, and special setup is required to enable it. VMP is currently only supported on x86 hosts.
JIT mode uses run-time code generation to translate blocks of the target instructions into blocks of host instructions. JIT mode is when Simics runs such translated blocks. This mode is supported by most target processor models in Simics.
Interpreted mode interprets the target instructions one by one. This mode is the slowest, but it is always available.
There are basically two ways to measure Simics performance:
If we divide the wall-clock time on the host that Simics executes on by the elapsed virtual time on the target machine, we get a slowdown number.
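The slowdown ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `slowdown` and the sample numbers are assumptions made for this example and are not part of the Simics API.

```python
def slowdown(host_wall_seconds: float, target_virtual_seconds: float) -> float:
    """Return host wall-clock time divided by elapsed target virtual time."""
    if target_virtual_seconds <= 0:
        raise ValueError("elapsed virtual time must be positive")
    return host_wall_seconds / target_virtual_seconds


# Hypothetical measurement: 30 s of host time to simulate 2 s of target time.
print(f"slowdown: {slowdown(30.0, 2.0):.1f}x")
```

A result above 1 means the simulation runs slower than the target's virtual time advances; a result below 1 means virtual time advances faster than wall-clock time.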
simics> board.mb.cpu0.core[0][0].set-step-rate ipc = 1.5
Setting step rate to 3/2 steps/cycle
simics> board.mb.cpu0.core[0][0].set-step-rate ipc = 0.5
Setting step rate to 1/2 steps/cycle
In the first example, IPC of 1.5 means that Simics needs to execute 3 instructions for 2 cycles to elapse. In the second example, for each instruction executed two cycles elapse. Thus, with a lower IPC value, virtual time will progress faster and simulation slowdown will decrease.
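The relationship between IPC, elapsed cycles, and slowdown can be worked through with a small Python sketch. It is not Simics code: the helper names, the 1 MHz clock frequency, the instruction count, and the 12 seconds of host time are all hypothetical values chosen to keep the arithmetic easy to follow. It also assumes that the host time needed to execute the instructions does not change with the IPC setting.

```python
from fractions import Fraction


def step_rate(ipc: str) -> Fraction:
    """Express an IPC value as an exact steps-per-cycle fraction."""
    return Fraction(ipc)


def virtual_seconds(steps: int, ipc: str, freq_hz: int) -> Fraction:
    """Virtual time that elapses while executing `steps` instructions."""
    cycles = Fraction(steps) / step_rate(ipc)
    return cycles / freq_hz


STEPS = 3_000_000      # hypothetical number of executed instructions
FREQ_HZ = 1_000_000    # hypothetical 1 MHz target clock
HOST_SECONDS = 12      # hypothetical host time, assumed independent of IPC

for ipc in ("1.5", "0.5"):
    elapsed = virtual_seconds(STEPS, ipc, FREQ_HZ)
    ratio = Fraction(HOST_SECONDS) / elapsed
    print(f"ipc={ipc}: {step_rate(ipc)} steps/cycle, "
          f"{elapsed} s virtual time, slowdown {ratio}")
```

With these assumed numbers, the same three million instructions advance virtual time by 2 seconds at an IPC of 1.5 and by 6 seconds at an IPC of 0.5, so the slowdown drops from 6 to 2.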
Note that there is nothing wrong with changing the default IPC when it comes to the accuracy of the simulation. In many cases, the IPC observed for a given benchmark is much lower than the 1.0 that Simics assumes, and matching it will both make the simulation closer to the real hardware and improve the simulation speed, at least in virtual time. The simulations that profit most from this change are those involving devices and long memory latencies.