42 Inspecting and Controlling the Virtual System 44 Connecting to the External World
Model Builder User's Guide  /  VII Extending Simics  / 

43 Memory Tracing and Timing

Simics provides extensive support for tracing and modifying memory transactions coming out of a processor. This chapter describes how to access memory transactions programmatically in order to write extensions such as trace tools, timing models, or cache simulations.

43.1 Tracing Instruction Execution

All processor models in Simics offer an interface that provides a registered listener with all executed instructions. This is used by the trace module, among others, to produce its execution trace.

Registering a function to listen to a trace interface is simple. Assuming that cpu is the traced processor, the following code will register the function trace_listener() to be called for each instruction executed by cpu:

void *data_for_trace_listener = some_data;
const exec_trace_interface_t *iface = 
        SIM_c_get_interface(cpu, EXEC_TRACE_INTERFACE);
iface->register_tracer(cpu, trace_listener, data_for_trace_listener);

Turning off tracing is just as simple:

void *data_for_trace_listener = some_data;
const exec_trace_interface_t *iface =
        SIM_c_get_interface(cpu, EXEC_TRACE_INTERFACE);
iface->unregister_tracer(cpu, trace_listener, data_for_trace_listener);

The listener function itself is expected to be an instruction_trace_callback_t, defined as follows:

typedef void (*instruction_trace_callback_t)(lang_void *tracer_data,
                                             conf_object_t *cpu,
                                             linear_address_t la,
                                             logical_address_t va,
                                             physical_address_t pa,
                                             byte_string_t opcode);

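It takes the following arguments: tracer_data is the pointer that was passed when the listener was registered; cpu is the processor executing the instruction; la, va and pa are the linear, logical (virtual) and physical addresses of the instruction; and opcode contains the bytes of the instruction.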

Tracer functions are not expected to return any value to Simics.
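As an illustration of the callback shape described above, here is a minimal sketch of a listener that only gathers statistics. It is written to compile without the Simics headers, so every typedef in it is a stand-in: the layout of byte_string_t in particular is an assumption, and trace_stats_t is a hypothetical structure that a module would pass as the data pointer at registration time.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in declarations so the sketch compiles on its own. In a real
   module these types come from the Simics API headers; the layouts
   below are assumptions made for illustration only. */
typedef void lang_void;
typedef struct conf_object conf_object_t;      /* opaque here */
typedef uint64_t linear_address_t;
typedef uint64_t logical_address_t;
typedef uint64_t physical_address_t;
typedef struct {
        size_t len;                            /* number of opcode bytes */
        const uint8_t *str;                    /* the opcode bytes */
} byte_string_t;

/* Hypothetical per-listener state, handed over as tracer_data. */
typedef struct {
        uint64_t instructions;                 /* callbacks received */
        uint64_t opcode_bytes;                 /* sum of opcode lengths */
        physical_address_t last_pa;            /* address of last instruction */
} trace_stats_t;

/* Same parameter list as instruction_trace_callback_t; returns nothing. */
static void
trace_listener(lang_void *tracer_data, conf_object_t *cpu,
               linear_address_t la, logical_address_t va,
               physical_address_t pa, byte_string_t opcode)
{
        trace_stats_t *stats = tracer_data;

        (void)cpu;                             /* unused in this sketch */
        (void)la;
        (void)va;

        stats->instructions++;
        stats->opcode_bytes += opcode.len;
        stats->last_pa = pa;
}
```

Keeping the state in a structure reached through tracer_data, rather than in globals, lets the same function be registered on several processors with separate counters.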

The trace module is provided along with Simics, both as a binary and source code. It is an excellent starting point for developing new tracing modules.

43.2 Tracing Memory Transactions

This section expects the reader to be familiar with memory spaces and how memory accesses are directed to the correct device or memory. More information on memory spaces is available in chapter 23.

43.2.1 Observing Memory Transactions

Memory-spaces provide a memory hierarchy interface for observing and modifying memory transactions passing through them. This interface is in fact composed of two different interfaces acting at different phases of a memory transaction execution:

Both interfaces can be used simultaneously, even by the same object. This property is used by the trace module, which is in fact connected both to the timing_model and the snoop_memory interfaces. The reason for this double connection is explained in section 43.2.4.

Information about implementing these two interfaces is available in section 43.2.6 and section 43.3.2.

43.2.2 Observing Instruction Fetches

For performance reasons, instruction fetches are not sent to the memory hierarchy by default.

Instruction fetches can be activated for each processor with the <cpu>.instruction-fetch-mode command. It can take several values:

Finally, instruction fetch transactions are not generated by all processor models. Section 43.4 summarizes which features are available on which models.

43.2.3 Observing Page-table Accesses

For performance reasons, page-table reads are not sent to the memory hierarchy by default on some CPU models. For PPC models with classic MMU, you have to set the mmu_mode attribute to get page-table reads. See the attribute description in the Reference Manual for more information.

43.2.4 Simulator Translation Cache (STC)

In order to improve the speed of the simulation, Simics does not perform all accesses through the memory spaces. The Simulator Translation Caches (STCs) try to serve most memory operations directly by caching relevant information. In particular, an STC is intended to contain the following:

The general idea is that the STC will contain information about "harmless" memory addresses, i.e., addresses where an access would not cause any device state change or side-effect. A particular memory address is mapped by the STC only if:

Memory transactions targeting devices are also mapped by the STC.

The contents of the STCs can be flushed at any time, so models using them to improve speed cannot rely on a specific address being cached. They can however let the STCs cache addresses when further accesses to these addresses do not change the state of the model (this is used by cache simulation with g-cache; see the Cache Simulation chapter in the Analyzer User's Guide).

The STCs are activated by default. They can be turned on or off at the command prompt, using the stc-enable and stc-disable commands. An object connected to the timing_model interface can also mark a memory transaction so that it will not be cached by the STCs. For example, the trace module uses that method to ensure that no memory transaction will be cached, so that the trace will be complete.

Note that since information is inserted into the STCs when transactions are executed, only objects connected to the timing_model interface can influence the STCs' behavior. Section 43.3 provides a complete description of the changes that are allowed on a memory transaction when using the memory hierarchy interface.

43.2.5 Summary of Simics Memory System

This diagram puts together the concepts introduced in chapter 23. It describes the path followed by a processor transaction through the Simics memory system.

Figure 27. Transaction Path through Simics Memory System

  1. The CPU executes a load instruction.

  2. A memory transaction is created.

  3. If the address is in the STC, the data is read and returned to the CPU using the cached information.

  4. If the address is not in the STC, the transaction is passed along to the CPU memory-space.

  5. If a timing-model is connected to the memory-space, it receives the transaction.

    1. If the timing model returns a non-zero stall time, the processor is stalled and the transaction will be reissued when the stall time is finished (see also section 43.3.2).
    2. If the timing model returns a zero stall time, the memory-space is free to execute the transaction.
  6. The memory-space determines the target object (in this example, a RAM object).

  7. The RAM object receives the transaction and executes it.

  8. If possible, the transaction is inserted in the STC.

  9. If a snoop-memory is connected to the memory-space, it receives the transaction.

  10. The transaction is returned to the CPU with the correct data.

Store operations work in the same way, but no data is returned to the CPU.

The Simics memory system is more complex than what is presented here, but from the point of view of a user-written timing-model or snoop-memory, this diagram correctly shows at which point the main events happen.
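The ordering of the steps above can be mimicked with a small toy model. The sketch below is not Simics code and uses no Simics API: toy_system_t, toy_load() and the two example hooks are hypothetical names, the "STC" is just one valid bit per 16-byte page, and in this toy every executed load is eligible for caching. It only illustrates where the timing model and the snoop device sit relative to the STC lookup and the actual memory access.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAM_SIZE  256
#define STC_PAGE  16                      /* toy page size (assumption) */
#define STC_PAGES (RAM_SIZE / STC_PAGE)

typedef struct toy_system toy_system_t;
typedef unsigned (*toy_hook_t)(toy_system_t *sys, uint32_t addr);

struct toy_system {
        uint8_t ram[RAM_SIZE];
        bool stc_valid[STC_PAGES];        /* pages the toy STC serves directly */
        toy_hook_t timing_model;          /* may be NULL; returns stall cycles */
        toy_hook_t snoop_memory;          /* may be NULL; result is ignored */
        unsigned timing_calls;            /* bookkeeping for the example hooks */
        unsigned snoop_calls;
};

typedef enum { LOAD_DONE, LOAD_STALLED } load_status_t;

/* Steps 1-2 happen in the caller: a load of one byte at addr. */
static load_status_t
toy_load(toy_system_t *sys, uint32_t addr, uint8_t *value, unsigned *stall)
{
        addr %= RAM_SIZE;
        uint32_t page = addr / STC_PAGE;
        *stall = 0;

        if (sys->stc_valid[page]) {       /* step 3: served from the STC */
                *value = sys->ram[addr];
                return LOAD_DONE;
        }
        /* Step 4: not cached, so the memory-space path is taken. */
        if (sys->timing_model) {          /* step 5 */
                *stall = sys->timing_model(sys, addr);
                if (*stall != 0)
                        return LOAD_STALLED;   /* 5.1: caller reissues later */
        }
        *value = sys->ram[addr];          /* steps 6-7: target executes it */
        sys->stc_valid[page] = true;      /* step 8: insert into the STC */
        if (sys->snoop_memory)            /* step 9: snoop sees the result */
                sys->snoop_memory(sys, addr);
        return LOAD_DONE;                 /* step 10 */
}

/* Example timing model: stall the first presentation for 5 cycles,
   then let the reissued transaction through. */
static unsigned
stall_first_call(toy_system_t *sys, uint32_t addr)
{
        (void)addr;
        return sys->timing_calls++ == 0 ? 5 : 0;
}

/* Example snoop device: only counts completed transactions. */
static unsigned
count_snoop(toy_system_t *sys, uint32_t addr)
{
        (void)addr;
        sys->snoop_calls++;
        return 0;
}
```

With both hooks installed, a first load is reported as stalled, the reissued load completes and is seen by the snoop hook, and a later load to the same toy page bypasses both hooks because it hits the STC. That last effect is why a tracer that needs every transaction has to keep transactions out of the STCs.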

43.2.6 Implementing the Interface

The timing_model and snoop_memory interfaces each contain only one function, called operate():

static cycles_t
my_timing_model_operate(conf_object_t         *mem_hier,
                        conf_object_t         *mem_space,
                        map_list_t            *map_list,
                        generic_transaction_t *mem_op);

The four arguments are:

The return value is the number of cycles the transaction should stall before being executed (or reissued). Returning 0 disables all stalling.
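A minimal sketch of an operate() implementation that only observes transactions and never stalls might look as follows. To keep it compilable without the Simics headers, all types are stand-ins: the generic_transaction_t here carries just three fields, of which is_write is a hypothetical flag (a real module would query the transaction through the Simics API instead), and my_timing_model_t is a hypothetical object structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Simics types, for illustration only. */
typedef int64_t cycles_t;
typedef struct conf_object { const char *name; } conf_object_t;
typedef struct map_list map_list_t;           /* opaque in this sketch */
typedef struct {
        uint64_t physical_address;
        unsigned size;                         /* bytes accessed */
        int is_write;                          /* hypothetical flag */
} generic_transaction_t;

/* Hypothetical object implementing the interface. The conf_object_t is
   placed first so the mem_hier pointer can be cast back to the object. */
typedef struct {
        conf_object_t obj;
        uint64_t reads;
        uint64_t writes;
        uint64_t bytes;
} my_timing_model_t;

static cycles_t
my_timing_model_operate(conf_object_t *mem_hier, conf_object_t *mem_space,
                        map_list_t *map_list, generic_transaction_t *mem_op)
{
        my_timing_model_t *tm = (my_timing_model_t *)mem_hier;

        (void)mem_space;                       /* unused in this sketch */
        (void)map_list;

        if (mem_op->is_write)
                tm->writes++;
        else
                tm->reads++;
        tm->bytes += mem_op->size;

        return 0;                              /* zero cycles: no stalling */
}
```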

43.2.7 Chaining Timing Models

Sometimes it is desirable to chain timing models, e.g., if you are implementing a multi-level cache model and want to model each level of the cache as an individual class. To do this, the operate() function must call the corresponding functions of the lower levels (a lower or next level cache means a cache further away from the CPU, closer to the actual memory).

The g-cache source code included with Simics is an example of how to do this. Whenever there is a miss in the cache, the g-cache object creates a new memory operation and calls the operate() method of the timing_model interface from the next level cache specified by the timing_model attribute.
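The chaining idea can be sketched with a toy direct-mapped cache. This is not the g-cache implementation: cache_level_t and cache_operate() are hypothetical, the next pointer merely plays the role of the timing_model attribute, and the address is passed along directly instead of creating a new memory operation as g-cache does. Each level adds its own miss penalty to whatever the next level reports.

```c
#include <stddef.h>
#include <stdint.h>

typedef int64_t cycles_t;                     /* stand-in for the Simics type */

#define MAX_LINES 16
#define LINE_SIZE 32                          /* bytes per cache line */

typedef struct cache_level cache_level_t;
struct cache_level {
        uint64_t tag[MAX_LINES];
        int valid[MAX_LINES];
        unsigned lines;                       /* lines in use, <= MAX_LINES */
        cycles_t miss_penalty;                /* cycles added on a miss */
        cache_level_t *next;                  /* next level; NULL means memory */
        unsigned hits;
        unsigned misses;
};

/* Returns the number of stall cycles for an access to paddr. */
static cycles_t
cache_operate(cache_level_t *c, uint64_t paddr)
{
        uint64_t line = paddr / LINE_SIZE;
        size_t idx = (size_t)(line % c->lines);

        if (c->valid[idx] && c->tag[idx] == line) {
                c->hits++;
                return 0;                     /* hit: nothing to add */
        }

        /* Miss: fill the line, then ask the next level further away
           from the CPU how long it needs. */
        c->misses++;
        c->valid[idx] = 1;
        c->tag[idx] = line;

        cycles_t stall = c->miss_penalty;
        if (c->next != NULL)
                stall += cache_operate(c->next, paddr);
        return stall;
}
```

For example, with a 2-line first level (penalty 10) chained to an 8-line second level (penalty 100), a cold access costs 110 cycles, while an access that misses only in the first level costs 10.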

43.3 Modifying Memory Transactions

43.3.1 Stalling Transactions

The precision of the simulation can be improved by adding timing controls for memory operations: memory-related instructions are no longer atomic operations, but actually take virtual time to execute.

Stalling is controlled via the timing_model interface. The interface simply allows the implementer to return a non-zero number of cycles to stall before the transaction is allowed to progress. During this time, the processor is given back control and lets time advance until the transaction's stall time has elapsed. The transaction is then reissued to the memory system.

Stalling a transaction is not always possible; it depends on the processor model used in the simulation. Section 43.4 explains what is available for each model.

Cache models, described in the Analyzer User's Guide, are good examples of complex timing models. Finally, the Understanding Simics Timing application note goes into more detail on exactly how Simics handles timing and multiprocessor systems.

43.3.2 Changing the Behavior of a Memory Transaction

43.3.2.1 In a Timing Model

An object listening on the timing_model interface is presented with memory transactions before they have been executed, and may therefore change both their semantics and their timing. Here is a list of changes that a timing model is authorized to perform:

If a zero stall time is returned, some additional operations are allowed:

A transaction may go through several memory-spaces in hierarchical order before being executed. Each of these memory-spaces may have a timing-model connected to it. However, if the transaction is stalled by one timing model, other timing models connected to other memory-spaces may see the transaction being reissued before it is executed. Returning a non-zero stall time from these other timing models is not supported; that is, a transaction may be stalled by at most one timing model.

43.3.2.2 In a Snoop Device

An object listening on the snoop_memory interface is presented with memory transactions after they have completed. It cannot influence the execution of the operation and it may not return a non-zero value for stalling, but it is allowed to modify the value of the memory operation. Since the data returned by read operations are available at this stage, the snoop device is also an ideal place to trace memory transactions. Note that if you want to modify the properties of the memory transaction, such as future visibility and reissue, you have to do that in a timing_model interface operate function.
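As a rough sketch of tracing from a snoop device, the following operate() records each completed transaction in a small ring buffer and always returns zero. All types are stand-ins so that the example compiles without the Simics headers: the value field of generic_transaction_t is a hypothetical placeholder for the data that a real module would read through the Simics API, and my_snooper_t and TRACE_CAPACITY are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Simics types, for illustration only. */
typedef int64_t cycles_t;
typedef struct conf_object { const char *name; } conf_object_t;
typedef struct map_list map_list_t;           /* opaque in this sketch */
typedef struct {
        uint64_t physical_address;
        uint64_t value;                        /* hypothetical: data moved */
        int is_write;                          /* hypothetical flag */
} generic_transaction_t;

#define TRACE_CAPACITY 4                       /* ring buffer size */

typedef struct {
        uint64_t pa;
        uint64_t value;
        int is_write;
} trace_entry_t;

/* Hypothetical snoop object; conf_object_t comes first so that the
   mem_hier pointer can be cast back to the full structure. */
typedef struct {
        conf_object_t obj;
        trace_entry_t entries[TRACE_CAPACITY];
        size_t total;                          /* transactions seen so far */
} my_snooper_t;

static cycles_t
my_snoop_operate(conf_object_t *mem_hier, conf_object_t *mem_space,
                 map_list_t *map_list, generic_transaction_t *mem_op)
{
        my_snooper_t *sn = (my_snooper_t *)mem_hier;
        trace_entry_t *e = &sn->entries[sn->total % TRACE_CAPACITY];

        (void)mem_space;                       /* unused in this sketch */
        (void)map_list;

        /* The transaction has completed, so read data is available. */
        e->pa = mem_op->physical_address;
        e->value = mem_op->value;
        e->is_write = mem_op->is_write;
        sn->total++;

        return 0;                              /* a snoop device must not stall */
}
```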

The following actions are allowed:

43.4 Memory Features Availability

Not all cache modeling features are supported by all processor types. The instrumentation API needs to be supported in order to do cache modeling for a specific processor.

Currently ARC, ARM, MIPS, PPC, X86 and Xtensa target architectures support instrumentation.
