An object using one of the concurrency modes Sim_Concurrency_Mode_Serialized_Memory or Sim_Concurrency_Mode_Full is called a thread-aware model. Such objects must follow the Threaded Device Model. Thread-aware models run mostly in Threaded Context.
This section primarily discusses thread-aware models, but much of the content also applies to code invoked directly from a "foreign" thread.
Thread-aware models need to take the following into account:
It is the responsibility of the model to ensure that its state is protected, usually by calling SIM_ACQUIRE_OBJECT from its interface methods, as in the following example:
    static void
    some_interface_method(conf_object *obj)
    {
        domain_lock_t *lock;

        SIM_ACQUIRE_OBJECT(obj, &lock);
        /* ... internal state is protected by the TD ... */
        SIM_RELEASE_OBJECT(obj, &lock);
    }
No extra protection is needed for interfaces which are only available in OEC. All thread domains are already held on entry.
The execute interface is invoked with the object's thread domain held. The model should not acquire the domain again, since this would block the signaling mechanism used to notify the model when another thread tries to acquire the domain.
The methods of the direct_memory_update interface are always invoked with the thread domain held.
Example of an "outgoing" interface call:
    domain_lock_t *lock;

    /* incoming interface calls may occur here */
    SIM_ACQUIRE_TARGET(target_obj, &lock);
    some_interface->some_method(target_obj, ...);
    SIM_RELEASE_TARGET(target_obj, &lock);
    domain_lock_t *lock;

    /* incoming interface calls may occur here */
    SIM_ACQUIRE_CELL(obj, &lock);
    /* this code runs in Cell Context */
    breakpoint_id = SIM_breakpoint(...);
    SIM_RELEASE_CELL(obj, &lock);
Some functions that need this protection:
There are, however, many functions that can be called directly in Threaded Context, e.g.
Event callbacks that use the Sim_Event_No_Serialize flag and callbacks used by the CPU instrumentation framework do not need to be protected. For these callbacks, it is the callee's responsibility to be aware that the context can be more limited than Cell Context. This is a performance optimization that allows fast callbacks with minimal overhead.
Whenever a thread-domain boundary is crossed, already held domains may temporarily be released to avoid deadlock situations. This allows unrelated, incoming, interface calls to occur at such points.
A thread-aware model must ensure that potential state changes caused by incoming interface calls are taken into account. This is one of the challenging points when writing a thread-aware model.
In Cell Context, boundary crossings are not an issue, since this context is prioritized exactly to avoid unexpected interface calls. Thread-aware models, running in Threaded Context, are not as fortunate and need to be aware of the possibility.
It is recommended that incoming interface calls are kept as simple as possible for thread-aware models. If possible, the interface action should be deferred and handled from an inner loop, especially for CPUs. For instance, a RESET interface should not perform the reset immediately, but instead set a flag that a reset should be performed before dispatching the next instruction.
It is easy to run into problems when different locking schemes are combined. This is also the case when mixing mutexes and thread domains. The following examples illustrate some pitfalls:
    Thread 1                 Thread 2
    Locks Mutex1             Acquires TD1
    Acquires TD1 (blocks)    Locks Mutex1 (blocks)

Thread 1 will never be able to acquire TD1, since this domain is held by Thread 2, which is blocked on Mutex1.
Note that the above example will also cause a deadlock if two mutexes are used rather than one mutex and one thread domain:
    Thread 1                 Thread 2
    Locks Mutex1             Locks Mutex2
    Locks Mutex2 (blocks)    Locks Mutex1 (blocks)
Whereas no deadlock occurs with two thread domains:
    Thread 1                 Thread 2
    Acquires TD2             Acquires TD1
    Acquires TD1*            Acquires TD2*

    *Not a deadlock - Simics detects and resolves this situation
    Thread 1                      Thread 2
    Acquires TD1                  .
    Waits for COND1               Acquires TD1 (blocks)
    Releases TD1 (not reached)    .
    .                             Signals COND1 (not reached)
Sleeping on a condition while holding a thread domain easily leads to deadlocks. Threads requiring the thread domain will get stuck and potentially prevent the condition from being signaled.
In practice, code can seldom make assumptions about which thread domains are held. For instance, an interface function can be invoked with an unknown set of thread domains already acquired. The domain retention mechanism also makes the picture more complex.
To avoid deadlocks, the following general principles are encouraged:
    domain_lock_t *lock;

    SIM_DROP_THREAD_DOMAINS(&lock);
    /* no thread domains are held here... */
    SIM_REACQUIRE_THREAD_DOMAINS(&lock);
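The idea of dropping held domains before blocking can be illustrated with POSIX threads alone. In the sketch below, an ordinary mutex named `domain` stands in for a thread domain; this is an assumption made to keep the example self-contained, and every `toy_` name is hypothetical. Real thread domains behave differently from mutexes in several ways, so the code shows only the ordering of release, wait and reacquire.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: a plain mutex plays the role of a thread domain. */
typedef struct {
    pthread_mutex_t domain;     /* analogue of a thread domain */
    int state;                  /* protected by 'domain' */
    pthread_mutex_t cond_lock;  /* protects 'ready' */
    pthread_cond_t cond;
    bool ready;
} toy_model_t;

/* Worker: must take the "domain" before it can signal the condition. */
static void *toy_worker(void *arg)
{
    toy_model_t *m = arg;
    assert(m != NULL);

    pthread_mutex_lock(&m->domain);
    m->state = 42;
    pthread_mutex_unlock(&m->domain);

    pthread_mutex_lock(&m->cond_lock);
    m->ready = true;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->cond_lock);
    return NULL;
}

/* Returns the value written by the worker, or -1 if no thread was created. */
int toy_wait_for_worker(void)
{
    toy_model_t m = { .state = 0, .ready = false };
    pthread_t worker;
    int result = -1;

    pthread_mutex_init(&m.domain, NULL);
    pthread_mutex_init(&m.cond_lock, NULL);
    pthread_cond_init(&m.cond, NULL);

    pthread_mutex_lock(&m.domain);      /* "domain" held, as on entry */
    if (pthread_create(&worker, NULL, toy_worker, &m) == 0) {
        /* Drop the "domain" before sleeping. Waiting while still holding
           it would leave the worker blocked and the condition unsignaled. */
        pthread_mutex_unlock(&m.domain);

        pthread_mutex_lock(&m.cond_lock);
        while (!m.ready)
            pthread_cond_wait(&m.cond, &m.cond_lock);
        pthread_mutex_unlock(&m.cond_lock);

        pthread_mutex_lock(&m.domain);  /* reacquire before reading state */
        result = m.state;
        pthread_join(worker, NULL);
    }
    pthread_mutex_unlock(&m.domain);

    pthread_cond_destroy(&m.cond);
    pthread_mutex_destroy(&m.cond_lock);
    pthread_mutex_destroy(&m.domain);
    return result;
}
```

If the unlock before the wait is removed, the worker blocks on `domain` and never sets `ready`, which reproduces the COND1 pitfall shown earlier.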
Thread-aware CPUs have a few extra things to consider.
The run method of the execute interface is invoked in Threaded Context, with the CPU thread domain already held.
The thread calling run is a simulation thread managed by the Simics scheduler. It is possible that this thread is used to simulate more than one model.
There is no guarantee that the run function is always invoked by the same thread.
A CPU is notified through the execute_control interface when another thread tries to acquire its thread domain.
When a CPU is signaled in this way, it should call SIM_yield_thread_domains as soon as possible. The yield function ensures that pending direct memory update messages are delivered and allows other threads to invoke interfaces on the CPU object.
The signaling methods are invoked asynchronously, and the implementation must not acquire any thread domains or call API functions.
The signaling only occurs when the CPU's thread domain is the only domain held. Acquiring an additional domain, even the already held domain, temporarily blocks the signaling mechanism. Due to this, it is important that the CPU thread domain is not acquired in the run method, since it is already held on entry.
The methods of the direct_memory_update interface are invoked with the CPU thread domain already acquired. The model should service requests quickly and without acquiring additional thread domains.
Statistics about thread-domain acquisition can be collected with the enable-object-lock-stats command. This functionality is useful when optimizing a model to avoid unnecessary thread-domain crossings, or when investigating thread-domain contention.
There is a definite overhead associated with collecting the statistics; it should not be turned on by default.
The collected statistics can be shown with the print-object-lock-stats command:
┌─────┬───────┬─┬────────────────────────────┬─────────────────────────────────┐
│Count│Avg(us)│ │          Function          │              File               │
├─────┼───────┼─┼────────────────────────────┼─────────────────────────────────┤
│  396│   1.94│ │get_cycles                  │core/clock/clock.c:172           │
│  369│   1.91│ │post                        │core/clock/clock.c:254           │
│   27│   2.00│C│handle_event                │core/clock/clock-src.c:211       │
│   12│   2.33│ │pb_lookup                   │core/common/image.c:3965         │
│    7│   2.86│C│cpu_access                  │cpu/cpu-common/memory.c:508      │
│    8│   2.38│C│perform_io                  │cpu/x86/x86-io.c:128             │
│    6│   1.83│ │dml_lookup                  │core/common/memory-page.c:431    │
│    3│   3.00│C│call_hap_functions_serialize│core/common/hap.c:1410           │
│    3│   2.33│ │cancel                      │core/clock/clock.c:280           │
└─────┴───────┴─┴────────────────────────────┴─────────────────────────────────┘
The command displays the source locations where thread domains were acquired and how quickly each acquisition completed. A 'C' in the third column indicates that Cell Context was entered.