2.6 Standard Device Model 2.8 Foreign Threads

2.7 Threaded Device Model

An object using one of the concurrency modes Sim_Concurrency_Mode_Serialized_Memory or Sim_Concurrency_Mode_Full is called a thread-aware model. The Threaded Device Model must be followed by such objects.

Thread-aware models run mostly in Threaded Context.

This section primarily discusses thread-aware models, but much of the contents also applies to code invoked directly from a "foreign" thread.

Note: A CPU is the typical example of a thread-aware model. Most devices should use the Standard Device Model instead.

2.7.1 Programming Model

Thread-aware models need to take the following into account:

Incoming Interface Calls
Interfaces implemented by thread-aware models can be invoked in Threaded Context rather than Cell Context, and the thread domain associated with the object cannot be assumed to be held on entry.

It is the responsibility of the model to ensure that its state is protected, usually by calling SIM_ACQUIRE_OBJECT from its interface methods, as in the following example:

      static void
      some_interface_method(conf_object_t *obj)
      {
          domain_lock_t *lock;
          SIM_ACQUIRE_OBJECT(obj, &lock);
          /* ... internal state is protected by the TD ... */
          SIM_RELEASE_OBJECT(obj, &lock);
      }
    

No extra protection is needed for interfaces which are only available in OEC, since all thread domains are already held on entry.

Note: There are a few situations when the model is invoked with its thread domain already held:
  • The run method of the execute interface is invoked with the object's thread domain held. The model should not acquire the domain again, since this would block the signaling mechanism used to notify the model when another thread tries to acquire the domain.
  • The methods in the direct_memory_update interface are always invoked with the thread domain held.
Outgoing Interface Calls
When a thread-aware model invokes an interface method on an object which is not known to reside in the same thread domain, then the call must be protected with SIM_ACQUIRE_TARGET, with the interface object provided as an argument. This ensures that Cell Context is entered, when necessary.

Example of an "outgoing" interface call:

      domain_lock_t *lock;
      /* incoming interface calls may occur here */
      SIM_ACQUIRE_TARGET(target_obj, &lock);
      some_interface->some_method(target_obj, ...);
      SIM_RELEASE_TARGET(target_obj, &lock);
    

Note: If the target object is thread-aware, then SIM_ACQUIRE_TARGET is a no-op.
Note: If the cell thread domain (TD) is busy when SIM_ACQUIRE_TARGET is executed, then the model may see incoming interface calls while waiting for the domain, since all held domains are temporarily released while waiting.
API Calls
Cell Context must be entered before calling any API function that requires it. The context is entered with the SIM_ACQUIRE_CELL primitive, as in this example:

      domain_lock_t *lock;
      /* incoming interface calls may occur here */
      SIM_ACQUIRE_CELL(obj, &lock);
      /* this code runs in Cell Context */
      breakpoint_id = SIM_breakpoint(...);
      SIM_RELEASE_CELL(obj, &lock);
    

Some functions that need this protection:

There are, however, many functions that can be called directly in Threaded Context, e.g.

Some API functions can be called directly, as long as the TD has been acquired for the object in question:
Callbacks
Callbacks triggered by the model are often expected to be dispatched in Cell Context. The model must enter Cell Context using SIM_ACQUIRE_CELL before dispatching such callbacks.

Note: Events registered with the Sim_Event_No_Serialize flag and callbacks used by the CPU instrumentation framework do not need to be protected. For these callbacks, it is the callee's responsibility to be aware that the context can be more limited than Cell Context. This is a performance optimization to allow fast callbacks with minimal overhead.
Attributes
Registered attribute setters and getters are automatically protected; an object's thread domain is always held when attribute setters and getters are invoked.

Note: Attributes should be used for configuration and to hold state. Attributes should never be used for communication between objects during simulation.

2.7.2 Domain Boundary Crossings

Whenever a thread-domain boundary is crossed, already held domains may temporarily be released to avoid deadlock situations. This allows unrelated incoming interface calls to occur at such points.

A thread-aware model must ensure that potential state changes caused by incoming interface calls are taken into account. This is one of the challenging points when writing a thread-aware model.

In Cell Context, boundary crossings are not an issue, since this context is prioritized exactly to avoid unexpected interface calls. Thread-aware models, running in Threaded Context, are not as fortunate and need to be aware of the possibility.

It is recommended that incoming interface calls are kept as simple as possible for thread-aware models. If possible, the interface action should be deferred and handled from an inner loop, especially for CPUs. For instance, a RESET interface should not perform the reset immediately, but instead set a flag that a reset should be performed before dispatching the next instruction.
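
The deferred-reset idea above can be sketched in plain C. This is only an illustration and does not use the Simics API: the type toy_cpu_t and the functions toy_cpu_request_reset and toy_cpu_run are hypothetical names, and a C11 atomic flag stands in for whatever synchronization a real model would use. The interface method only records the request; the inner loop acts on it before dispatching the next instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU state; none of these names come from the Simics API. */
typedef struct {
    atomic_bool reset_pending;  /* set by the interface method, cleared by the loop */
    uint64_t pc;                /* program counter */
    uint64_t steps;             /* instructions executed since the last reset */
} toy_cpu_t;

/* Incoming interface method: only records the request and returns. */
static void
toy_cpu_request_reset(toy_cpu_t *cpu)
{
    atomic_store(&cpu->reset_pending, true);
}

/* Inner loop: handle a pending reset before dispatching each instruction. */
static void
toy_cpu_run(toy_cpu_t *cpu, unsigned n_instructions)
{
    for (unsigned i = 0; i < n_instructions; i++) {
        /* atomic_exchange reads and clears the flag in one step */
        if (atomic_exchange(&cpu->reset_pending, false)) {
            cpu->pc = 0;
            cpu->steps = 0;
        }
        cpu->pc += 4;   /* stand-in for executing one instruction */
        cpu->steps++;
    }
}
```

Because the interface method never touches pc or steps, the run loop does not have to consider those fields changing underneath it at a domain boundary crossing; the only shared datum is the flag.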

2.7.3 Mixing Thread Domains and Mutexes

It is easy to run into problems when different locking schemes are combined. This is also the case when mixing mutexes and thread domains. The following examples illustrate some pitfalls:

Example 1
Acquiring a thread domain while holding a lock:
      Thread 1                  Thread 2
      Locks Mutex1              Acquires TD1
      Acquires TD1 (blocks)     Locks Mutex1 (blocks)
    

Thread 1 will never be able to acquire TD1 since this domain is held by thread 2 which blocks on Mutex1.

Note that the above example will also cause a deadlock if two mutexes are used rather than one mutex and one thread domain:

      Thread 1                  Thread 2
      Locks Mutex1              Locks Mutex2
      Locks Mutex2 (blocks)     Locks Mutex1 (blocks)
    

Whereas no deadlock occurs with two thread domains:

      Thread 1                  Thread 2
      Acquires TD2              Acquires TD1
      Acquires TD1*             Acquires TD2*
       *Not a deadlock - Simics detects and resolves this situation
    

Example 2
Waiting for a condition variable while holding a thread domain:
      Thread 1               Thread 2
      Acquires TD1           .
      Waits for COND1        Acquires TD1 (blocks)
                             Releases TD1 (not reached)
                             .
                             Signals COND1 (not reached)
    

Sleeping on a condition while holding a thread domain easily leads to deadlocks. Threads requiring the thread domain will get stuck and potentially prevent the condition from being signaled.

In practice, code can seldom make assumptions about which thread domains are held. For instance, an interface function can be invoked with an unknown set of thread domains already acquired. The domain retention mechanism also makes the picture more complex.

To avoid deadlocks, the following general principles are encouraged:

When needed, it is possible to drop all thread domains, which is illustrated in the following example:
  domain_lock_t *lock;
  SIM_DROP_THREAD_DOMAINS(&lock);
  /* no thread domains are held here... */
  SIM_REACQUIRE_THREAD_DOMAINS(&lock);

Note: Avoid empty drop/reacquire pairs. If the intention is allowing other objects to access held domains, then SIM_yield_thread_domains should be used instead. The yield function, besides being faster, guarantees that all waiting threads are given an opportunity to acquire the held domains.

2.7.4 Thread-Aware CPUs

Thread-aware CPUs have a few extra things to consider.

Execution
CPU models are driven from the run method of the execute interface. The method is invoked in Threaded Context, with the CPU thread domain already held.

The thread calling run is a simulation thread managed by the Simics scheduler. It is possible that this thread is used to simulate more than one model.

There is no guarantee that the run function is always invoked by the same thread.

Signaling
Whenever another CPU, or a device model, tries to acquire the CPU domain, the CPU is notified through the execute_control interface.

When a CPU is signaled in this way, it should call SIM_yield_thread_domains as soon as possible. The yield function ensures that pending direct memory update messages are delivered and allows other threads to invoke interfaces on the CPU object.

The signaling methods are invoked asynchronously, and the implementation must not acquire any thread domains or call API functions.

The signaling only occurs when the CPU's thread domain is the only domain held. Acquiring an additional domain, even the already held domain, temporarily blocks the signaling mechanism. Due to this, it is important that the CPU thread domain is not acquired in the run method, since it is already held on entry.

Note: To minimize the waiting time for other threads, it is important that the signaling is detected quickly.
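
One way to meet these constraints is for the signaling method to do nothing except set a flag, which the run loop polls between instructions. The following standalone sketch assumes that structure; it is not Simics code. The names sig_cpu_t, sig_cpu_on_signal, sig_cpu_yield and sig_cpu_run are hypothetical, and sig_cpu_yield merely counts calls at the point where a real model would call SIM_yield_thread_domains.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical CPU state; these names are not part of the Simics API. */
typedef struct {
    atomic_bool yield_requested;  /* set asynchronously by the signaling method */
    unsigned executed;            /* instructions executed so far */
    unsigned yields;              /* number of times the loop has yielded */
    unsigned executed_at_yield;   /* value of 'executed' at the latest yield */
} sig_cpu_t;

/* Signaling method: acquires nothing and calls no API, it only sets a flag. */
static void
sig_cpu_on_signal(sig_cpu_t *cpu)
{
    atomic_store(&cpu->yield_requested, true);
}

/* Stand-in for the point where a real model would call
   SIM_yield_thread_domains. */
static void
sig_cpu_yield(sig_cpu_t *cpu)
{
    cpu->yields++;
    cpu->executed_at_yield = cpu->executed;
}

/* Run loop: poll the flag before every instruction so that a request is
   noticed after at most one instruction. */
static void
sig_cpu_run(sig_cpu_t *cpu, unsigned n_instructions)
{
    for (unsigned i = 0; i < n_instructions; i++) {
        if (atomic_exchange(&cpu->yield_requested, false))
            sig_cpu_yield(cpu);
        cpu->executed++;  /* stand-in for executing one instruction */
    }
}
```

Polling once per instruction keeps the detection latency short at the cost of one atomic operation per iteration; a model that polls less often trades waiting time in other threads for throughput.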
Direct Memory
The methods of the direct_memory_update interface are invoked with the CPU thread domain already acquired. The model should service requests quickly and without acquiring additional thread domains.

2.7.5 Thread Domain Contention

Statistics about thread-domain acquisition can be collected with the enable-object-lock-stats command. This functionality is useful when optimizing a model to avoid unnecessary thread-domain crossings or when investigating thread-domain contention.

Collecting the statistics has a noticeable overhead, so it should not be turned on by default.

The collected statistics can be shown with the print-object-lock-stats command:

┌─────┬───────┬─┬────────────────────────────┬─────────────────────────────────┐
│Count│Avg(us)│ │          Function          │               File              │
├─────┼───────┼─┼────────────────────────────┼─────────────────────────────────┤
│  396│   1.94│ │get_cycles                  │core/clock/clock.c:172           │
│  369│   1.91│ │post                        │core/clock/clock.c:254           │
│   27│   2.00│C│handle_event                │core/clock/clock-src.c:211       │
│   12│   2.33│ │pb_lookup                   │core/common/image.c:3965         │
│    7│   2.86│C│cpu_access                  │cpu/cpu-common/memory.c:508      │
│    8│   2.38│C│perform_io                  │cpu/x86/x86-io.c:128             │
│    6│   1.83│ │dml_lookup                  │core/common/memory-page.c:431    │
│    3│   3.00│C│call_hap_functions_serialize│core/common/hap.c:1410           │
│    3│   2.33│ │cancel                      │core/clock/clock.c:280           │
└─────┴───────┴─┴────────────────────────────┴─────────────────────────────────┘

The command displays the source locations where thread domains have been acquired, and how quickly the domains were acquired. A 'C' in the third column indicates that Cell Context was entered.
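
When reading such a table, it can help to rank the rows by approximate total acquisition time rather than by count alone. The sketch below assumes that Avg(us) is the mean time per acquisition, so that Count × Avg(us) estimates the total; for the first row this gives 396 × 1.94 = 768.24 us. The type lock_stat_t and the helper functions are hypothetical, and the numbers are copied from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the statistics table (hypothetical type, for illustration). */
typedef struct {
    const char *function;
    unsigned count;     /* Count column */
    double avg_us;      /* Avg(us) column */
} lock_stat_t;

static const lock_stat_t stats[] = {
    { "get_cycles",                  396, 1.94 },
    { "post",                        369, 1.91 },
    { "handle_event",                 27, 2.00 },
    { "pb_lookup",                    12, 2.33 },
    { "cpu_access",                    7, 2.86 },
    { "perform_io",                    8, 2.38 },
    { "dml_lookup",                    6, 1.83 },
    { "call_hap_functions_serialize",  3, 3.00 },
    { "cancel",                        3, 2.33 },
};
#define N_STATS (sizeof stats / sizeof stats[0])

/* Estimated total time spent acquiring at one location, in microseconds. */
static double
total_us(const lock_stat_t *s)
{
    return s->count * s->avg_us;
}

/* Index of the row with the largest estimated total time. */
static size_t
busiest(const lock_stat_t *rows, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (total_us(&rows[i]) > total_us(&rows[best]))
            best = i;
    }
    return best;
}
```

With these numbers, get_cycles and post dominate the estimate even though their average times are among the lowest, which suggests looking at how often those locations cross a domain boundary rather than at the cost of a single crossing.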
