SIM_INTERFACE(decoder) {
        void (*register_decoder)(conf_object_t *obj,
                                 decoder_t *NOTNULL decoder);
        void (*unregister_decoder)(conf_object_t *obj,
                                   decoder_t *NOTNULL decoder);
};
The decoder interface is implemented by processors that allow user
decoders to be connected. This lets a user implement the semantics of
instructions that are not available in the standard Simics model, or
change the semantics of instructions implemented by Simics. This
interface replaces the SIM_register_arch_decoder and
SIM_unregister_arch_decoder functions.
The register_decoder function adds a decoder and unregister_decoder removes a decoder.
The decoder is installed or removed for every object of the same class as the obj argument, which must be the same object from which the interface was fetched.
When Simics decodes an instruction, it first checks whether any instruction decoders are registered for the current CPU class. Simics lets each decoder it finds try to decode the instruction. The decoders are called in order, starting with the most recently registered one, and as soon as one decoder accepts the instruction, the remaining decoders are not called.
The decoder is specified by the decoder_t data structure that the user supplies:
typedef struct {
        void *user_data;
        int (*NOTNULL decode)(uint8 *code, int valid_bytes,
                              conf_object_t *cpu, instruction_info_t *ii,
                              void *user_data);
        tuple_int_string_t (*NOTNULL disassemble)(uint8 *code,
                                                  int valid_bytes,
                                                  conf_object_t *cpu,
                                                  void *user_data);
        int (*NOTNULL flush)(instruction_info_t *ii, void *user_data);
} decoder_t;
The decode function is called to decode an instruction pointed to by code. The first byte corresponds to the lowest address of the instruction in the simulated memory. valid_bytes tells how many bytes can be read. The CPU is given in the cpu parameter.
When the decoder has successfully decoded an instruction, it should set the ii_ServiceRoutine, ii_Arg, and ii_Type members of the ii structure (see below) and return the number of bytes used in the decoding. If the decoder does not apply to the given instruction, it should return zero. If the decoder needs more data than valid_bytes, it should return a negative number whose magnitude is the total number of bytes it needs to continue the decoding. The underlying architecture limits the number of bytes that can be requested, e.g. no more than 4 bytes can be requested on most RISC architectures.
Simics will call the decoder again when more bytes are available. This process is repeated until the decoder accepts or rejects the instruction. A decoder should never request more data than it needs. For example, if an instruction can be rejected by looking at the first byte, the decoder should never ask for more bytes.
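The return-value protocol described above can be sketched as follows. This is a minimal illustration, not code from the Simics distribution: the two-byte instruction format (opcode byte 0xF1 followed by one operand byte), the names my_decode and my_service_routine, and the stand-in type declarations are all assumptions made so that the sketch compiles outside Simics. In a real module the types come from the Simics headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in declarations (assumptions) so the sketch compiles on its own. */
typedef struct conf_object conf_object_t;
typedef int exception_type_t;            /* stand-in for the real enum */
typedef exception_type_t (*service_routine_t)(conf_object_t *cpu,
                                              uint64_t arg, void *user_data);
typedef struct instruction_info {
        service_routine_t ii_ServiceRoutine;
        uint64_t ii_Arg;
        unsigned int ii_Type;
        void *ii_UserData;
        uint64_t ii_LogicalAddress;
        uint64_t ii_PhysicalAddress;
} instruction_info_t;
enum { UD_IT_SEQUENTIAL = 1, UD_IT_CONTROL_FLOW = 2 };  /* values assumed */

/* Hypothetical service routine; a real one implements the semantics. */
static exception_type_t
my_service_routine(conf_object_t *cpu, uint64_t arg, void *user_data)
{
        (void)cpu; (void)arg; (void)user_data;
        return 0;                        /* stand-in for Sim_PE_No_Exception */
}

/* Decoder for a hypothetical instruction: 0xF1 followed by one operand. */
static int
my_decode(uint8_t *code, int valid_bytes, conf_object_t *cpu,
          instruction_info_t *ii, void *user_data)
{
        (void)cpu; (void)user_data;
        if (valid_bytes < 1)
                return -1;               /* need the opcode byte first */
        if (code[0] != 0xF1)
                return 0;                /* rejected after one byte only */
        if (valid_bytes < 2)
                return -2;               /* total number of bytes needed */
        ii->ii_ServiceRoutine = my_service_routine;
        ii->ii_Arg = code[1];            /* operand passed to the routine */
        ii->ii_Type = UD_IT_SEQUENTIAL;
        return 2;                        /* bytes consumed */
}
```

Note that the decoder rejects a non-matching opcode before asking for the second byte, so it never requests more data than it needs.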
The instruction_info_t is defined as follows:
typedef struct instruction_info {
        service_routine_t  ii_ServiceRoutine;
        uint64             ii_Arg;
        unsigned int       ii_Type;
        lang_void         *ii_UserData;
        logical_address_t  ii_LogicalAddress;
        physical_address_t ii_PhysicalAddress;
} instruction_info_t;
ii_ServiceRoutine is a pointer to a function that will be called by Simics every time the instruction is executed. It has the following prototype:
typedef exception_type_t (*service_routine_t)(conf_object_t *cpu,
                                              uint64 arg,
                                              lang_void *user_data);
The service routine function should return an exception when it is
finished to signal its status. If no exception occurs,
Sim_PE_No_Exception should be returned.
See exception_type_t in src/include/simics/base/memory.h for the
different exceptions available.
A special return value, Sim_PE_Default_Semantics, can be returned; this
signals Simics to run the default semantics for the instruction. This
is useful if the semantics of an instruction should be changed but the
user routine does not want to handle it all the time.
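As a sketch of the fallback described above, the service routine below handles only some executions itself and defers the rest to Simics. The enum is a stand-in with assumed values (the real exception_type_t comes from the Simics headers), and the convention that the low bit of arg marks the cases handled by user code is purely hypothetical, as are the names selective_service_routine and handled_count.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in declarations (assumptions); real ones come from Simics. */
typedef struct conf_object conf_object_t;
typedef enum {
        Sim_PE_No_Exception,
        Sim_PE_Default_Semantics
} exception_type_t;

static unsigned handled_count;   /* hypothetical bookkeeping */

static exception_type_t
selective_service_routine(conf_object_t *cpu, uint64_t arg, void *user_data)
{
        (void)cpu; (void)user_data;
        if ((arg & 1) == 0)
                return Sim_PE_Default_Semantics;  /* let Simics run it */
        handled_count++;                          /* user semantics here */
        return Sim_PE_No_Exception;
}
```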
Note that in a shared memory multiprocessor, the CPU used in decoding may differ from the CPU that executes the instruction, since the decoded instructions may be cached.
ii_Arg is the argument arg that will be passed on to the service routine function. Op code bit-fields for the instruction such as register numbers or intermediate values can be stored here. The ii_UserData field can also be used to pass information to the service routine if more data is needed.
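One way to store several opcode fields in the 64-bit ii_Arg is to pack them in the decode function and unpack them in the service routine. The layout below (two 5-bit register numbers and a 16-bit constant) and all helper names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical layout of ii_Arg:
 *   bits 0-4   destination register number
 *   bits 5-9   source register number
 *   bits 10-25 16-bit constant from the instruction word */
static uint64_t
pack_arg(unsigned rd, unsigned rs, uint16_t value)
{
        return (uint64_t)(rd & 0x1f)
             | ((uint64_t)(rs & 0x1f) << 5)
             | ((uint64_t)value << 10);
}

/* Accessors for use in the service routine. */
static unsigned arg_rd(uint64_t arg)    { return (unsigned)(arg & 0x1f); }
static unsigned arg_rs(uint64_t arg)    { return (unsigned)((arg >> 5) & 0x1f); }
static uint16_t arg_value(uint64_t arg) { return (uint16_t)((arg >> 10) & 0xffff); }
```

The decode function would assign the result of pack_arg to ii->ii_Arg, and the service routine would read the fields back from its arg parameter.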
ii_Type is either UD_IT_SEQUENTIAL or UD_IT_CONTROL_FLOW. A sequential
type means that the instruction does not perform any branches and the
update of the program counter(s) is handled by Simics. In a control
flow instruction, on the other hand, it is up to the user to set the
program counter(s).
ii_LogicalAddress and ii_PhysicalAddress hold the logical and physical addresses of the instruction to be decoded.
The disassemble function is called to disassemble an instruction. It
uses the same code, valid_bytes, and cpu parameters as the decode
function. If the disassembly is valid, the string part of the returned
tuple_int_string_t struct should be a MALLOCed string with the
disassembly, and the integer part should be the length of the
instruction in bytes. The caller is responsible for freeing the
disassembly string. If the disassembly is not valid, the string member
should be NULL and the integer part should be zero. If the disassemble
function needs more data than valid_bytes, it should return a negative
number in the integer part, in the same way as the decode function,
and set the string part to NULL.
The flush function is called to free any memory allocated when
decoding an instruction and any user data associated with the
instruction. It should return zero if it does not recognize the
instruction, and non-zero if it has accepted it. Usually, the way to
recognize whether a decoded instruction is the right one to flush is
to compare ii->ii_ServiceRoutine with the function that was set in the
decode function. Note that the cpu parameter is the processor that
caused the flush. It is more or less an arbitrary processor and should
be ignored.
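A flush function that follows the recognition rule above might look like the sketch below. It assumes that the decode function stored heap-allocated per-instruction data in ii_UserData; that choice, the names my_flush and my_service_routine, and the stand-in type declarations are assumptions for illustration. Plain free stands in for the matching Simics deallocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in declarations (assumptions); real ones come from Simics. */
typedef struct conf_object conf_object_t;
typedef int exception_type_t;
typedef exception_type_t (*service_routine_t)(conf_object_t *cpu,
                                              uint64_t arg, void *user_data);
typedef struct instruction_info {
        service_routine_t ii_ServiceRoutine;
        uint64_t ii_Arg;
        unsigned int ii_Type;
        void *ii_UserData;
        uint64_t ii_LogicalAddress;
        uint64_t ii_PhysicalAddress;
} instruction_info_t;

/* Hypothetical service routine that the decode function installs. */
static exception_type_t
my_service_routine(conf_object_t *cpu, uint64_t arg, void *user_data)
{
        (void)cpu; (void)arg; (void)user_data;
        return 0;
}

static int
my_flush(instruction_info_t *ii, void *user_data)
{
        (void)user_data;
        if (ii->ii_ServiceRoutine != my_service_routine)
                return 0;               /* not decoded by this decoder */
        free(ii->ii_UserData);          /* per-instruction data, if any */
        ii->ii_UserData = NULL;
        return 1;                       /* accepted */
}
```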
In addition to the function pointers, the decoder_t structure contains
a user_data pointer that is passed to all the functions. This can be
used for passing any data to the decoder functions.