Device Modeling Language 1.2 Reference Manual

3 DML 1.2

This chapter describes the DML language, version 1.2. It will help to have read and understood the object model in the previous chapter before reading this chapter.

3.1 Overview

DML is not a general-purpose programming language, but a modeling language, targeted at writing Simics device models. The algorithmic part of the language is an extended subset of ISO C; however, the main power of DML is derived from its simple object-oriented constructs for defining and accessing the static data structures that a device model requires, and the automatic creation of bindings to Simics.

Furthermore, DML provides syntax for bit-slicing, which greatly simplifies the manipulation of bit fields in integers; new and delete operators for allocating and deallocating memory; a basic try/throw mechanism for error handling; built-in log and assert statements; and a powerful metaprogramming facility using templates, method inlining, and compile-time evaluation for shaping the generated code.

Most of the built-in Simics-specific logic is implemented directly in DML, in standard library modules that are automatically imported; the dmlc compiler itself contains as little knowledge as possible about the specifics of Simics.

3.2 Lexical Structure

For the most part, the lexical structure of DML resembles that of C. However, DML distinguishes between "object context" and "C context", so that some C keywords such as register, signed and unsigned, are allowed as identifiers in object context, while on the other hand many words that look like keywords in the object context, such as bank, event and data, are in fact allowed as identifiers in all contexts.

Another major difference from C is that names do not generally need to be defined before their first use. This is quite useful, but might sometimes appear confusing to C programmers.

Character encoding

DML source files are written using UTF-8 encoding. Non-ASCII characters are only allowed in comments and in string literals. String values are still handled as byte arrays, which means that a string value written with a literal of three characters may actually create an array of more than three bytes.

Reserved words

All ISO/ANSI C reserved words are reserved words in DML (even if currently unused). In addition, the C99 and C++ reserved words restrict, inline, this, new, delete, throw, try, catch, and template are also reserved in DML. The C++ reserved words class, namespace, private, protected, public, using, and virtual, are reserved in DML for future use; as are identifiers starting with an underscore (_).

The following words are reserved specially by DML: after, assert, call, cast, defined, error, foreach, in, is, local, log, parameter, select, sizeoftype, typeof, undefined, vect, and where.

Identifiers

Identifiers in DML are defined as in C; an identifier may begin with a letter or underscore, followed by any number of letters, digits, or underscores. Identifiers that begin with an underscore (_) are reserved by the DML language and standard library and should not be used. Within C context, unqualified object-context identifiers are always prefixed by a $ character.

Constant Literals

DML has literals for strings, characters, integers, booleans, and floating-point numbers. The integer literals can be written in decimal (01234), hexadecimal (0x12af), or binary (0b110110) form.

Underscores (_) can be used between digits, or immediately following the 0b, 0x prefixes, in integer literals to separate groups of digits for improved readability. For example, 123_456, 0b10_1110, 0x_eace_f9b6 are valid integer constants, whereas _78, 0xab_ are not.

String literals are surrounded by double quotes ("). To include a double quote or a backslash (\) in a string literal, precede them with a backslash (\" and \\, respectively). Newline, carriage return, tab and backspace characters are represented by \n, \r, \t and \b. Arbitrary byte values can be encoded as \x followed by exactly two hexadecimal digits, such as \x1f. Such escaped byte values are restricted to 00-7f for strings containing Unicode characters above U+007F.

Character literals consist of a pair of single quotes (') surrounding either a single printable ASCII character or one of the escape sequences \', \\, \n, \r, \t or \b. The value of a character literal is the character's ASCII value.
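As a brief illustration of the literal forms described above, the following sketch declares a few constants and local variables. The device name, the constant names, and the method are hypothetical and exist only for this example.

```dml
dml 1.2;
device literal_demo;

// Integer literals: decimal, binary and hexadecimal, with optional
// underscores between digits or right after the 0b/0x prefix.
constant WINDOW_SIZE = 123_456;
constant STATUS_MASK = 0b10_1110;
constant MAGIC       = 0x_eace_f9b6;

method show_literals() {
    // String literal with an escaped quote, backslash, tab and newline.
    local const char *s = "path \"C:\\tmp\"\tok\n";
    // \x is followed by exactly two hexadecimal digits.
    local const char *esc = "\x1f";
    // Character literal; its value is the ASCII code of newline (10).
    local char nl = '\n';
    // Boolean and floating-point literals.
    local bool done = true;
    local double ratio = 0.5;
}
```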

Comments

C-style comments are used in DML. This includes both in-line comments (/*...*/) and comments that continue to the end of the line (//...).

3.3 Module System

DML employs a very simple module system, where a module is any source file that can be imported using the import directive. Such files may not contain a device declaration, but otherwise look like normal DML source files. The imported modules are merged into the main program as if all the code was contained in a single file (with some exceptions). This is similar to C preprocessor #include directives, but in DML each imported file must be possible to parse in isolation, and may contain declarations (such as bitorder) that are only effective for that file.

3.4 Source File Structure

A DML source file describes both the structure of the modeled device and the actions to be taken when the device is accessed.

A DML source file defining a device starts with a language version declaration and a device declaration. After that, any number of parameter declarations, methods, data fields, object declarations, or global declarations can be written. A DML file intended to be imported (by an import statement in another DML file) has the same layout except for the device declaration.

3.4.1 Language Version Declaration

Every DML source file should contain a version declaration, of the form "dml m.n;", where m and n are nonnegative integers denoting the major and minor version of DML, respectively, that the source file is written in. The version declaration allows the dmlc compiler to select the proper versions of the DML parser and standard libraries to be used for the file. (Note that each file has its own individual language version, even if it is imported by a file using another version.) A file should not import a file with a higher language version than its own.

The version declaration must be the first declaration in the file, possibly preceded by comments. For example:

// My Device
dml 1.2;
...

This document describes DML 1.2. The compiler does not compile files written in DML 1.0.

3.4.2 Device Declaration

Every DML source file that contains a device declaration is a DML program, and defines a device class with the specified name. Such a file may import other files, but only the initial file may contain a device declaration.

The device declaration must be the first proper declaration in the file, only preceded by comments and the language version declaration. For example:

/*
 *  My New Device
 */
dml 1.2;
device my_device;
...

3.4.3 Source File Example

The following is an example of a small DML program defining a very simple device. This lacks many details that would appear in a real device.

dml 1.2;

device excalibur;

connect bus {
    interface pci;
}

bank config_registers {
    register cfg1 size 4 @ 0x0000 {
        field status {
            method read { ... }
            method write { ... }
        }
        field enable {
            method read { ... }
            method write { ... }
        }
    }
}

bank databank {
    register r1 size 4 @ 0x0000 {
        field f1 {
            method read { ... }
            method write { ... }
        }
    }
    register r2 size 4 @ 0x0004 {
        field f2 {
            method read { ... }
            method write { ... }
        }
    }
}

3.5 Parameter Declarations

A parameter declaration has the general form "parameter name specification;", where specification is either "= expr" or "default expr". For example:

parameter offset = 8;
parameter byte_order default "little-endian";

A default value is overridden by an assignment (=). There can be at most one assignment, and at most one default value, for each parameter. Typically, a default value for a parameter is specified in a template, and the programmer may then choose to override it where the template is used. See Section 3.11.2 for more information about templates.
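The interaction between a default value and an assignment can be sketched as follows. The template has_revision, the parameter revision, and the bank names are hypothetical and chosen only for illustration.

```dml
dml 1.2;
device param_demo;

// Hypothetical template: supplies a default that users may override.
template has_revision {
    parameter revision default 1;
}

bank a is (has_revision) {
    // No assignment here, so $revision evaluates to the default, 1.
}

bank b is (has_revision) {
    // The assignment overrides the default from the template.
    parameter revision = 2;
}
```

Adding a second assignment of revision to bank b would be an error, since each parameter may have at most one assignment.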

The specification part is in fact optional; if omitted, it means that the parameter is declared to exist (and must be given a value, or the program will not compile). This is sometimes useful in templates, as in:

template constant {
    parameter value;
    method get -> (v) {
        v = $value;
    }
}

so that wherever the template constant is used, the programmer is also forced to define the parameter value. E.g.:

register r0 size 2 @ 0x0000 is (constant) {
    parameter value = 0xffff;
}

Note that simply leaving out the parameter declaration from the template definition can have unwanted effects if the programmer forgets to specify its value where the template is used. At best, it will only cause a more obscure error message, such as "unknown identifier"; at worst, the scoping rules will select an unrelated definition of the same parameter name.

You may see the following special form in some standard library files:

parameter name auto;

for example,

parameter parent auto;

This is used to explicitly declare the built-in automatic parameters, and should never be used outside the libraries.

3.6 Data types

The type system in DML builds on the type system in C, with a few modifications. There are nine kinds of data types. New names for types can also be assigned using a typedef declaration.

Integers

Integer types guarantee a certain minimum bit width and may be signed or unsigned. The basic integer types are named uint1, uint2, ..., uint64 for the unsigned types, and int1, int2, ..., int64 for the signed types. Note that the size of the integer type is only a hint and the type is guaranteed to be able to hold at least that many bits. Assigning a value that would not fit into the type is undefined, thus it is an error to assume that values will be truncated. For bit-exact types, refer to bitfields and layout.

The familiar integer types char and int are available as aliases for int8 and int32, respectively. The C keywords short, signed, unsigned and long are reserved words in DML and not allowed in type declarations. The types size_t and uintptr_t are defined as in C.

Endian integers

Endian integer types hold similar values as integer types, but in addition have the following attributes:

They are guaranteed to be stored in the exact number of bytes required for their bitsize, without padding.
They have a defined byte order.
They have a natural alignment of 1 byte.

Endian integer types are named after the integer type with which they share a bitsize and sign, with an added _be_t or _le_t suffix for big-endian and little-endian integers, respectively. For example, uint8_be_t, uint8_le_t, int16_be_t, and uint24_le_t are all endian integer types. These types can be used interchangeably with regular integer types; values of one type are coerced to the other as needed. Note that operations on integers always produce regular integer types, even if all operands are of endian integer type. Assigning a value that would not fit into the type is undefined, thus it is an error to assume that values will be truncated.

Floating-point numbers

There is only one floating-point type, called double. Note that DMLC currently permits float as a type, but this is not officially supported and we may change some details in the semantics later. We generally recommend the use of double instead.

Booleans

The boolean type bool has two values, true and false.

Arrays

An array is a sequence of elements of another type, and works as in C.

Pointers

Pointers to types work as in C. String literals have the type const char *. A pointer has undefined meaning if the pointer target type is an integer whose bit-width is neither 8, 16, 32, nor 64.

Structures

A struct type defines a composite type that contains named members of different types. DML makes no assumptions about the data layout in struct types, but see the layout types below for that. Note that there is no struct label as in C, and struct member declarations are permitted to refer to types that are defined further down in the file. Thus, new struct types can always be declared using the following syntax:

typedef struct { member declarations } name;

Layouts

A layout is similar to a struct in many ways. The important difference is that there is a well-defined mapping between a layout object and the underlying memory representation, and layouts may specify that in great detail.

A basic layout type looks like this:

layout "big-endian" {
    uint24 x;
    uint16 y;
    uint32 z;
}

By casting a pointer to a piece of host memory to a pointer of this layout type, you can access the fourth and fifth byte as a 16-bit unsigned integer with big-endian byte order by simply writing p->y.

The allowed types of the layout members are integers, other layout types, bitfields (see below), and arrays of these.

The byte order declaration is mandatory, and is either "big-endian" or "little-endian".

Accessing a layout member does not always yield a value of the type used for the member in the declaration. Bitfields and integer members (and arrays of these) are translated to endian integers of matching size (or arrays of such), with endianness matching the layout. Layout and endian integer members are accessed normally.

Pointer arithmetic on layout members gives undefined behavior.
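The pointer cast described above might look like the following sketch. The type name packet_header_t and the method read_y are hypothetical, and the caller is assumed to pass a pointer to at least nine valid bytes.

```dml
dml 1.2;
device layout_demo;

// The layout from the text, given a name with typedef.
typedef layout "big-endian" {
    uint24 x;   // bytes 0..2
    uint16 y;   // bytes 3..4 (the fourth and fifth byte)
    uint32 z;   // bytes 5..8
} packet_header_t;

// Hypothetical helper: buf must point to at least 9 valid bytes.
method read_y(uint8 *buf) -> (uint16 y) {
    local packet_header_t *p = cast(buf, packet_header_t *);
    // p->y is accessed as a big-endian 16-bit integer and is coerced
    // to a regular integer on assignment.
    y = p->y;
}
```

With a buffer that starts with the bytes 00 00 01 12 34, the method yields 0x1234, independent of the byte order of the host.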

Bitfields

A bitfield type works similarly to an integer type where you use bit slicing to access individual bits, but where the bit ranges are assigned names. A bitfields declaration looks like this:

bitfields 32 {
    uint3  a @ [31:29];
    uint16 b @ [23:8];
    uint7  c @ [7:1];
    uint1  d @ [0];
}

The bit numbering is determined by the bitorder declaration in the current file.

Accessing bit fields is done as with a struct or layout, but the whole bitfield can also be used as an unsigned integer. See the following example:

local bitfields 32 { uint8 x @ [7:0] } bf;
bf = 0x000000ff;
bf.x = bf.x - 1;
local uint32 v = bf;

3.7 Methods

Methods are similar to C functions, but also have an implicit (invisible) parameter which allows them to refer to the current device instance, i.e., the Simics configuration object representing the device. Methods also support exception handling in DML, using try and throw. The body of the method is a compound statement in an extended subset of C; see also Section x. It is an error to have more than one method declaration using the same name within the same scope.

3.7.1 Input and Output Parameters

In addition to the input parameters, a DML method can have any number of output parameters, in contrast to C functions which have at most one return value. DML methods do not use the keyword void; an empty pair of parentheses always means "zero parameters", and can even be omitted. Apart from this, the parameter declarations of a method are ordinary C-style declarations.

For example,

method m1 {...}

and

method m1() {...}

are equivalent, and define a method that takes no input parameters and returns nothing.

method m2(int a) -> () {...}

defines a method that takes a single input parameter, and also returns nothing.

method m3(int a, int b) -> (int c) {
  ...;
  c = ...;
}

defines a method with two input parameters and a single output parameter. Output parameters must be explicitly assigned before the method returns.

method m4() -> (int c, int d) {
  ...;
  c = ...;
  d = ...;
  return;
}

has no input parameters, but two output parameters. Ending the method body with a return statement is optional.

If it is statically known that a method cannot throw an exception, then it may be declared as nothrow:

method m5(int x) -> (int y) nothrow {
    y = x * x;
}

This allows the method to be called as an expression, without a call statement:

log info: "m5(4) is %d", $m5(4);

If a method is declared nothrow, then any throwing statement inside the method must be enclosed in a try block. A throw statement counts as throwing, as does a call or inline statement, unless the method it invokes is declared nothrow.
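A minimal sketch of this rule follows. Both methods are hypothetical: checked_div may throw and is therefore not declared nothrow, while safe_div may be declared nothrow because its only throwing statement is enclosed in a try block.

```dml
dml 1.2;
device nothrow_demo;

// Hypothetical method that may throw, so it is not declared nothrow.
method checked_div(int a, int b) -> (int q) {
    if (b == 0)
        throw;
    q = a / b;
}

// nothrow is permitted: the throwing call is enclosed in a try block.
method safe_div(int a, int b) -> (int q) nothrow {
    try {
        call $checked_div(a, b) -> (q);
    } catch {
        // Fallback value chosen arbitrarily for this example.
        q = 0;
    }
}
```

Since safe_div is nothrow, it can be used as an expression, for instance $safe_div(6, 3), whereas checked_div can only be invoked through a call or inline statement.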

Note: In DML 1.4, nothrow is the default, and methods that can throw an exception must explicitly declare that with a throws annotation.

3.7.2 Default Methods

A method can be declared as the default implementation with the default keyword:

method m() default {...}

A default method can be overridden once by defining a method with the same name and signature in the same object but without the default keyword. This is especially useful in templates (see section 3.11.2) to provide some default behavior while allowing the user to customize that behavior by overriding a default method where the template is instantiated.

3.7.3 Calling Methods

In DML, a method call is performed with one of the call and inline keywords. For instance,

call $access(...) -> (a, b);

will call the method 'access' in the same object, assigning the values of the output parameters to variables a and b. (Note the '$' character which is necessary for referring to the method.) The call might be inline expanded depending on the C compiler used to compile the generated code, but there is no guarantee.

On the other hand,

inline $access(...) -> (a, b);

has the same semantics as the call, but will always inline expand the called method.

It is possible to invoke the default method in the overriding method using the following special method call syntax (note that there is no $ before default):

call default(...) -> (...);

or

inline default(...) -> (...);

to expand the overridden method inline.
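The following sketch shows an override that delegates to the default implementation. The template scaler, the method scale, and the bank regs are hypothetical names used only for this example.

```dml
dml 1.2;
device default_demo;

// Hypothetical template providing an overridable default method.
template scaler {
    method scale(int x) -> (int y) default {
        y = x;
    }
}

bank regs is (scaler) {
    // Overrides the default: same name and signature, no default keyword.
    method scale(int x) -> (int y) {
        // Run the overridden implementation first (no $ before default),
        // then adjust its result.
        call default(x) -> (y);
        y = y * 2;
    }
}
```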

3.7.4 Methods as Macros

Methods can also be used as macros, by omitting the types on one or more of the input parameters. A method defined this way can only be called through an inline statement; see Section 3.13.5.

For example,

method twice(x) -> (y) { y = x + x; }

could be used to double any numeric value without forcing the result to be of any particular type, as in

local int32 x;
local int64 y;
inline $twice(x) -> (x);
inline $twice(y) -> (y);

This is sometimes referred to as a polymorphic method.

This form of macros is particularly useful when writing templates (see Section 3.11.2). Note that the name scoping rules and the semantics of calls are the same as for normal methods; in other words, it is a form of "hygienic macros".

3.7.5 External Methods

A method can be declared external, which means that a C function corresponding to the method is guaranteed to be generated. This is done by prefixing the name of the method with the keyword extern; e.g.:

method extern my_method(int x) { ... }

An external method must have a proper signature, i.e., the types of all its input and output parameters must be specified.

External methods are rarely used, since most of the needs for making DML methods accessible from outside the device model itself are covered by the implement, event, and attribute type objects. Note that the generated functions will always have static linkage.

3.8 Data Fields

A data declaration creates a named storage location for an arbitrary run-time value. The name belongs to the same namespace as objects and methods. The general form is:

data declaration = initializer;

where = initializer is optional and declaration is similar to a C variable declaration; for example,

data int id = 1;
data bool active;
data double table[4] = {0.1, 0.2, 0.4, 0.8};
data conf_object_t *obj;

In the absence of an explicit initializer expression, a default "all zero" initializer will be applied to the declared object.

Note that the number of elements in the initializer must match the number of elements or fields of the type of the data object. This also implies that each sub-element, if it is itself a compound data structure, must also be enclosed in braces. C99-style designated initializers are not supported.
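The brace rules for compound initializers can be illustrated with the following sketch, in which the struct type point_t and the data field names are hypothetical.

```dml
dml 1.2;
device data_demo;

typedef struct {
    int x;
    int y;
} point_t;

// One value per struct member.
data point_t origin = {0, 0};

// Two array elements; each is a struct and gets its own braces.
data point_t corners[2] = {{0, 0}, {4, 3}};

// No initializer: every element starts out as zero.
data uint8 scratch[16];
```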

3.9 Object Declarations

The general form of an object declaration is "type name extras is (template ...) desc { ... }" or "type name extras is (template ...) desc;", where type is an object type such as bank, name is an identifier naming the object, and extras is optional special notation which depends on the object type. The is (template ...) part is optional and will make the object use the named templates. The desc is an optional string constant giving a very short summary of the object. It can consist of several string literals concatenated by the '+' operator. Ending the declaration with a semicolon is equivalent to ending with an empty pair of braces. The body (the section within the braces) may contain parameter declarations, methods, data fields, and object declarations.

For example, a register object may be declared as

register r0 @ 0x0100 "general-purpose register 0";

where the "@ offset" notation is particular for the register object type; see below for details.

Using is (template1, template2) is equivalent to using is statements in the body, so the following two declarations are equivalent:

register r0 @ 0x0100 is (read_only,autoreg);

register r0 @ 0x0100 {
    is read_only;
    is autoreg;
}

An object declaration with a desc section, of the form

type name ... desc { ... }

is equivalent to defining the parameter desc, as in

type name ... {
    parameter desc = desc;
    ...
}

In the following sections, we will leave out desc from the object declarations, since it is always optional. Another parameter, documentation (for which there is no short-hand), may also be defined for each object, and for some object types it is used to give a more detailed description. See Section 4.1.1 for details.

If two object declarations with the same name occur within the same containing object, and they specify the same object type, then the declarations are concatenated; e.g.,

bank b {
    register r size 4 { ...body1... }
    ...
    register r @ 0x0100 { ...body2... }
    ...
}

is equivalent to

bank b {
    register r size 4 @ 0x0100  {
        ...body1...
        ...body2...
    }
    ...
}

However, it is an error if the object types should differ.

Most object types (bank, register, group, attribute, connect, event, and port) may be used in arrays. The general form of an object array declaration is

type name [variable in 0..max]... extras { ... }

Here, each variable defines the name of one index in the array and 0..max defines the size of the corresponding dimension of the array. Each variable defines a parameter in the object scope, and thus must be unique. max must be a compile time constant. For instance,

register regs[i in 0..15] size 2 {
    parameter offset = 0x0100 + 2 * $i;
    ...
}

or written more compactly

register regs[i in 0..15] size 2 @ 0x0100 + 2 * $i;

defines an array named regs of 16 registers (with index 0 to 15) of 2 bytes each, whose offsets start at 0x0100. There is also a special syntax "type name [size] ..."; this is equivalent to declaring "type name [i in 0..size-1]...". See Section 4.1.2 for details about arrays and index parameters.

The following sections give further details on declarations for object types that have special conventions.

3.9.1 Bank Declarations

The general form of a bank declaration is

bank name { ... }

where name may be omitted. The elements (e.g., registers) of a bank that has no name belong to the namespace of the parent object, i.e., the device. There is at most one such anonymous bank object per device and it is not possible to define an anonymous bank array object; multiple "bank { ... }" declarations are concatenated.

3.9.2 Register Declarations

The general form of a register declaration is

register name size n @ d is (templates) { ... }

Each of the "size n", "@ d", and "is (templates)" sections is optional, but if present, they must be specified in the above order.

A declaration

register name size n ... { ... }

is equivalent to

register name ... { parameter size = n; ... }

A declaration

register name ... @ d ... { ... }

is equivalent to

register name  ... { parameter offset = d; ... }

A declaration

register name ... is (t1,...,tN) { ... }

is equivalent to

register name  ... { is t1; ... is tN; ... }

Templates are further described in Section 3.11.2.

3.9.3 Field Declarations

The general form of a field declaration is

field name [highbit:lowbit] is (templates) { ... }

or simply

field name [bit] ... { ... }

specifying a range of bits of the containing register, where the syntax [bit] is short for [bit:bit]. Both the "[...]" and the is (templates) sections are optional; in fact, the "[...]" syntax is merely a much more convenient way of defining the (required) field parameters lsb and msb (cf. Section 4.1.8).

A declaration

field name [high:low] is (t1,...,tN) { ... }

is equivalent to

field name [high:low] { is t1; ... is tN; ... }

For a range of two or more bits, the first (leftmost) number always indicates the most significant bit, regardless of the bit numbering scheme used in the file. This corresponds to how bit fields are usually visualized, with the most significant bit to the left.

The bits of a register are always numbered from zero to n - 1, where n is the width of the register. If the default little-endian bit numbering is used, the least significant bit has index zero, and the most significant bit has index n - 1. In this case, a 32-bit register with two fields corresponding to the high and low half-words may be declared as

register HL size 4 ... {
    field H [31:16];
    field L [15:0];
}

If instead big-endian bit numbering is selected in the file, the most significant bit has index zero, and the least significant bit has the highest index. In that case, the register above may be written as

register HL size 4 ... {
    field H [0:15];
    field L [16:31];
}

This is useful when modeling a system where the documentation uses big-endian bit numbering, so it can be compared directly to the model.

If a register contains a set of fields with similar functionality, then these may be represented as a field array. In this case, the array size must be specified using the verbose i in 0..max syntax, and the bit range specification is preceded by the @ character, like:

register R size 4 ... {
    field A[i in 0..2] @ [i*3 + 2 : i * 3];
}

3.10 Conditional Objects

It is also possible to conditionally include or exclude one or more object declarations, depending on the value of a boolean expression. This is especially useful when reusing source files between several similar models that differ in some of the details.

The syntax is very similar to the if statements used in methods.

if ($enable_target) {
    connect target {
        interface signal;
    }
}

One difference is that the braces are required. It is also possible to add else branches, or else-if branches.

if ($modeltype == "Mark I") {
    ...
} else if ($modeltype == "Mark II") {
    ...
} else {
    ...
}

The general syntax is

if ( conditional ) { object declarations ... }
else if ( conditional ) { object declarations ... }
...
else { object declarations ... }

The conditional is an expression with a constant boolean value. It may use parameters declared at the same level in the object hierarchy, or in parent levels.

The object declarations are any number of declarations of objects or methods, but not parameters. When the conditional is true (or if it's the else branch of a false conditional), the object declarations are treated as if they had appeared without any surrounding if. For example, the declaration

if (true) {
    register R size 4;
} else {
    register R size 2;
}

is equivalent to

register R size 4;

3.11 Global Declarations

The following sections describe the global declarations in DML. These can only occur on the top level of a DML program, i.e., not within an object or method. Unless otherwise noted, their scope is the entire program.

3.11.1 Import Declarations

import filename;

Imports the contents of the named file. filename must be a string literal, such as "utility.dml". The -I option to the dmlc compiler can be used to specify directories to be searched for import files.

If filename starts with ./, the compiler disregards the -I flag, and the path is instead interpreted relative to the directory of the importing file.
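The two lookup rules can be sketched as follows. The file names my_device.dml, utility.dml, and registers.dml are hypothetical.

```dml
// File: my_device.dml (hypothetical)
dml 1.2;
device my_device;

// Plain name: searched for in the directories given with the -I option.
import "utility.dml";

// Leading ./ : resolved relative to the directory containing
// my_device.dml; the -I option is disregarded.
import "./registers.dml";
```

Each imported file would start with its own "dml 1.2;" declaration and contain no device declaration.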

Note that imported files are parsed as separate units, and use their own language version and bit order declarations.

A DML 1.2 file is allowed to import a DML 1.4 file, but the exact semantics are not well-specified: The standard libraries of DML 1.4 and DML 1.2 are quite different, and the compiler contains workarounds to bridge some of the differences; these workarounds may evolve in incompatible ways.

3.11.2 Template Declarations

template name { ... }

Defines a template, a piece of code that can be reused in multiple locations. The body of the template contains a number of declarations that will be added to any object that uses the template.

Templates are imported into an object declaration body using is statements, written as

is name;

is (name1, name2);

It is also possible to use templates when declaring an object or a template, as in

field F is name;

template T is (name1,name2) { ... }

These can be used in any context where an object declaration may be written, and have the effect of expanding the body of the template at the point of the is. Note that the expansion is purely textual, so, e.g., two templates which define methods with the same name cannot both be used in the same context.
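As a small sketch of both ways of using a template, consider the following. The template described, its parameter label, and the bank names are hypothetical.

```dml
dml 1.2;
device template_demo;

// Hypothetical template: adds a parameter and a method to every
// object that uses it.
template described {
    parameter label default "unnamed";
    method get_label() -> (const char *s) {
        s = $label;
    }
}

bank regs {
    // is statement inside the body of the object.
    is described;
    parameter label = "main register bank";
}

// is clause in the object declaration itself; label keeps its default.
bank debug is (described);
```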

3.11.3 Bitorder Declarations

bitorder order;

Selects the default bit numbering scheme to be used for interpreting bit-slicing expressions and bit field declarations in the file. The order is one of the identifiers le or be, implying little-endian or big-endian, respectively. The little-endian numbering scheme means that bit zero is the least significant bit in a word, while in the big-endian scheme, bit zero is the most significant bit.

A bitorder declaration should be placed before any other global declaration in each DML file, but must follow immediately after the device declaration if one is present. The scope of the declaration is the whole of the file it occurs in. If no bitorder declaration is present in a file, the default bit order is le (little-endian). The bitorder does not extend to imported files; for example, if a file containing a declaration "bitorder be;" imports a file with no bit order declaration, the latter file will still use the default le order.

3.11.4 Constant Declarations

constant name = expr;

Defines a named constant which can be used in C context. expr must be a constant-valued expression.
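A possible use of constants is sketched below. The constant names and the method channel_offset are hypothetical.

```dml
dml 1.2;
device constant_demo;

constant NUM_CHANNELS   = 4;
constant CHANNEL_STRIDE = 0x10;

// Hypothetical helper computing the offset of a channel.
method channel_offset(int ch) -> (int offs) nothrow {
    // Constants are referenced by their plain name in C context.
    offs = (ch % NUM_CHANNELS) * CHANNEL_STRIDE;
}
```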

3.11.5 Loggroup Declarations

loggroup name;

Defines a log group, for use in log statements. More generally, the identifier name is bound to an unsigned integer value that is a power of 2, and can be used anywhere in C context; this is similar to a constant declaration, but the value is allocated automatically so that all log groups are represented by distinct powers of 2 and can be combined with bitwise or (see Section 3.13.7).
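The following sketch combines two log groups in a log statement. The group names and the method are hypothetical, and the log statement is assumed to take the form "log type, level, groups: format" described in Section 3.13.7.

```dml
dml 1.2;
device loggroup_demo;

// Each group is bound to a distinct power of 2.
loggroup lg_dma;
loggroup lg_irq;

method report() {
    // Level 2, tagged with both groups via bitwise or.
    log info, 2, lg_dma | lg_irq: "DMA transfer raised an interrupt";
    // Level 1, tagged with a single group.
    log info, 1, lg_irq: "interrupt acknowledged";
}
```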

3.11.6 Typedef Declarations

typedef declaration;
extern typedef declaration;

Defines a name for a data type.

When the extern form is used, the type is assumed to exist in the C environment. No definition of the type is added to the generated C code, and the generated C code may assume that the data type supports the operations of the declared type.
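A typical use of the extern form pairs it with a C definition made available through a header section, as in the following sketch. The type my_msg_t and the method msg_id are hypothetical.

```dml
dml 1.2;
device extern_typedef_demo;

header %{
/* Hypothetical C type supplied by the C environment. */
typedef struct { int id; int flags; } my_msg_t;
%}

// Describes the type to DML; no C definition is generated for it.
extern typedef struct {
    int id;
    int flags;
} my_msg_t;

method msg_id(my_msg_t *m) -> (int id) nothrow {
    id = m->id;
}
```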

3.11.7 Struct Declarations

struct name { declarations }

This is the same as writing typedef struct { declarations } name;

3.11.8 Extern Declarations

extern declaration;

Declares an external identifier, similar to a C extern declaration; for example,

extern char *motd;
extern double table[16];
extern conf_object_t *obj;
extern int foo(int x);
extern int printf(const char *format, ...);

3.11.9 Header Declarations

header %{
...
%}

Specifies a section of C code which will be included verbatim in the generated C header file for the device. There must be no whitespace between the % and the corresponding brace in the %{ and %} markers. The contents of the header section are not examined in any way by the dmlc compiler; declarations made in C code must also be specified separately in the DML code proper.

This feature should only be used to solve problems that cannot easily be handled directly in DML. It is most often used to make the generated code include particular C header files, as in:

header %{
#include "extra_defs.h"
%}

The expanded header block will appear in the generated C file, which usually is in a different directory than the source DML file. Therefore, when including a file with a relative path, the C compiler will not automatically look for the .h file in the directory of the .dml file, unless a corresponding -I flag is passed. To avoid this problem, DMLC inserts a C macro definition to permit including a companion header file. For instance, if the file /path/to/hello-world.dml includes a header block, then the macro DMLDIR_HELLO_WORLD_H is defined as the string "/path/to/hello-world.h" within this header block. This allows the header block to contain #include DMLDIR_HELLO_WORLD_H, as a way to include hello-world.h without passing -I/path/to to the C compiler.

DMLC defines only one such macro in each header block, derived from the DML file name by replacing the .dml suffix with .h. Furthermore, the macro is undefined after the header block. Hence, the macro can only be used to access one specific companion header file; if other header files are desired, then #include directives can be added to the companion header file, where relative paths are expanded as expected.
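For the hello-world.dml case described above, the header block might look like the following sketch. The file path is the one used in the text; nothing else is assumed.

```dml
// File: /path/to/hello-world.dml
header %{
// Expands to "/path/to/hello-world.h"; only defined inside this block
#include DMLDIR_HELLO_WORLD_H
%}
```

Any further headers would then be pulled in from hello-world.h itself, using ordinary relative #include paths.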

3.12 Comparison to C/C++

The algorithmic language used to express method bodies in DML is an extended subset of ISO C, with some C++ extensions such as new and delete. The DML-specific statements and expressions are described in Sections 3.13 and 3.14.

DML defines the following additional built-in data types:

int1, ..., int64, uint1, ..., uint64: Signed and unsigned specific-width integer types. Widths from 1 to 64 are allowed.
bool: The generic boolean datatype, consisting of the values true and false. It is not an integer type, and the only implicit conversion is to uint1.

DML also supports the non-standard C extension typeof(expr) operator, as provided by some modern C compilers such as GCC.

Most forms of C statements and expressions can be used in DML, with the exceptions listed below. Some of these limitations may be removed in a later version of DML.

Local variable declarations must use the keyword local or static, as in
```
method m() {
    static int call_count = 0;
    local int n = 0;
    local float f;
    ...
}
```
(only one variable can be introduced per declaration). Static variables have a meaning similar to that in C: they retain their value across calls. Note, however, that a static variable in DML is per device object; it is not shared globally between device instances. For symmetry with C, the keyword auto may be used as a synonym for local.
Plain C functions (i.e., not DML methods) can be called using normal function call syntax, as in f(x).

C functions must either be defined in the footer or header sections of the DML file, or in one or more C source code files that are compiled or linked to separately.

See sections 3.11.9 and 3.11.10 for more detail on header and footer.

In either case, the functions must also be declared as extern in the DML source code. For example:

foo.c
```
int foo(int i)
{
    return ~i + 1;
}
```
bar.dml
```
// tell DML that these functions are available
extern int foo(int);
extern int bar(int);

header %{
    // tell generated C that these functions are available
    int foo(int);  // defined in separate C file
    int bar(int);  // defined in the DML footer section
%}

footer %{
    int bar(int i)
    {
        return -i;
    }
%}
```
Makefile
```
SRC_FILES=foo.c bar.dml
```
return statements do not take a return value as argument; output parameters of methods must be assigned explicitly.
Type casts must be written as cast(expr, type).
Comparison operators and logical operators produce results of type bool, not integers.
Conditions in if, for, while, etc. must be proper booleans; e.g., if (i == 0) is allowed, and if (b) is allowed if b is a boolean variable, but if (i) is not, if i is an integer.
The sizeof operator can only be used on expressions. To take the size of a datatype, the sizeoftype operator must be used.
Declarations are not allowed in the first clause of for-statements.
Comma-expressions are only allowed in the head of for-statements, as in
```
for (i = 10, k = 0; i > 0; --i, ++k) ...
```
delete and throw can only be used as statements in DML, not as expressions.
throw does not take any argument, and catch cannot switch on the type or value of an exception.
Type declarations do not allow the use of union. However, the extern typedef construct can be used to achieve the same result. For example, consider the union data type declared in C as:
```
typedef union { int i; bool b; } u_t;
```
The data type can be exposed in DML as follows:
```
header %{
  typedef union { int i; bool b; } u_t;
%}
extern typedef struct { int i; bool b; } u_t;
```
This will make u_t look like a struct to DML, but since union and struct syntax is identical in C, the C code generated from uses of u_t will work correctly together with the definition from the header declaration.
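Several of the differences listed above can be seen together in one small sketch. The methods clamp and demo are hypothetical, and the fragment is assumed to sit at the top level of a device file.

```dml
method clamp(int v, int limit) -> (int result) {
    local bool too_big = v > limit;   // comparison yields bool, not int
    if (too_big) {                    // conditions must be bool
        result = limit;               // output parameter assigned explicitly
        return;                       // return takes no value
    }
    result = v;
}

method demo() {
    local int n;
    local int i;                      // one variable per declaration
    local int k;
    local uint8 low;
    call $clamp(300, 255) -> (n);
    low = cast(n, uint8);             // written as (uint8) n in C
    if (n != 0)                       // "if (n)" would be rejected
        log info: "clamped to %d (low byte %d)", n, low;
    // no declaration in the first clause; commas are allowed only here
    for (i = 10, k = 0; i > 0; --i, ++k) {
    }
}
```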

3.13 Statements

All ISO C statements are available in DML, and have the same semantics as in C. Like ordinary C expressions, all DML expressions can also be used in expression-statements.

DML adds the following statements:

3.13.1 Delete Statements

delete expr;

Deallocates the memory pointed to by the result of evaluating expr. The memory must have been allocated with the new operator, and must not have been deallocated previously. Equivalent to delete in C++; however, in DML, delete can only be used as a statement, not as an expression.

3.13.2 Try Statements

try protected-stmt catch handle-stmt

Executes protected-stmt; if that completes normally, the whole try-statement completes normally. Otherwise, handle-stmt is executed. This is similar to exception handling in C++, but in DML there is only one kind of exception. Note that Simics C-exceptions are not handled. See also throw.

3.13.3 Throw Statements

throw;

Throws (raises) an exception, which may be caught by a try-statement. Exceptions are propagated over method call boundaries. This is similar to throw in C++, but in DML it is not possible to specify a value to be thrown. Furthermore, in DML, throw is a statement, not an expression.
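A minimal sketch of try and throw working together across a method call. The method names check_length and handle, and the limit of 64, are made up for illustration.

```dml
method check_length(int len) {
    if (len < 0 || len > 64)
        throw;                 // no value can be attached
}

method handle(int len) {
    try {
        call $check_length(len);
        log info: "length %d accepted", len;
    } catch {
        // reached when check_length throws; the exception has
        // propagated across the method call boundary
        log error: "length %d rejected", len;
    }
}
```

Because there is only one kind of exception, the catch block cannot tell why the protected statement failed; any such detail has to be communicated by other means, such as a log message at the throw site.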

3.13.4 Call Statements

call method(e1, ... eN) -> (d1, ... dM);

Calls a DML method with input arguments e1, ... eN and output destinations d1, ... dM. The destinations are usually variables, but they can be arbitrary L-values (even bit slices) as long as their types match the method signature.

If the method has no output parameters, the -> () part may be omitted, as in

call p(...);

which is equivalent to call p(...) -> ();.

If the method has no input parameters, the empty pair of parentheses may also be omitted, as in

call q -> (...);

which is equivalent to call q() -> (...);.

A method with neither input nor output parameters may thus be called simply as

call me;
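The four call forms can be compared side by side in the following sketch. All method names are hypothetical, and method references are written with the $ prefix as in Section 3.14.2.

```dml
method divmod(int a, int b) -> (int q, int r) {
    q = a / b;
    r = a % b;
}
method get_count() -> (int n) { n = 42; }
method set_mode(int mode) { }
method reset_state() { }

method demo() {
    local int q;
    local int r;
    local int n;
    call $divmod(17, 5) -> (q, r);   // inputs and outputs
    call $set_mode(2);               // no outputs: "-> ()" omitted
    call $get_count -> (n);          // no inputs: "()" omitted
    call $reset_state;               // neither inputs nor outputs
}
```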

3.13.5 Inline Statements

inline method(e1, ... eN) -> (d1, ... dM);

This is equivalent to call method(e1, ... eN) -> (d1, ... dM); but the code for the called method is expanded at the place of the inline call, and may be partly specialized to the values of any input arguments that are constant at DML compile time.

Furthermore, methods that are only intended for inlining may be declared as a form of polymorphic hygienic macros; see Section 3.7.4.
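As an illustration of specialization, the sketch below assumes the untyped-parameter form of method declaration described in Section 3.7.4. The methods scale and demo are hypothetical.

```dml
// "factor" is left untyped, so this method is meant for inlining only
method scale(int v, factor) -> (int result) {
    if (factor == 1)
        result = v;              // chosen at compile time for factor == 1
    else
        result = v * factor;
}

method demo(int x) -> (int y) {
    // the constant argument lets the compiler drop the multiplication
    inline $scale(x, 1) -> (y);
}
```

Here the inlined body is expected to reduce to a plain assignment of x to y, since the test on factor involves only a compile-time constant.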

3.13.6 After Statements

after (time) call method(e1, ... eN);

The after construct sets up an asynchronous event which will perform the specified method call with the provided arguments at the given time into the future (in simulated time, measured in seconds) relative to the time when the after statement is executed. For example:

after (0.1) call $my_callback(1, false);

This is equivalent to creating a named event object with an event-method that performs the specified call, and posting that event at the given time, with associated data corresponding to the provided arguments; see Section 2.11.

Each argument to the called method is evaluated at the time the after statement is executed and the event is posted.

If the method has no input parameters, the empty pair of parentheses may be omitted, as in

after (0.1) call $q;

To allow the posted event to be checkpointed, after statements may only be performed with methods that have no output parameters, and where each input parameter is of serializable type. Serializable types are currently restricted to primitive data types, or structs or arrays containing only data types that are serializable. This means that after statements cannot be used with methods that e.g. have pointer parameters.
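A small sketch of a delayed call that satisfies these restrictions. The methods raise_irq and start_transfer are hypothetical, and the callback takes only an int, which is a serializable type.

```dml
method raise_irq(int line) {
    log info: "raising interrupt line %d", line;
}

method start_transfer(int line) {
    // "line" is evaluated now; the call itself happens after
    // 0.5 seconds of simulated time
    after (0.5) call $raise_irq(line);
}
```

A callback declared with, say, a char * parameter or an output parameter would not be accepted in this position.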

3.13.7 Log Statements

log log-type, level, groups: format-string, e1, ..., eN;

Outputs a formatted string to the Simics logging facility. The string following the colon is a normal C printf format string, optionally followed by one or more arguments separated by commas. (The format string should not contain the name of the device, or the type of the message, e.g., "error:..."; these things are automatically prefixed.) Either both of level and groups may be omitted, or only the latter; i.e., if groups is specified, then level must also be given explicitly.

A Simics user can configure the logging facility to show only specific messages, by matching on the three main properties of each message:

The log-type specifies the general category of the message. The value must be one of the identifiers info, error, critical, spec_viol, or unimpl. Log types can also be written using the alternative syntaxes "info", "error", "critical", "spec_violation", and "unimplemented", respectively. This alternative syntax is retained only for backward compatibility and may be removed in future language versions.
The level specifies at what verbosity level the log messages are displayed. The value must be an integer from 1 to 4; if omitted, the default level is 1. The different levels have the following meaning:
1. Important messages (displayed at the normal verbosity level)
2. High level informative messages (like mode changes and important events)
3. Medium level information (the lowest log level for SW development)
4. Debugging level with low level model detail (Mainly used for model development)
The groups argument is an integer whose bit representation is used to select which log groups the message belongs to. If omitted, the default value is 0. The log groups are specific to the device, and must be declared using the loggroup device-level declaration. For example, a DML source file containing the declarations
```
loggroup good;
loggroup bad;
loggroup ugly;
```
could also contain a log statement such as
```
log info, 2, (bad | ugly): "...";
```
(note the | bitwise-or operator), which would be displayed if the user chooses to view messages from group bad or ugly, but not if only group good is shown.

Groups allow the user to create arbitrary classifications of log messages, e.g., to indicate things that occur in different states, or in different parts of the device, etc. The two log groups Register_Read and Register_Write are predefined by DML, and are used by several of the built-in methods.

The format-string should be one or several string literals concatenated by the '+' operator, all optionally surrounded by round brackets.
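The optional parts of the statement can be combined as in the following sketch. The log groups dma and irq and the method report are hypothetical.

```dml
loggroup dma;
loggroup irq;

method report(int channel) {
    log info: "device reset";                    // level 1, groups 0
    log info, 3: "channel %d idle", channel;     // explicit level only
    log spec_viol, 1, dma: "write to busy channel %d", channel;
    // a long format string split with the + operator
    log unimpl, 2, (dma | irq):
        "scatter-gather interrupts " + "are not modelled";
}
```

Note that a form giving groups without a level is not available; the third statement has to spell out level 1 in order to name the dma group.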

See also Simics Model Builder User's Guide, section "Logging", for further details.

3.13.8 Assert Statements

assert expr;

Evaluates expr. If the result is true, the statement has no effect; otherwise, a runtime error is generated. expr must have type bool.
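A brief sketch, using a hypothetical method select_channel, of an assertion that guards an index. Since the operand must be of type bool, an integer or pointer has to be compared explicitly rather than used directly.

```dml
method select_channel(int idx) {
    assert idx >= 0 && idx < 4;   // a bool expression, as required
    log info: "channel %d selected", idx;
}
```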

3.13.9 Error Statements

error string;

Attempting to compile an error statement causes the compiler to generate an error, using the specified string as error message. The string may be omitted; in that case, a default error message is used.

The string, if present, should be one or several string literals concatenated by the '+' operator, all optionally surrounded by round brackets.
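One possible use is to reject invalid parameter settings at compile time. The sketch below uses a hypothetical parameter num_ports and method check_config. It assumes that an if statement with a compile-time constant condition compiles only the selected branch, as in the foreach example of Section 3.13.10.

```dml
parameter num_ports = 4;

method check_config() {
    // the condition is a compile-time constant, so the error statement
    // is only compiled when num_ports exceeds 8
    if ($num_ports > 8)
        error "num_ports must not exceed 8";
}
```

The check only takes effect where check_config is actually called or inlined.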

3.13.10 Foreach Statements

foreach identifier in (expr) statement

The foreach statement repeats its body (the statement part) once for each element in the list given by expr. The identifier is used to refer to the current element within the body. It is not used with a $ prefix.

If expr is a list, it is always a DML compile-time constant, and in that case the loop is completely unrolled by the DML compiler. This can be combined with tests on the value of identifier within the body, which will be evaluated at compile time.

For example:

foreach x in ([3,2,1]) {
    if (x == 1) foo();
    else if (x == 2) bar();
    else if (x == 3) baz();
    else error "out of range";
}

would be equivalent to

baz();
bar();
foo();

Only if can be used to make such selections; switch statements are not evaluated at compile time. (Also note the use of error above to catch any compile-time mistakes.)

The break statement can be used within a foreach loop to exit it.

3.13.11 Select Statements

select identifier in (expr) where (cond-expr) statement else default-statement

The select statement resembles a C switch statement and it is very similar to the foreach statement, but executes the statement exactly once for the first matching element in the list given by expr, i.e., for the first element such that cond-expr is true; or if no element matches, it executes the default-statement.

If expr is a list, and the cond-expr only depends on compile-time constants, apart from identifier, then the choice will be performed by the DML compiler, and code will only be generated for the selected case.
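A minimal sketch of a select over a literal list; the method name pick is hypothetical. The elements are tried in order, so the element 2 is the first one for which the condition holds.

```dml
method pick() {
    // x takes the values 1, 2, 3 in order; 2 is the first with x * x > 3
    select x in ([1, 2, 3]) where (x * x > 3) {
        log info: "first match: %d", x;
    } else {
        log info: "no element matched";
    }
}
```

Since both the list and the condition depend only on compile-time constants, only the code for the matching case is expected to be generated.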

3.14 Expressions

All ISO C operators are available in DML, except for certain limitations on the comma-operator, the sizeof operator, and type casts; see Section x. Operators have the same precedences and semantics as in C.

DML adds the following expressions:

3.14.1 The Undefined Constant

undefined

The constant undefined is an abstract compile-time only value, mostly used as a default for parameters that are intended to optionally be overridden. The undefined expression may only appear as a parameter value, as argument to the defined expr test (see below), and as the value for an untyped parameter when invoking a method with inline.

3.14.2 References

$identifier

To reference something in the DML object structure, the reference (an object-context identifier) must be prefixed by a $ character; see also Section 3.2. Following the identifier, subobjects may be selected using . and -> as in C. (However, most objects in the DML object structure are proper substructures selected with the . operator.) For example,

$this.size # a parameter
$dev.bank1 # a bank object
$bank1.r0.hard_reset # a method

The DML object structure is a compile-time construction; references to certain objects are not considered to be proper values, and result in compile errors if they occur as standalone expressions.

Some DML objects are proper values, while others are not:

data objects are proper values
register, field and attribute objects are proper values when they have allocated storage.
Objects of types method, interface, implement and device are technically considered run-time values, but the exact meaning is undefined so their use is discouraged.
A bank, group, connect, port or event object is not a value.
Inside an object array, the index variable (named $i by default) may evaluate to an unknown index if accessed from a location where the index is not statically known. For instance, in group g[i in 0..3] { if ($i == 0) { ... } }, the if statement is invoked once, statically, across all indices, meaning that the $i reference is an unknown index, and will yield a compile error.
A reference to a parameter object is a proper value only if the parameter value is a proper value: A parameter value can be a reference to an object, an object array, a list, the undefined expression, or an unknown index (discussed above), in which case the parameter is not allowed as a standalone expression.
When the object structure contains an array of objects, e.g. register r[4] { ... }, then a reference to the array itself (i.e. $r as opposed to $r[0]), is not considered a proper value.

If a DML object is not a proper value, then a reference to the object will give a compile error unless it appears in one of the following contexts:

As the left operand of the . operator
As the value of a parameter object
As a list element in a compile-time list
As the operand of the defined operator
A method object may be invoked in a call or inline statement
An object array may appear in an index expression array[index]
A register or field may appear as a typeof operand, even if the object does not have allocated storage.
An unknown index may be used as an index to an object array; in the resulting object reference, the corresponding index variable of the object array will have an unknown value.

3.14.3 New Expressions

new type

new type[count]

Allocates a chunk of memory large enough for a value of the specified type. If the second form is used, memory for count values will be allocated. The result is a pointer to the allocated memory. (The pointer is never null; if allocation should fail, the Simics application will be terminated.)

When the memory is no longer needed, it should be deallocated using a delete statement.
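Allocation and deallocation might be paired as in this sketch; the method fill_buffer and the buffer size are hypothetical.

```dml
method fill_buffer() {
    local int i;
    local uint32 *buf = new uint32[16];   // room for 16 values, never null
    for (i = 0; i < 16; ++i)
        buf[i] = i;
    log info: "last entry: %u", buf[15];
    delete buf;                            // a statement, not an expression
}
```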

3.14.4 Cast Expressions

cast(expr, type)

Type casts in DML must be written with the above explicit cast operator, for syntactical reasons.

Semantically, cast(expr, type) is equivalent to the C expression (type) expr.
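Two typical uses are sketched below, with the corresponding C spelling in the comments. The method narrow and its parameters are hypothetical.

```dml
method narrow(int64 wide, void *opaque) {
    local uint8 low = cast(wide, uint8);            // C: (uint8) wide
    local uint32 *words = cast(opaque, uint32 *);   // C: (uint32 *) opaque
    log info: "low byte 0x%x, first word 0x%x", low, words[0];
}
```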

3.14.5 Sizeoftype Expressions

sizeoftype type

The sizeof operator in DML can only be used on expressions, not on types, for syntactical reasons. To take the size of a datatype, the sizeoftype operator must be used, as in

local int size = sizeoftype io_memory_interface_t;

Semantically, sizeoftype type is equivalent to the C expression sizeof (type).
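The two operators can be contrasted in a short sketch. The type desc_t and the method sizes are hypothetical.

```dml
typedef struct { uint32 addr; uint16 len; } desc_t;

method sizes() {
    local desc_t d;
    local int by_type = sizeoftype desc_t;   // operand is a type
    local int by_expr = sizeof(d);           // operand is an expression
    assert by_type == by_expr;
}
```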

3.14.6 Defined Expressions

defined expr

This compile-time test evaluates to false if expr has the value undefined, and to true otherwise.
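A common pattern is an optional parameter that defaults to undefined and is tested before use. In the sketch below the parameter irq_name and the method describe are hypothetical, and it is assumed that the branch not selected by the compile-time test is not compiled.

```dml
parameter irq_name default undefined;

method describe() {
    if (defined $irq_name)
        log info: "interrupt target: %s", $irq_name;
    else
        log info: "no interrupt target configured";
}
```

A device that overrides irq_name with a string gets the first message; otherwise only the second log statement remains.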

3.14.7 List Expressions

[e1, ..., eN]

A list is a compile-time only value, and is an ordered sequence of zero or more compile-time constant values. Lists are in particular used in combination with foreach and select statements, and are sometimes provided by built-in parameters, such as the fields parameter of register objects.

A list expression may only appear in the following contexts:

As the list to iterate over in a foreach or select statement
As the value in a parameter or constant declaration
As a list element in another compile-time list
In an index expression, list[index]
As the operand of the defined operator
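Some of these contexts are shown together in the following sketch, which uses a hypothetical parameter speeds and method show_speeds.

```dml
parameter speeds = [10, 100, 1000];          // list as a parameter value

method show_speeds() {
    foreach s in ($speeds) {                 // list to iterate over
        log info: "supported speed: %d", s;
    }
    log info: "default speed: %d", $speeds[0];   // index expression
}
```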

3.14.8 Bit Slicing Expressions

expr[e1:e2]

expr[e1:e2, bitorder]

expr[e1]

expr[e1, bitorder]

If expr is of integer type, then the above bit-slicing syntax can be used in DML to simplify extracting or updating particular bit fields of the integer. Bit slice syntax can be used either as an expression producing a value or as the target of an assignment (an L-value), e.g., on the left-hand side of an = operator.

Both e1 and e2 must be integers. The syntax expr[e1] is a short-hand for expr[e1:e1] (but only evaluating e1 once).

The bitorder part is optional, and selects the bit numbering scheme (the "endianness") used to interpret the values of e1 and e2. If present, it must be one of the identifiers be or le, just as in the bitorder device-level declaration. If no bitorder is given in the expression, the global bit numbering (as defined by the bitorder declaration) is used.

The first bit index e1 always indicates the most significant bit of the field, regardless of the bit numbering scheme; cf. Section 3.9.3. If the default little-endian bit numbering is used, the least significant bit of the integer has index zero, and the most significant bit of the integer has index n - 1, where n is the width of the integer type.

If big-endian bit numbering is used, e.g., due to a bitorder be; declaration in the file, or using a specific local bit numbering as in expr[e1:e2, be], then the bit corresponding to the little-endian bit number n - 1 has index zero, and the least significant bit has the index n - 1, where n is the bit width of expr. Note that big-endian numbering is illegal if expr isn't a simple expression with a well-defined bit width. This means that only local variables, method parameters, device variables (registers, data etc), and explicit cast expressions are allowed. For little-endian numbering, any expressions are allowed, since there is never any doubt that bit 0 is the least significant bit.

If the bit-slicing expression results in a zero or negative sized range of bits, the behavior is undefined.
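The following sketch works through a few slices of a 32-bit value. It assumes a file with no bitorder declaration, so that le is the default; the method name slices is hypothetical.

```dml
method slices() {
    local uint32 v = 0x12345678;
    local uint32 top;
    top = v[31:24];          // le numbering: bits 31..24, i.e. 0x12
    v[7:0] = 0xff;           // slice as L-value: v becomes 0x123456ff
    v[16] = 1;               // single bit: v becomes 0x123556ff
    top = v[0:7, be];        // be numbering: the same top byte, 0x12
}
```

In both numbering schemes the first index names the most significant bit of the field, which is why the top byte is written [31:24] in le but [0:7] in be.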

3.14.9 Stringify Expressions

# expr

Translates the value of expr (which must be a compile-time constant) into a string constant. This is similar to the use of # in the C preprocessor, but is performed on the level of compile time values, not tokens. The result is often used with the + string operator.

3.14.10 String Concatenation Expressions

expr1 + expr2

If both expr1 and expr2 are compile-time string constants, the expression expr1 + expr2 concatenates the two strings at compile time. This is often used in combination with the # operator, or to break long lines for source code formatting purposes.
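The two operators might be combined as in the sketch below. The parameters channel and label and the method announce are hypothetical, and the result of stringifying the integer 3 is assumed to be the string "3".

```dml
parameter channel = 3;
parameter label = "channel " + #$channel;   // expected: "channel 3"

method announce() {
    log info: "%s is ready", $label;
}
```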
