system-perfmeter

Synopsis

system-perfmeter [sample_time] [mode] [file] [-deactivate] [-realtime] [-cpu-idle] [-cpu-exec-mode] [-cpu-host-ticks] [-cpu-host-ticks-raw] [-cell-host-ticks] [-cell-host-ticks-raw] [-summary] [-summary-always] [-module-profile] [-window] [-top] [-disabled] [-mips] [-emips] [-multicore-accelerator] [-mem] [-shared] [-io] [-mips-win] [-no-log] [-only-current-cell] [-include-stop-time]

Description

Activates performance measurement on one or more systems running within Simics. The resulting printouts give an idea of how fast Simics executes and can help identify opportunities for optimization.

The command periodically outputs various performance-related counters for a period called a sample. Counters measure activity during the sample, unless otherwise noted. The counters are also accumulated and can be presented in a summary. Each time the command is given, all accumulated counters are reset to zero.

The sample output contains a number of columns. Total-vt (virtual time) and Total-rt (real time) are the accumulated number of seconds that have been executed on the system since the command was issued. Similarly, Sample-vt and Sample-rt are the sample times in seconds. Slowdown measures the ratio between the sample's virtual time and real time. CPU indicates how much host CPU was used by Simics during the sample, where 100% equals one CPU running during the whole sample (as Simics is multi-threaded, this number can be much larger than 100). Idle represents how much the CPUs in the system were idle during the sample. Instructions that do not compute anything, like the x86 halt instruction, and non-computing loops detected by Simics (see hypersim-status) are defined as idle instructions. A large idle percentage means that Simics can fast-forward time more, which gives better performance.
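As a rough illustration of how the Slowdown and CPU columns relate to the time columns, the following Python sketch computes both from one sample. The function names are hypothetical, and the direction of the slowdown ratio (real time divided by virtual time, so that values above 1 mean slower than real time) is an assumption rather than something stated on this page.

```python
def slowdown(sample_rt: float, sample_vt: float) -> float:
    """Real time divided by virtual time for one sample (assumed direction)."""
    if sample_vt <= 0:
        raise ValueError("sample virtual time must be positive")
    return sample_rt / sample_vt


def host_cpu_percent(host_cpu_seconds: float, sample_rt: float) -> float:
    """Host CPU usage, where 100% is one host CPU busy for the whole sample."""
    if sample_rt <= 0:
        raise ValueError("sample real time must be positive")
    return 100.0 * host_cpu_seconds / sample_rt


# One virtual second took two real seconds, using three host CPU seconds.
print(slowdown(sample_rt=2.0, sample_vt=1.0))                  # 2.0
print(host_cpu_percent(host_cpu_seconds=3.0, sample_rt=2.0))   # 150.0
```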

Virtual time is measured on the current cycle object (selectable with pselect) when the command is given.

To disable the system perfmeter, use -deactivate.

Output Presentation

How frequently the measurements are presented is controlled with the sample_time parameter, which represents the time that should elapse for each sample; the default is one second. By default, sampling is based on virtual time, but -realtime switches the sampling to be based on real (host) time.

The system-perfmeter subtracts any time when the simulation is not running from the measured wall-clock time. This allows the simulation to be temporarily stopped in the middle of the execution without corrupting the measurement. The -include-stop-time flag prevents this subtraction, so that the actual real time is shown.
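The effect of the stop-time subtraction can be sketched in Python as follows. This is only an illustration of the arithmetic described above; the function and parameter names are hypothetical and do not correspond to any Simics API.

```python
def measured_real_time(wall_clock_s: float, stopped_s: float,
                       include_stop_time: bool = False) -> float:
    """Real time reported for a measurement.

    By default the time spent with the simulation stopped is subtracted;
    include_stop_time=True mimics the -include-stop-time flag.
    """
    if include_stop_time:
        return wall_clock_s
    return wall_clock_s - stopped_s


# 10 s of wall-clock time, of which the simulation was stopped for 4 s.
print(measured_real_time(10.0, 4.0))                          # 6.0
print(measured_real_time(10.0, 4.0, include_stop_time=True))  # 10.0
```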

The -summary flag causes a summary report to be printed each time the simulation is stopped. It includes the same counters that you get for each sample, but the numbers are calculated over the whole run since the command was issued, not just one sample. The summary also includes performance hints as well as system info about the target and host. With -summary, a summary is only printed if at least one sample has been printed since the last time Simics stopped. The -summary-always flag instead prints the summary information each time Simics stops. The system-perfmeter-summary command also prints the summary report.

With -only-current-cell, metrics are only collected for the cell of the currently selected frontend object at the time when the command is run (selected with pselect). Global metrics such as mem will still include the entire simulation. If -only-current-cell is not specified, then metrics are based on all cells.

Output Redirection

Normally, a text line with results is written for each measured sample. The -window flag opens a separate text window where the continued output is written instead of being printed in the Simics console. If no output is wanted at all, -no-log can be used (useful when running with only -top or -mips-win).

The console printouts can be sent to a file specified by the file argument. The default is to output the result to the Simics console. If -window is used together with a specified file, the output is written both to the file and to the separate window.

The -top flag opens a separate text window displaying some statistics on the execution, similar to the Linux top utility.

Similarly, -mips-win opens a window displaying only the current MIPS value, which can be useful for demonstrations.

Convenience Argument

The optional mode argument can take one of "minimum", "normal" and "detailed" as its value. Each mode selects a number of the flags described below. When a mode is used, additional flags can still be specified separately.
- The "minimum" mode includes -emips, -realtime and -summary.
- The "normal" mode includes -cpu-exec-mode, -cpu-idle, -emips, -realtime and -summary.
- The "detailed" mode includes -cpu-exec-mode, -cpu-host-ticks, -cpu-idle, -emips, -io, -mem, -module-profile, -realtime and -summary.
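The mode list above can be expressed as a simple lookup. The Python sketch below is only a model of the documented mapping, with a hypothetical helper that combines a mode with separately specified flags; it is not how Simics itself implements the argument.

```python
# Flags selected by each mode, as listed above.
MODE_FLAGS = {
    "minimum": {"-emips", "-realtime", "-summary"},
    "normal": {"-cpu-exec-mode", "-cpu-idle", "-emips", "-realtime",
               "-summary"},
    "detailed": {"-cpu-exec-mode", "-cpu-host-ticks", "-cpu-idle", "-emips",
                 "-io", "-mem", "-module-profile", "-realtime", "-summary"},
}


def effective_flags(mode: str, extra_flags=()) -> list:
    """Flags in effect for a mode plus any flags given separately."""
    return sorted(MODE_FLAGS[mode] | set(extra_flags))


print(effective_flags("minimum", ["-mem"]))
# ['-emips', '-mem', '-realtime', '-summary']
```

In this model each mode is a superset of the previous one, so "detailed" adds counters to "normal" without removing any.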

Counter Selection

All of the flags below add various counters to the sample. The per-CPU instruction mode counters and the host tick counters are grouped using brackets. An explanation of the label of each column in the brackets is printed when profiling is turned on and when the summary is printed.

Instructions can be simulated in four different simulator modes: idle, interpreter, JIT, or VMP. For each processor, the percentage of instructions run in each mode, out of all instructions run on the processor during the sample, can be shown. -cpu-exec-mode shows numbers for processor instructions in JIT and VMP mode. -cpu-idle shows numbers for idle mode instructions. Interpreter mode is not shown, except in the summary. Columns are grouped per mode, and modes are sorted idle, JIT, VMP from left to right. If no instructions at all were executed during the sample, the processor is considered disabled and DIS is shown. Note that the absolute number of instructions may vary per processor (due to CPI, frequency, idle). Also note that clocks have no instructions and are not shown, but are included in the number of processors in the summary.
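The per-processor mode percentages can be illustrated with a short Python sketch. The input format (a dictionary of instruction counts per mode) and the function name are hypothetical; only the percentage rule and the DIS marker come from the description above.

```python
def exec_mode_percentages(counts: dict):
    """Share of a processor's instructions run in each simulator mode.

    counts maps a mode name (idle, interpreter, JIT, VMP) to the number of
    instructions run in that mode during the sample. A processor that ran
    no instructions at all is reported as "DIS".
    """
    total = sum(counts.values())
    if total == 0:
        return "DIS"
    return {mode: 100.0 * n / total for mode, n in counts.items()}


sample = {"idle": 50, "interpreter": 10, "JIT": 30, "VMP": 10}
print(exec_mode_percentages(sample))
print(exec_mode_percentages({"idle": 0, "interpreter": 0, "JIT": 0, "VMP": 0}))
```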

Another group of values (one value per processor/cell, with the group placed to the far right) is added by -cpu-host-ticks. This shows how much real time each processor/cell takes to simulate. This can either be a percentage of the total host time when processors simulate, or an absolute value, counted in ticks, if using -cpu-host-ticks-raw. Execution outside a cell is excluded and such ticks are ignored. Execution inside a cell, but not executing a processor, is reported in the "Outside Processors" column. A tick is a time unit defined by the host OS; on Linux it is usually 10 ms.
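The relationship between raw ticks and the percentage view can be sketched in Python. The names are hypothetical, and the sketch assumes that the percentage is taken over the sum of all reported columns, including "Outside Processors"; this page does not spell out the exact denominator.

```python
def host_tick_percentages(ticks: dict) -> dict:
    """Convert raw host ticks per column into percentages of the total.

    Assumption: the total is the sum of all reported columns.
    """
    total = sum(ticks.values())
    if total == 0:
        return {name: 0.0 for name in ticks}
    return {name: 100.0 * t / total for name, t in ticks.items()}


def ticks_to_seconds(ticks: int, tick_ms: float = 10.0) -> float:
    """Host time for a tick count; 10 ms per tick is the usual Linux value."""
    return ticks * tick_ms / 1000.0


raw = {"cpu0": 30, "cpu1": 10, "Outside Processors": 10}
print(host_tick_percentages(raw))
print(ticks_to_seconds(250))  # 2.5
```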

When running a multi-cell configuration with many processors, -cell-host-ticks or -cell-host-ticks-raw can be used similarly to the -cpu-host-ticks* switches. This provides a narrower list showing how much host processor time is needed to simulate each cell. Execution that falls outside any cell is placed in an "Outside cell" ("oc") column.

The -mips flag appends MIPS values indicating how many million instructions per real second Simics has executed. The MIPS number printed is the number of instructions executed, including idle instructions. To see the MIPS value without the idle instructions (where only the instructions that are really executed in Simics are counted), use -emips.
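The difference between the two values can be shown with a small worked example in Python. The function names and the input numbers are made up for illustration.

```python
def mips(instructions: int, real_seconds: float) -> float:
    """Million instructions per real second, idle instructions included."""
    return instructions / real_seconds / 1e6


def emips(instructions: int, idle_instructions: int,
          real_seconds: float) -> float:
    """Million instructions per real second, idle instructions excluded."""
    return (instructions - idle_instructions) / real_seconds / 1e6


# 500 million instructions in 2 real seconds, 300 million of them idle.
print(mips(500_000_000, 2.0))                 # 250.0
print(emips(500_000_000, 300_000_000, 2.0))   # 100.0
```

A large gap between the two numbers indicates that much of the reported speed comes from fast-forwarding idle time rather than from executing instructions.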

The -multicore-accelerator flag tracks and prints the percentage of execution during which Multicore Accelerator is both enabled and actually used. Even when Multicore Accelerator is enabled, it may not actually be used, since a mechanism monitors the simulation and falls back to classic non-threaded execution within each cell if additional threading would bring no benefit. See the Accelerator User's Guide for more information on Multicore Accelerator.

With -io, the number of instructions per I/O operation is calculated and presented in the output. An I/O operation is any memory access that does not terminate in a Simics ram or rom object, and thus includes memory-mapped I/O.
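The -io metric is a plain ratio, as the following Python sketch shows. The function name is hypothetical, and returning None when no I/O operation occurred is a choice made for this example, not documented Simics behavior.

```python
def instructions_per_io(instructions: int, io_operations: int):
    """Average number of instructions executed per I/O operation.

    Returns None when the sample contained no I/O operations
    (an assumption made for this sketch).
    """
    if io_operations == 0:
        return None
    return instructions / io_operations


print(instructions_per_io(1_000_000, 250))  # 4000.0
```

A low value means the workload is I/O intensive, with many accesses leaving RAM and ROM.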

In some configurations, processors might be disabled at start and started later by software. To see how many processors are disabled at the end of each sample, use -disabled. The Disabled column shows how many CPUs are not currently activated and what percentage of the total system they represent.

The -mem flag shows the total amount of memory consumed by all instances of the image class (RAM, disk, etc.) at the end of the sample. It is measured as a percentage of the memory-limit. If this number goes down compared to the previous sample, it means that the memory-limit has been reached and Simics has swapped out dirty pages to disk.
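How the -mem value could be read from sample to sample is sketched below in Python. Both helpers are hypothetical; they only restate the percentage rule and the "value went down" interpretation given above.

```python
def mem_percent(image_bytes: int, memory_limit_bytes: int) -> float:
    """Image memory in use as a percentage of the memory-limit."""
    return 100.0 * image_bytes / memory_limit_bytes


def limit_was_reached(previous_percent: float, current_percent: float) -> bool:
    """A drop between samples indicates dirty pages were swapped to disk."""
    return current_percent < previous_percent


MIB = 1024 * 1024
print(mem_percent(512 * MIB, 2048 * MIB))  # 25.0
print(limit_was_reached(98.0, 71.0))       # True
```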

Simics can share identical pages across multiple simulated targets, if this feature is enabled. If the targets, for instance, run the same OS, Simics can keep one copy of a page instead of multiple copies, which reduces host memory consumption. To see how much memory is currently saved at the end of the sample, use -shared. Note that this figure only shows how much "image" memory is saved. The page sharing mechanism can also reduce internal state, but this memory reduction is not accounted for.

Specifying -module-profile enables profiling of the simulator and prints the percentage of real time spent in each module. This is only printed in the summary.

Provided By

system-perfmeter

See Also

pselect, system-perfmeter-summary