p3analysis.data package

Module contents

p3analysis.data.projection(df, problem=['problem'], application=['application'], platform=['platform'])

Project data onto definitions of problem, application and platform.

The result of a projection is a DataFrame suitable for use with the p3analysis.metrics, p3analysis.plot and p3analysis.report modules.

Parameters:
  • df (DataFrame) – A pandas DataFrame storing raw performance data.

  • problem (list, optional) – A list of column names in df whose values are concatenated to form the new “problem” column. If omitted, a column named “problem” is assumed to already exist in df.

  • application (list, optional) – A list of column names in df whose values are concatenated to form the new “application” column. If omitted, a column named “application” is assumed to already exist in df.

  • platform (list, optional) – A list of column names in df whose values are concatenated to form the new “platform” column. If omitted, a column named “platform” is assumed to already exist in df.

Returns:

A new pandas DataFrame storing the projected data.

Return type:

DataFrame

Raises:
  • ValueError – If any of the column names provided in problem, application or platform are missing from df.

  • TypeError – If problem, application or platform is not a list of strings.
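The two error conditions listed above can also be checked up front, before any projection is attempted. The following is a minimal sketch in plain pandas; the helper name check_projection_args and its error messages are hypothetical, and this is not the validation code used inside p3analysis.

```python
import pandas as pd


def check_projection_args(df, problem, application, platform):
    """Hypothetical pre-check mirroring the documented error conditions."""
    for name, cols in (("problem", problem),
                       ("application", application),
                       ("platform", platform)):
        # TypeError: the argument must be a list containing only strings.
        if not isinstance(cols, list) or not all(isinstance(c, str) for c in cols):
            raise TypeError(f"{name} must be a list of strings")
        # ValueError: every named column must be present in the DataFrame.
        missing = [c for c in cols if c not in df.columns]
        if missing:
            raise ValueError(f"{name} refers to missing columns: {missing}")


frame = pd.DataFrame({"kernel": ["DGEMM"],
                      "language": ["OpenMP"],
                      "architecture": ["CPU"]})

# Valid arguments pass silently.
check_projection_args(frame, ["kernel"], ["language"], ["architecture"])
```

Passing a bare string such as 'kernel' instead of ['kernel'] would trigger the TypeError branch of this sketch, and naming a column that frame does not contain would trigger the ValueError branch.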

Examples

>>> import pandas as pd
>>> import p3analysis.data
>>> df = pd.DataFrame({'fom': [1.0, 2.0],
...                    'language': ['OpenMP', 'OpenMP'],
...                    'branch': ['master', 'optimized'],
...                    'architecture': ['CPU', 'CPU'],
...                    'compiler': ['gcc', 'icc'],
...                    'kernel': ['DGEMM', 'DGEMM'],
...                    'M': ['1024', '1024'],
...                    'N': ['1024', '1024'],
...                    'K': ['1024', '1024']})
>>> df = p3analysis.data.projection(df,
...                                 problem=['kernel', 'M', 'N', 'K'],
...                                 application=['language', 'branch'],
...                                 platform=['architecture', 'compiler'])
>>> df
   fom               problem       application platform
0  1.0  DGEMM-1024-1024-1024     OpenMP-master  CPU-gcc
1  2.0  DGEMM-1024-1024-1024  OpenMP-optimized  CPU-icc
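
To make the concatenation concrete, the following is a minimal sketch of a comparable projection in plain pandas. It is not the p3analysis implementation: the helper name project_columns is hypothetical, and both the “-” separator and the removal of the source columns are assumptions inferred from the example output above.

```python
import pandas as pd


def project_columns(df, mapping, sep="-"):
    """Hypothetical helper: join the columns listed for each target name."""
    out = df.copy()
    to_drop = set()
    for target, cols in mapping.items():
        # Convert each source column to strings and join them row by row.
        out[target] = df[cols].astype(str).apply(sep.join, axis=1)
        # Source columns are dropped afterwards, unless they are targets themselves.
        to_drop.update(c for c in cols if c not in mapping)
    return out.drop(columns=sorted(to_drop))


raw = pd.DataFrame({"fom": [1.0, 2.0],
                    "language": ["OpenMP", "OpenMP"],
                    "branch": ["master", "optimized"],
                    "kernel": ["DGEMM", "DGEMM"],
                    "M": ["1024", "1024"]})

projected = project_columns(raw, {"problem": ["kernel", "M"],
                                  "application": ["language", "branch"]})
print(projected)
```

In this sketch the projected frame keeps the untouched fom column and gains one string column per entry in the mapping, while raw itself is left unmodified.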