p3analysis.data package
Module contents
- p3analysis.data.projection(df, problem=['problem'], application=['application'], platform=['platform'])
Project data onto definitions of problem, application and platform.
The result of a projection is a DataFrame suitable for use with functionality provided by the p3analysis.metrics, p3analysis.plot and p3analysis.report modules.
- Parameters:
df (DataFrame) – A pandas DataFrame storing raw performance data.
problem (list, optional) –
A list of column names in df that define the projection onto "problem".
Values from the listed columns are concatenated to form a new "problem" column. If no column names are provided, a column named "problem" is assumed to already exist.
application (list, optional) –
A list of column names in df that define the projection onto "application".
Values from the listed columns are concatenated to form a new "application" column. If no column names are provided, a column named "application" is assumed to already exist.
platform (list, optional) –
A list of column names in df that define the projection onto "platform".
Values from the listed columns are concatenated to form a new "platform" column. If no column names are provided, a column named "platform" is assumed to already exist.
- Returns:
A new pandas DataFrame storing the projected data.
- Return type:
DataFrame
- Raises:
ValueError – If any of the column names provided in problem, application or platform are missing from df.
TypeError – If problem, application or platform are not lists of strings.
Examples
>>> import pandas as pd
>>> import p3analysis.data
>>> df = pd.DataFrame({'fom': [1.0, 2.0],
...                    'language': ['OpenMP', 'OpenMP'],
...                    'branch': ['master', 'optimized'],
...                    'architecture': ['CPU', 'CPU'],
...                    'compiler': ['gcc', 'icc'],
...                    'kernel': ['DGEMM', 'DGEMM'],
...                    'M': ['1024', '1024'],
...                    'N': ['1024', '1024'],
...                    'K': ['1024', '1024']})
>>> df = p3analysis.data.projection(df,
...                                 problem=['kernel', 'M', 'N', 'K'],
...                                 application=['language', 'branch'],
...                                 platform=['architecture', 'compiler'])
>>> df
   fom               problem       application  platform
0  1.0  DGEMM-1024-1024-1024     OpenMP-master   CPU-gcc
1  2.0  DGEMM-1024-1024-1024  OpenMP-optimized   CPU-icc
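To make the concatenation and the documented error conditions more concrete, the following is a minimal sketch in plain pandas. It is not the p3analysis implementation: the helper name project_sketch and its sep parameter are hypothetical. The hyphen separator and the removal of the source columns are assumptions inferred from the example output above.

```python
import pandas as pd


def project_sketch(df, problem=("problem",), application=("application",),
                   platform=("platform",), sep="-"):
    """Hypothetical stand-in illustrating the projection described above."""
    groups = {
        "problem": list(problem),
        "application": list(application),
        "platform": list(platform),
    }

    # Mirror the documented error conditions (assumed checks, not library code).
    for name, cols in groups.items():
        if not all(isinstance(c, str) for c in cols):
            raise TypeError(f"{name} must be a list of strings")
        missing = [c for c in cols if c not in df.columns]
        if missing:
            raise ValueError(f"columns missing from df: {missing}")

    result = df.copy()
    # Build each new column by joining the source values row by row.
    for name, cols in groups.items():
        result[name] = df[cols].astype(str).agg(sep.join, axis=1)

    # Drop source columns that were folded into a projected column
    # (assumed from the example output, which keeps only 'fom' and the
    # three projected columns).
    used = {c for cols in groups.values() for c in cols}
    return result.drop(columns=sorted(used - set(groups)))


raw = pd.DataFrame({
    "fom": [1.0, 2.0],
    "language": ["OpenMP", "OpenMP"],
    "branch": ["master", "optimized"],
    "architecture": ["CPU", "CPU"],
    "compiler": ["gcc", "icc"],
    "kernel": ["DGEMM", "DGEMM"],
    "M": ["1024", "1024"],
    "N": ["1024", "1024"],
    "K": ["1024", "1024"],
})

projected = project_sketch(
    raw,
    problem=["kernel", "M", "N", "K"],
    application=["language", "branch"],
    platform=["architecture", "compiler"],
)
print(projected)
```

Under these assumptions the sketch yields the same four columns as the example above. Passing a column name that is not in the DataFrame raises ValueError, and passing a non-string entry raises TypeError.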