
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/metrics/multiple_components.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_metrics_multiple_components.py:


Handling Software with Multiple Components
==========================================

Viewing applications as composites.

When working with very large and complex pieces of software, reporting
performance using a single number (e.g., total time-to-solution) obscures
details about the performance of different software components. Using such
totals during P3 analysis therefore prevents us from understanding how
different software components behave on different platforms.

Identifying which software components have poor P3 characteristics is necessary
to understand what action(s) we can take to improve the P3 characteristics of a
software package as a whole. Although accounting for multiple components can
make data collection and analysis slightly more complicated, the additional
insight it provides is very valuable.

.. tip::
    This approach can be readily applied to parallel software written with
    heterogeneous programming frameworks (e.g., CUDA, OpenCL, SYCL, Kokkos),
    where distinct kernels can be identified and profiled easily. For a
    real-life example of this approach in practice, see "`A
    Performance-Portable SYCL Implementation of CRK-HACC for Exascale `_".

Data Preparation
----------------

To keep things simple, let's imagine that our software package consists of just
two components, and that each component has two different implementations that
can both be run on two different machines:

.. list-table::
    :widths: 20 20 20 20
    :header-rows: 1

    * - component
      - implementation
      - machine
      - fom

    * - Component 1
      - Implementation 1
      - Cluster 1
      - 2.0
    * - Component 2
      - Implementation 1
      - Cluster 1
      - 5.0
    * - Component 1
      - Implementation 2
      - Cluster 1
      - 3.0
    * - Component 2
      - Implementation 2
      - Cluster 1
      - 4.0

    * - Component 1
      - Implementation 1
      - Cluster 2
      - 1.0
    * - Component 2
      - Implementation 1
      - Cluster 2
      - 2.5
    * - Component 1
      - Implementation 2
      - Cluster 2
      - 0.5
    * - Component 2
      - Implementation 2
      - Cluster 2
      - 3.0

Our first step is to project this data onto P3 definitions, treating the
functionality provided by each component as a separate problem to be solved:

.. GENERATED FROM PYTHON SOURCE LINES 91-101

.. code-block:: Python


    proj = p3analysis.data.projection(
        df,
        problem=["component"],
        application=["implementation"],
        platform=["machine"],
    )
    print(proj)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

           problem       application   platform  fom
    0  Component 1  Implementation 1  Cluster 1  2.0
    1  Component 2  Implementation 1  Cluster 1  5.0
    2  Component 1  Implementation 2  Cluster 1  3.0
    3  Component 2  Implementation 2  Cluster 1  4.0
    4  Component 1  Implementation 1  Cluster 2  1.0
    5  Component 2  Implementation 1  Cluster 2  2.5
    6  Component 1  Implementation 2  Cluster 2  0.5
    7  Component 2  Implementation 2  Cluster 2  3.0


.. GENERATED FROM PYTHON SOURCE LINES 118-127

.. note::
    See ":ref:`Understanding Data Projection `" for more information about
    projection.

Application Efficiency per Component
------------------------------------

Having projected the performance data onto P3 definitions, we can now compute
the application efficiency for each component:

.. GENERATED FROM PYTHON SOURCE LINES 127-131

.. code-block:: Python


    effs = p3analysis.metrics.application_efficiency(proj)
    print(effs)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

           problem   platform       application  fom   app eff
    0  Component 1  Cluster 1  Implementation 1  2.0  1.000000
    1  Component 2  Cluster 1  Implementation 1  5.0  0.800000
    2  Component 1  Cluster 1  Implementation 2  3.0  0.666667
    3  Component 2  Cluster 1  Implementation 2  4.0  1.000000
    4  Component 1  Cluster 2  Implementation 1  1.0  0.500000
    5  Component 2  Cluster 2  Implementation 1  2.5  1.000000
    6  Component 1  Cluster 2  Implementation 2  0.5  1.000000
    7  Component 2  Cluster 2  Implementation 2  3.0  0.833333


.. GENERATED FROM PYTHON SOURCE LINES 132-139

.. note::
    See ":ref:`Working with Application Efficiency `" for more information
    about application efficiency.

Plotting a graph for each platform separately is a good way to visualize and
compare the application efficiency of each component:

.. GENERATED FROM PYTHON SOURCE LINES 139-160

.. code-block:: Python


    cluster1 = effs[effs["platform"] == "Cluster 1"]
    pivot = cluster1.pivot(index="application", columns=["problem"])["app eff"]
    pivot.plot(
        kind="bar",
        xlabel="Component",
        ylabel="Application Efficiency",
        title="Cluster 1",
    )
    plt.savefig("cluster1_application_efficiency_bars.png")

    cluster2 = effs[effs["platform"] == "Cluster 2"]
    pivot = cluster2.pivot(index="application", columns=["problem"])["app eff"]
    pivot.plot(
        kind="bar",
        xlabel="Component",
        ylabel="Application Efficiency",
        title="Cluster 2",
    )
    plt.savefig("cluster2_application_efficiency_bars.png")


.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /examples/metrics/images/sphx_glr_multiple_components_001.png
         :alt: Cluster 1
         :srcset: /examples/metrics/images/sphx_glr_multiple_components_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /examples/metrics/images/sphx_glr_multiple_components_002.png
         :alt: Cluster 2
         :srcset: /examples/metrics/images/sphx_glr_multiple_components_002.png
         :class: sphx-glr-multi-img


.. GENERATED FROM PYTHON SOURCE LINES 161-174

On Cluster 1, Implementation 1 delivers the best performance for Component 1,
but Implementation 2 delivers the best performance for Component 2. On Cluster
2, that trend is reversed. Clearly, there is no single implementation that
delivers the best performance everywhere.

Overall Application Efficiency
------------------------------

Computing the application efficiency of the software package as a whole
requires a few more steps.

First, we need to compute the total time taken by each application on each
platform:

.. GENERATED FROM PYTHON SOURCE LINES 174-179

.. code-block:: Python


    package = proj.groupby(["platform", "application"], as_index=False)["fom"].sum()
    package["problem"] = "Package"
    print(package)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

        platform       application  fom  problem
    0  Cluster 1  Implementation 1  7.0  Package
    1  Cluster 1  Implementation 2  7.0  Package
    2  Cluster 2  Implementation 1  3.5  Package
    3  Cluster 2  Implementation 2  3.5  Package


.. GENERATED FROM PYTHON SOURCE LINES 180-181

Then, we can use this data to compute application efficiency, as below:

.. GENERATED FROM PYTHON SOURCE LINES 181-185

.. code-block:: Python


    effs = p3analysis.metrics.application_efficiency(package)
    print(effs)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       problem   platform       application  fom  app eff
    0  Package  Cluster 1  Implementation 1  7.0      1.0
    1  Package  Cluster 1  Implementation 2  7.0      1.0
    2  Package  Cluster 2  Implementation 1  3.5      1.0
    3  Package  Cluster 2  Implementation 2  3.5      1.0


.. GENERATED FROM PYTHON SOURCE LINES 186-208

These latest results suggest that Implementation 1 and Implementation 2 are
both achieving the best-known performance when running the package as a whole.
This isn't *strictly* incorrect, since the values of their combined
figure-of-merit *are* the same, but we know from our earlier per-component
analysis that it may be possible to achieve better performance.
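
The all-ones result follows from how application efficiency behaves in the
output above. As a minimal sketch, assuming that ``fom`` is a time (lower is
better) and that efficiency is the best observed ``fom`` for a given problem
and platform divided by each application's own ``fom``, the calculation can be
reproduced in plain pandas. The helper ``efficiency_sketch`` is hypothetical
and is not part of the P3 Analysis Library:

.. code-block:: Python

    import pandas as pd

    # Package-level totals copied from the table above (times, lower is better).
    package = pd.DataFrame(
        {
            "problem": ["Package"] * 4,
            "platform": ["Cluster 1", "Cluster 1", "Cluster 2", "Cluster 2"],
            "application": ["Implementation 1", "Implementation 2"] * 2,
            "fom": [7.0, 7.0, 3.5, 3.5],
        }
    )


    def efficiency_sketch(df):
        """Hypothetical helper: best observed fom divided by each row's fom."""
        # Best (smallest) fom among all applications for each problem/platform.
        best = df.groupby(["problem", "platform"])["fom"].transform("min")
        result = df.copy()
        result["app eff"] = best / df["fom"]
        return result


    sketch = efficiency_sketch(package)
    print(sketch)

Because both implementations have identical totals on each platform, each one
is also the best observed result, so every row in ``sketch`` has an efficiency
of 1.0. The only way to lower these values is to add a better result to the
dataset.
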
Specifically, our per-component analysis shows us that an application that
could pick and choose the best implementation of different components for
different platforms would achieve better overall performance.

.. important::
    Combining component implementations in this way is purely hypothetical, and
    there may be very good reasons (e.g., incompatible data structures) that an
    application is unable to use certain combinations. Although removing such
    invalid combinations would result in a tighter upper bound, it is much
    simpler to leave them in place. Including all combinations may even
    identify potential opportunities to combine approaches that initially
    appeared incompatible (e.g., by writing routines to convert between data
    structures).

We can fold that observation into our P3 analysis by creating an entry in our
dataset that represents the results from a hypothetical application:

.. GENERATED FROM PYTHON SOURCE LINES 208-215

.. code-block:: Python


    hypothetical_components = proj.groupby(["problem", "platform"], as_index=False)[
        "fom"
    ].min()
    hypothetical_components["application"] = "Hypothetical"
    print(hypothetical_components)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

           problem   platform  fom   application
    0  Component 1  Cluster 1  2.0  Hypothetical
    1  Component 1  Cluster 2  0.5  Hypothetical
    2  Component 2  Cluster 1  4.0  Hypothetical
    3  Component 2  Cluster 2  2.5  Hypothetical


.. GENERATED FROM PYTHON SOURCE LINES 216-227

.. code-block:: Python


    # Calculate the combined figure of merit for both components
    hypothetical_package = hypothetical_components.groupby(
        ["platform", "application"],
        as_index=False,
    )["fom"].sum()
    hypothetical_package["problem"] = "Package"

    # Append the hypothetical package data to our previous results
    package = pd.concat([package, hypothetical_package], ignore_index=True)
    print(package)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

        platform       application  fom  problem
    0  Cluster 1  Implementation 1  7.0  Package
    1  Cluster 1  Implementation 2  7.0  Package
    2  Cluster 2  Implementation 1  3.5  Package
    3  Cluster 2  Implementation 2  3.5  Package
    4  Cluster 1      Hypothetical  6.0  Package
    5  Cluster 2      Hypothetical  3.0  Package


.. GENERATED FROM PYTHON SOURCE LINES 228-231

As expected, our new hypothetical application achieves better performance by
mixing and matching different implementations. And if we now re-compute
application efficiency with this data included:

.. GENERATED FROM PYTHON SOURCE LINES 231-235

.. code-block:: Python


    effs = p3analysis.metrics.application_efficiency(package)
    print(effs)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

       problem   platform       application  fom   app eff
    0  Package  Cluster 1  Implementation 1  7.0  0.857143
    1  Package  Cluster 1  Implementation 2  7.0  0.857143
    2  Package  Cluster 2  Implementation 1  3.5  0.857143
    3  Package  Cluster 2  Implementation 2  3.5  0.857143
    4  Package  Cluster 1      Hypothetical  6.0  1.000000
    5  Package  Cluster 2      Hypothetical  3.0  1.000000


.. GENERATED FROM PYTHON SOURCE LINES 236-248

... we see that the application efficiency of Implementation 1 and
Implementation 2 has been reduced accordingly. Including hypothetical upper
bounds of performance in our dataset can therefore be a simple and effective
way to improve the accuracy of our P3 analysis, even if a true theoretical
upper bound (i.e., from a performance model) is unknown.

.. note::
    The two implementations still have the *same* efficiency, even after
    introducing the hypothetical implementation. Per-component analysis is
    still required to understand how each component contributes to the overall
    efficiency, and to identify which component(s) should be improved on which
    platform(s).

.. GENERATED FROM PYTHON SOURCE LINES 250-267

Further Analysis
----------------

Computing application efficiency is often simply the first step of a more
detailed P3 analysis.
The examples below show how we can use the visualization capabilities of the P3
Analysis Library to compare the efficiency of different applications running
across the same platform set, or to gain insight into how an application's
efficiency relates to the code it uses on each platform.

.. minigallery::
    :add-heading: Examples

    ../../examples/cascade/plot_simple_cascade.py
    ../../examples/navchart/plot_simple_navchart.py


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.244 seconds)


.. _sphx_glr_download_examples_metrics_multiple_components.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: multiple_components.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: multiple_components.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: multiple_components.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_