Information Categories of the Benchmark Characterization
Methodology and Objectives.
| Information Category |
Questions to be Answered and the Target Audience |
| Application description |
A basic understanding of the problem being solved by a benchmark is
important for all user classes. For many users it is important to understand
how representative of a specific discipline a benchmark is. The latter
also helps in selecting the benchmarks and in evaluating the quality of
the results. |
| Program description |
Describes programming languages, size, system requirements (CPU, memory,
I/O), number of subroutines, and software libraries. The problem is described
from a software engineering perspective. |
| Application component structure |
Describes the coarse-grain structure of the application and opportunities
for execution across parallel and/or heterogeneous (globally) distributed
systems. |
| Algorithmic structure |
Algorithmic decomposition is important for general understanding, but this
category is aimed specifically at algorithm research. It includes measured
improvements due to algorithmic changes. |
| Measured performance |
This addresses basic performance questions of all audiences. It indicates
basic scalability, suitability of a code for specific machines, effectiveness
of exploiting parallel processors, and sensitivity to changes in input
parameters. It also states the baseline for improvements of algorithms,
compilers, and architectures. |
| Analysis with respect to programming models |
SPEChpc benchmarks include a message-passing parallel and a shared-memory
(directive-parallel) code variant. Message-passing profiles answer questions
about:
- suitability of a distributed-memory architecture, its interconnections,
  and message libraries,
- scalability beyond the measured range,
- opportunities for communication optimizations,
- and heterogeneity of the application and coupling of the components.
Shared-memory analyses indicate:
- sources of performance loss,
- data-sharing patterns,
- and opportunities for shared data optimizations.
|
| I/O characterization |
Leads to a basic understanding of the application's I/O component.
Shows scalability with respect to I/O and sensitivity of the application
to the machine's I/O subsystems (e.g., parallel disk organizations
and partitioning schemes). |
| Cache analysis |
Shows the application's locality properties. Gives detailed insight
into the data-sharing behavior of the shared-memory application version
and the data reuse of individual nodes in the message-passing version.
The analysis follows the comprehensive cache characterization model introduced
in the technical report "A methodology and a tool for cache characterization
of loop-parallel programs" (HPCLAB, 98). |
| Instruction analysis |
Shows the number and type of instructions executed and the instruction
profile (e.g., loads vs. stores). Indicates detailed timing information
(e.g., pipeline stalls). This information is the basis for the detailed
performance analysis of individual code sections. |
| Program analysis |
This is a large category of primary interest to compiler researchers
and a basis for manual code improvements. It includes call-graph information,
data use and access patterns, the control-flow structure, and results of
compiler analyses (e.g., applied and failed optimizations and program
statistics). |
| Simulation analysis |
This is an open-ended category. Simulations can give insight into almost
all parameters of an application and its potential execution behavior on
a new or idealized machine. |
| Advanced model analysis |
Various advanced performance analysis and prediction models have been
proposed in the literature. They can give insight into a code's complex
behavior, sources of performance loss, and upper performance limits. |
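
As an illustration of the "Measured performance" category, basic scalability and the effectiveness of exploiting parallel processors are commonly summarized as speedup and parallel efficiency relative to a baseline run. The following is a minimal Python sketch under that assumption; the function name `scalability_table` and the timing values are hypothetical and are not SPEChpc measurements.

```python
def scalability_table(timings):
    """Summarize scalability from wall-clock timings.

    timings: dict mapping processor count -> elapsed seconds (hypothetical data).
    Returns a list of (processors, speedup, efficiency) tuples, where both
    metrics are relative to the run with the smallest processor count.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        # Speedup: how much faster this run is than the baseline run.
        speedup = base_time / timings[procs]
        # Efficiency: speedup divided by the increase in processor count.
        efficiency = speedup * base_procs / procs
        rows.append((procs, speedup, efficiency))
    return rows


# Illustrative timings only; replace with measured values.
example_timings = {1: 100.0, 2: 50.0, 4: 40.0}
for procs, speedup, efficiency in scalability_table(example_timings):
    print(f"{procs:>3} procs  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

An efficiency that drops well below 1.0 as processors are added is the kind of signal that the programming-model, I/O, and cache categories are then meant to explain.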