MADbench2 is a tool for testing the overall integrated performance of the I/O, communication and calculation subsystems of massively parallel architectures under the stresses of a real scientific application.
MADbench2 is based on the MADspec code, which calculates the maximum likelihood angular power spectrum of the Cosmic Microwave Background radiation from a noisy pixelized map of the sky and its pixel-pixel noise correlation matrix.
MADbench2 retains the full computational complexity of its parent scientific application code, but uses self-generated pseudo-data so that the myriad computationally irrelevant details associated with handling real CMB datasets can be bypassed.
MADbench2 can be run in two modes:
(i) regular mode, in which the full code is run.
(ii) IO mode, in which all calculation/communication is replaced with busy-work.
In addition, MADbench2 can be run as single-gang or multi-gang. In single-gang runs, all the matrix operations are carried out distributed over all of the processors. In multi-gang runs, the matrices are built, summed and inverted over all the processors (S & D), but then redistributed over subsets of processors (gangs) for their subsequent manipulations (W & C). This gang-parallelism allows the data to remain dense on the processors for the dominant matrix-matrix multiplication (W) phase, even with very large numbers of processors.
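The gang idea can be illustrated with a small sketch. It assumes that the number of processors divides evenly by the number of gangs and that gangs are contiguous blocks of ranks. Neither assumption is stated above, and gang_of_rank, no_pe and no_gang are hypothetical names, not taken from the MADbench2 source.

```c
/* Hypothetical helper (not from the MADbench2 source): map a processor
   rank to a gang index, assuming no_pe is an exact multiple of no_gang
   and that each gang is a contiguous block of ranks.
   Returns -1 if the arguments are inconsistent. */
int gang_of_rank(int rank, int no_pe, int no_gang)
{
    if (no_pe <= 0 || no_gang <= 0 || no_pe % no_gang != 0)
        return -1;
    if (rank < 0 || rank >= no_pe)
        return -1;

    int gang_size = no_pe / no_gang;   /* processors per gang */
    return rank / gang_size;           /* single-gang: always 0 */
}
```

In an MPI code, an index like this could serve as the color argument of MPI_Comm_split to build one communicator per gang.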
To run in regular mode, MADbench2 needs to be linked to the ScaLAPACK & LAPACK libraries and their dependencies (BLAS, PBLAS, BLACS). The MADbench2.h file contains system-specific definitions and declarations; this file should be augmented as needed and the code compiled with -D SYSTEM.
To run in IO mode, MADbench2 should be compiled with -D IO (in addition to -D SYSTEM) whereupon all of the library calls are redefined to busy-work so that none of the libraries are needed.
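One way such a compile-time switch could look is sketched below. This is only an illustration of the technique: MULTIPLY, busy_work and naive_multiply are hypothetical names, and naive_multiply merely stands in for a real library routine so that the sketch compiles without ScaLAPACK.

```c
#include <stddef.h>

/* Hypothetical busy-work: a loop the compiler cannot optimise away. */
double busy_work(size_t iterations)
{
    volatile double acc = 0.0;
    for (size_t i = 0; i < iterations; i++)
        acc += 1.0;
    return acc;
}

/* Stand-in for a library call: c = a * b for n x n row-major matrices. */
void naive_multiply(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

#ifdef IO
/* IO mode: the call site is unchanged, but no library is needed. */
#define MULTIPLY(n, a, b, c) \
    ((void)busy_work((size_t)(n) * (size_t)(n) * (size_t)(n)))
#else
/* Regular mode: forward to the (stand-in) library routine. */
#define MULTIPLY(n, a, b, c) naive_multiply((n), (a), (b), (c))
#endif
```

Compiled with -D IO, every MULTIPLY call becomes busy-work and leaves its output untouched; without it, the same call sites do the actual calculation.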
Running MADbench2 requires:
In addition, MADbench2 requires 5 x NO_PIX^2 x 8 bytes of memory per gang.
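As a quick check before launching a run, the memory requirement can be computed directly. The sketch below reads the requirement as 5 x NO_PIX squared x 8 bytes; bytes_per_gang is a hypothetical helper name, not part of MADbench2.

```c
#include <stdint.h>

/* Hypothetical helper: memory needed per gang, in bytes, reading the
   stated requirement as 5 x NO_PIX^2 x 8. A 64-bit type is used because
   the result exceeds 32 bits for quite modest values of NO_PIX. */
uint64_t bytes_per_gang(uint64_t no_pix)
{
    return 5u * no_pix * no_pix * 8u;
}
```

For example, NO_PIX = 5000 gives 5 x 25,000,000 x 8 = 1,000,000,000 bytes per gang.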
All mallocs and IO calls are explicitly checked for success and MADbench2 aborts if any one fails.
In case of failure, the processor ID and attempted action are reported before exiting.
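A checked allocation of this kind might look like the following sketch. The name checked_malloc, its parameters and the message format are illustrative assumptions, and exit() stands in for whatever abort mechanism the real code uses (an MPI code would more likely call MPI_Abort).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate memory or report the processor ID and
   the attempted action, then terminate. Note that malloc(0) may
   legitimately return NULL, so callers should request at least one byte. */
void *checked_malloc(size_t bytes, int pe_id, const char *action)
{
    void *p = malloc(bytes);
    if (p == NULL) {
        fprintf(stderr, "PE %d: failed to %s (%zu bytes)\n",
                pe_id, action, bytes);
        exit(EXIT_FAILURE);   /* stand-in for a parallel abort */
    }
    return p;
}
```

A call such as checked_malloc(n * sizeof(double), my_pe, "allocate the signal matrix") then needs no further error handling at the call site.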
MADbench2 reports the mean, minimum and maximum times spent in calculation/communication, busy-work, reading and writing in each function.
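The reported statistics can be sketched as below, assuming they are taken over one time per processor for a given function and activity. The type time_stats and the function summarize_times are hypothetical names. A parallel code would typically obtain the same three numbers with MPI_Reduce using MPI_SUM, MPI_MIN and MPI_MAX rather than gathering all the times in one array.

```c
#include <stddef.h>

/* Hypothetical summary of per-processor times (in seconds). */
typedef struct {
    double mean;
    double min;
    double max;
} time_stats;

/* Compute mean, minimum and maximum of n times; all zeros if n == 0. */
time_stats summarize_times(const double *t, size_t n)
{
    time_stats s = {0.0, 0.0, 0.0};
    if (n == 0)
        return s;

    double sum = 0.0;
    s.min = s.max = t[0];
    for (size_t i = 0; i < n; i++) {
        sum += t[i];
        if (t[i] < s.min) s.min = t[i];
        if (t[i] > s.max) s.max = t[i];
    }
    s.mean = sum / (double)n;
    return s;
}
```

A large gap between the minimum and maximum for reading or writing is a quick indicator of I/O imbalance across processors.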
In addition, the first element of the MADspec solution vector is reported to check that the code performed correctly. In regular mode, NO_PIX = 5000 & NO_BIN = 4 should return dC = -9.22431e-01; IO mode always returns dC = 0.00000.
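When automating runs, the reported dC can be compared against the reference value with a small tolerance rather than by exact equality, since the reference is quoted to six significant figures. The helper below is a hypothetical sketch, and the tolerance of 5e-7 is an assumption matching that printed precision.

```c
/* Absolute difference without needing the math library. */
static double abs_diff(double a, double b)
{
    return a > b ? a - b : b - a;
}

/* Hypothetical check: returns 1 if dc agrees with the expected value to
   within an absolute tolerance, 0 otherwise. An absolute tolerance is
   used so the check also works for the IO-mode value of zero. */
int dc_matches(double dc, double expected, double tol)
{
    return abs_diff(dc, expected) <= tol;
}
```

For the reference configuration, the call would be dc_matches(dC, -9.22431e-01, 5e-7); for an IO-mode run, the expected value is 0.0.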