Simulating PetaFLOPS Supercomputers
PetaFLOPS-class computers are being developed currently, to be deployed
in 2007-2008, and even larger computers are being planned (such as
those developed under the DARPA HPCS program, DOE LCF, and NSF's Petascale
initiative). The BigSim project aims to develop tools that
allow one to develop, debug, tune, scale, and predict the performance of
applications before such machines are available, so that the applications can be
ready when the machine first becomes operational. It also makes
"offline" experimentation with parallel performance tuning strategies easier,
without using the full parallel computer. To the machine
architects, BigSim provides a method for modeling the impact of
architectural choices (including the communication network) on actual,
full-scale applications. The BigSim system consists of an emulator and a
simulator.
BigSim Emulator
The BigSim Emulator can take any Charm++ or AMPI program (AMPI is an MPI
implementation) and "run" it on a specified number of
processors (P) using the processors (Q) available to the emulator. For
example, one can run an MPI program meant for P=100,000 processors using
only Q=2,000 available processors. If the memory requirements of
the application exceed the memory available on the Q processors, the emulator employs
a (recently developed) built-in out-of-core execution scheme that uses
the file system to store a target processor's data while that processor is not being executed.
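As a rough illustration of running P target processors on Q available processors, the sketch below assigns target processors to emulating processors in contiguous blocks. The block mapping and the name `block_map` are assumptions made for this example only; they are not taken from the BigSim source, which may place target processors differently.

```python
def block_map(num_target, num_physical):
    """Assign target processors 0..num_target-1 to physical processors in blocks.

    Returns a list whose entry q is the range of target-processor ids
    hosted by physical processor q. Hypothetical helper, for illustration.
    """
    if num_target <= 0 or num_physical <= 0:
        raise ValueError("processor counts must be positive")
    base, extra = divmod(num_target, num_physical)
    hosts = []
    start = 0
    for q in range(num_physical):
        # The first `extra` physical processors host one additional target.
        count = base + (1 if q < extra else 0)
        hosts.append(range(start, start + count))
        start += count
    return hosts


hosts = block_map(100_000, 2_000)
print(len(hosts[0]))  # 50 target processors per available processor
```

With P=100,000 and Q=2,000, each available processor hosts 50 target processors, which is why per-processor memory on the emulating machine can become the limiting factor.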
The emulator can be used to test and debug an application, especially
for scaling bugs (such as a data structure of size P*P, where P is the
number of processors). One can monitor memory usage, data values and
output, debug for correctness, address algorithmic scaling issues such
as convergence of numerical schemes, and measure operation counts
at full scale.
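To see why a data structure of size P*P is a scaling bug, a quick back-of-the-envelope calculation helps. The sketch below assumes a dense table with 8-byte entries; the entry size and the name `table_bytes` are illustrative assumptions.

```python
def table_bytes(num_procs, bytes_per_entry=8):
    """Memory in bytes for a dense P x P table (entry size is an assumption)."""
    return num_procs * num_procs * bytes_per_entry


for p in (1_000, 10_000, 100_000):
    print(f"P={p:>7,}: {table_bytes(p) / 1e9:g} GB")
```

Under these assumptions the table is small at P=1,000 (0.008 GB) but reaches 80 GB at P=100,000, so a structure that goes unnoticed in small runs becomes unusable at full scale.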
The emulator can also generate traces that a parallel discrete event
simulator, the BigSim Simulator, uses for coarse timing predictions and
for identifying performance bottlenecks.
BigSim Simulator
The BigSim Simulator is a trace-driven parallel discrete event simulator that models
architectural parameters of the target machine, including (optionally)
a detailed model of the communication network. It can be used to
identify potential performance bottlenecks for the simulated
application such as load imbalances, communication contention and long
critical paths. It generates performance traces just as a real program
running on the target machine would, allowing one to carry out normal
performance visualization and analysis.
For predicting performance of sequential code segments, the simulator
allows a variable-resolution model, ranging from simple scale factors
to interpolation based on performance counters (and possibly
cycle-accurate simulators).
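The simplest end of this range, a single scale factor, can be sketched as follows. The function name and the sample numbers are hypothetical; the sketch only shows the idea of multiplying each measured sequential block time by a ratio between target and host processor speed.

```python
def scale_block_times(measured_seconds, scale_factor):
    """Predict target-machine times for sequential blocks from host measurements.

    scale_factor is the assumed ratio of target time to host time,
    e.g. 0.5 if the target processor is taken to be twice as fast.
    """
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return [t * scale_factor for t in measured_seconds]


# Hypothetical block times measured on the emulating host, in seconds.
predicted = scale_block_times([2.0, 4.0, 0.5], scale_factor=0.5)
print(predicted)
```

A scale factor ignores effects such as cache behavior, which is the motivation for the higher-resolution options mentioned above.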
For analyzing performance of communication networks, one can plug in
either a very simple latency model, or a detailed model of the entire
communication fabric. Because the simulator is parallel, it can
simulate very large networks.
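A very simple latency model might look like the sketch below. It uses the common first-order form of a fixed per-message latency plus size divided by bandwidth; this form, the function name, and the parameter values are assumptions for illustration and are not necessarily the model BigSim implements.

```python
def message_time(num_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated transfer time: fixed latency plus size over bandwidth."""
    if num_bytes < 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("invalid message size or bandwidth")
    return latency_s + num_bytes / bandwidth_bytes_per_s


# Hypothetical network: 1 microsecond latency, 1 GB/s bandwidth.
for size in (8, 1_000, 1_000_000):
    print(size, message_time(size, latency_s=1e-6, bandwidth_bytes_per_s=1e9))
```

A model like this ignores contention, which is what a detailed model of the communication fabric is meant to capture.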
This research was supported in part by an NSF NGS grant (Award# NSF
0103645).
Software
BigSim is capable of simulating a broad class of machines.
It consists of an emulator and a simulator.
A BigSim installation and usage manual can be found
here.
- BigSim Emulator
-
The BigSim software API definition is here.
The BigSim Emulator is implemented on Converse, the interoperable runtime system that is
part of the Charm++ system. The portability of Converse ensures
that the emulator can run on almost all existing parallel machines and
clusters. For more information, see the emulator paper in the Papers section
below.
The emulator is also capable of carrying out a simulation with timing data via
a novel time-stamp correction mechanism, with a simple model for communication
latencies (02-03).
- BigSim Simulator
- BigSim simulator (sometimes
called BigNetSim) is a parallel simulator which is built with POSE, a parallel
discrete event simulation environment developed at PPL using Charm++. The
simulator is an effort to simulate large current and future computer systems in order to
study the behavior of applications developed for those systems. It uses a reasonably
detailed, integrated model of computation (processors) and
communication (interconnection networks). More details on BigNetSim can be
found here.
- Download and build instructions:
-
The BigSim system has been integrated into Charm++.
A program can be written either in the low-level BigSim Machine API or
in Charm++. In both cases it needs to be compiled and linked with the Charm++ and BigSim
emulator libraries.
To download Charm++ (including the BigSim pieces),
go to the
Charm++ download page.
To build Charm++ and BigSim Simulator,
pass "bigsim" as an option to the Charm++ "build" script.
For example, to compile on a Linux box, type:
./build bigsim net-linux bigsim
Sample programs written in the low-level BigSim machine API
can be found under charm/examples/bigsim, and sample programs written
in Charm++ under charm/examples/bigsim/emulator/littleMD.
A complete manual for building and running the BigSim Simulator
is under development and will be made available soon.
- Applications:
-
NAMD - Molecular
dynamics simulations of biomolecules
-
Protein Folding on Peta-FLOP class machines is another ongoing research effort in our group. LeanMD,
developed in this research, is one of the important real-world applications
run on our BigSim Simulator.
Papers
- 05-11
Nilesh Choudhury, Yogesh Mehta, Terry L. Wilmarth, Eric J. Bohm and Laxmikant V. Kale, Scaling an Optimistic Parallel Simulation of Large-scale Interconnection Networks, Proceedings of the Winter Simulation Conference
- 05-06
Gengbin Zheng, Achieving High Performance on Extremely Large Parallel Machines: Performance Prediction and Load Balancing, Ph.D. Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 2005
- 05-03
Terry L. Wilmarth, Gengbin Zheng, Eric J. Bohm, Yogesh Mehta, Nilesh Choudhury, Praveen Jagadishprasad and Laxmikant V. Kale, Performance Prediction using Simulation of Large-scale Interconnection Networks in POSE, In Proceedings of the Workshop on Principles of Advanced and Distributed Simulation, 2005, pp. 109-118.
- 04-12
Gengbin Zheng, Terry Wilmarth, Praveen Jagadishprasad and Laxmikant V. Kale, Simulation-Based Performance Prediction for Large Parallel Machines, International Journal of Parallel Programming 2005
- 04-02
Gengbin Zheng, Terry Wilmarth, Orion Sky Lawlor, Laxmikant V. Kale, Sarita Adve, David Padua, Philippe Geubelle, Performance Modeling and Programming Environments for Petaflops Computers and the Blue Gene Machine, In Proceedings of Next Generation Systems (NGS) Workshop, IPDPS 2004, IEEE Press, page 197.
- 03-05
Gengbin Zheng, Gunavardhan Kakulapati, Laxmikant V. Kale, BigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines, IPDPS 2004
- 02-03
Gengbin Zheng, Arun Kumar Singla, Joshua Mostkoff Unger, L. V. Kale, A Parallel-Object Programming Model for Petaflops Machines and Blue Gene/Cyclops, Next Generation Systems Program Workshop, IPDPS 2002
- 01-04
Neelam Saboo, Arun Kumar Singla, Joshua Mostkoff Unger, L. V. Kale, Emulating Petaflops Machines and Blue Gene, Workshop on Massively Parallel Processing (IPDPS'01)