|
Program
Time |
Type |
Location |
Description |
|
Morning |
Tutorials (Session Chair - Eric Bohm) |
10:30 am - 12:00 pm |
Tutorial |
2405 Siebel |
Writing Your Own Charm Machine Layer
Dr. Sayantan Chakravorty
Charm++ is a highly portable runtime system that runs on a wide variety of systems.
This is possible because all of Charm++ is built on top of a simple machine dependent layer
called the Machine Layer.
The Machine Layer has a well defined interface.
Charm++ can be ported to a different machine or architecture by implementing the Machine Layer interface
for that machine. This tutorial will explain how the machine layer works and will
show how a simple one can be implemented.
|
Noon |
|
Afternoon |
Tutorials (Session Chair - Eric Bohm) |
1:00 pm - 3:30 pm |
Tutorial |
2405 Siebel |
Basic Charm++ and Virtualization Tutorial
Prof. Laxmikant V. Kale and Kumaresh Pattabiraman
This tutorial will give a basic introduction to Charm++. It will cover basic virtualization concepts,
proxies, basic use of arrays, messages and groups with examples.
|
3:30 pm - 3:45 pm |
|
Afternoon |
Welcome Address and Keynote (Session Chair - Prof. Kale) |
3:45 pm - 4:30 pm |
Talk |
2405 Siebel |
Charm++ and Affiliated Research: Recent Developments
Prof. Laxmikant V. Kale
|
4:30 pm - 5:15 pm |
Keynote |
2405 Siebel |
Getting Ready for Exascale Computational Science
Prof. Rick Stevens
Associate Laboratory Director for Computing and Life Sciences, Argonne National Laboratory,
Professor of Computer Science, University of Chicago
In this talk I'll outline a possible roadmap to exascale science. I'll discuss some
of the scientific problems that may motivate developing an exascale computational
capability and some of the challenges in developing applications and systems software
for systems that will exploit terascale integration VLSI. I'll discuss the challenges
of dramatically increasing levels of concurrency as exascale systems will likely
require application concurrency of 10^9, a dramatic increase from today's largest
systems with programmer visible concurrency levels of 10^5. I'll discuss some of the
issues with programming models and I/O as well as the underlying technology challenges
such as advanced packaging and power management, optical interconnects, memory
architecture and ultra-low-power designs for core logic and I/O. Finally, I'll
discuss the role of future high-end systems in the overall computational ecosystem
that includes advanced networking, visualization and analysis systems, highly parallel
data architectures and emerging global sensor networks.
|
6:00 pm onwards |
|
|
8:45 am onwards |
|
Morning |
Technical Session (Session Chair - Prof. Kale) |
9:00 am - 9:45 am |
Invited Talk |
2405 Siebel |
The Evolution of MPI
Prof. William Gropp
Paul and Cynthia Saylor Professor of Computer Science,
University of Illinois at Urbana-Champaign
The Message Passing Interface (MPI) has been a remarkably successful
programming model for parallel computers. However, ten years have
elapsed since the MPI 2.0 standard was released. Much has changed in
that time, including massive increases in node count,
experiences with new programming models, and the shift to multi- and
many-core processors. In particular, much more is known about the
performance costs and challenges in MPI implementations, particularly
for MPI one-sided operations, MPI I/O, and the hybrid programming
model of MPI and threads. The first part of this talk will summarize
some recent performance results for MPI implementations and discuss
the challenges that they represent. The second part will discuss how
the MPI Forum, the informal group that defined MPI and that has begun
meeting again, is addressing the evolution of MPI.
|
9:45 am - 10:15 am |
Talk |
2405 Siebel |
Parallel Wave Propagation and Topological Operators for Fragmentation Simulation
Prof. Glaucio H. Paulino
Donald Biggar Willett Professor of Engineering,
University of Illinois at Urbana-Champaign
The presentation focuses on multiscale wave propagation problems. Elastic waves
propagate either inside a body or on its surface. Shock waves generate
micro-cracks and result in macro-crack growth. In order to investigate
such problems, an upgraded version of ParFUM (Parallel Framework for
Unstructured Meshes) is used on top of Charm++ so that we have a simple and
convenient environment for parallel programs which operate on unstructured
meshes, while achieving excellent scalability even for complex applications.
Dynamic crack microbranching processes are investigated by means of a
large-scale computational fracture mechanics approach using the finite element
method with special interface elements and a topological data structure
representation. The fracture events will be represented by interface elements
with tractions across the interface that follow a nonlinear cohesive model
driven by work conjugate displacement jumps. Special operators to reduce mesh
dependency are addressed, such as nodal perturbation and edge-swap. The
ultimate goal is to achieve physical fragmentation simulations involving
evolving mesh information when truly extrinsic cohesive elements are inserted
adaptively.
|
10:15 am - 10:30 am |
|
Morning |
Technical Session (Session Chair - Dr. Celso Mendes) |
10:30 am - 11:00 am |
Talk |
2405 Siebel |
Scaling Challenges in NAMD: Past and Future
Abhinav Bhatele
NAMD is a portable parallel application for biomolecular simulations. NAMD pioneered
the use of hybrid spatial and force decomposition, a technique used now by most
scalable programs for biomolecular simulations, including Blue Matter and Desmond
developed by IBM and D. E. Shaw respectively. NAMD is developed using Charm++ and
benefits from its adaptive communication-computation overlap and dynamic load
balancing. This talk focuses on new scalability challenges in biomolecular simulations:
using much larger machines and simulating molecular systems with millions of atoms.
We describe new techniques we have developed to overcome these challenges. Since our
approach involves automatic adaptive runtime optimizations, one interesting issue
involves harmful interaction between multiple adaptive strategies, and how to deal
with them. Unlike most other molecular dynamics programs, NAMD runs on a wide variety
of platforms ranging from commodity clusters to supercomputers. It also scales to
large machines: we present results for up to 65,536 processors on IBM's Blue Gene/L
and 8,192 processors on Cray XT3/XT4 in addition to results on NCSA's Abe, SDSC's
DataStar and TACC's LoneStar cluster, to demonstrate efficient portability. Since our
IPDPS'06 paper two years ago, two new highly scalable programs named Desmond and Blue
Matter have emerged, which we compare with NAMD in this talk.
|
11:00 am - 11:30 am |
Talk |
2405 Siebel |
OpenAtom: New Science and Ongoing Research
Dr. Glenn J. Martyna
Physical Sciences Division, IBM T. J. Watson Research Center
OpenAtom is the production release of LeanCP, a quantum chemistry application
which implements the CPAIMD method. This talk will cover new science discoveries
being made using OpenAtom and recent performance of this code on new machines.
|
11:30 am - 12:00 pm |
Talk |
2405 Siebel |
ChaNGa: Charm N-body GrAvity solver
Prof. Thomas R. Quinn
Department of Astronomy, University of Washington
Simulations of galaxies forming in their cosmological context pose a
number of challenges to performance on large parallel machines. The
first is the very non-local nature of gravitational forces. Galaxies
are influenced by the gravitational forces originating tens of
megaparsecs away, requiring significant communication in the force
solver. Second is the enormous spatial dynamic ranges involved, from
megaparsecs to sub-parsec scales, requiring dynamic hierarchical data
structures. Third is the vast time scales involved, from less than 1
million years to the age of the Universe, posing significant
challenges for load balancing. This talk will present how these
challenges have been addressed in the design of ChaNGa, the Charm
N-body GrAvity solver.
|
Noon |
|
Afternoon |
Technical Session (Session Chair - Dr. Terry Wilmarth) |
1:00 pm - 1:30 pm |
Talk |
2405 Siebel |
Charm++ on Heterogeneous Systems: Cell processor and GPGPU
David Kunzman and Lukasz Wesolowski
Accelerators such as Graphical Processing Units (GPUs) and specialized
cores, such as the Synergistic Processing Elements (SPEs) on the Cell
processor, are being used with greater frequency in the realm of parallel
computing to speed up computationally heavy portions of code. These systems are
comprised of multiple types of processing elements, each with unique
characteristics, strengths, weaknesses, and programming paradigms. Developing
applications can be challenging since many architectural details must be taken
into account. In this talk we will summarize the ongoing efforts to allow the
Charm++ Runtime System to utilize accelerators while trying to abstract away as
many architectural details as possible. Specifically, we will cover work
related to the Cell processor and GPUs.
|
1:30 pm - 2:00 pm |
Talk |
2405 Siebel |
ParFUM and its Applications
Aaron Becker
ParFUM is a framework for writing unstructured mesh codes in Charm++.
In this talk I will describe the software architecture of ParFUM and discuss
the services it provides to applications, including partitioning, load
balancing, and adaptivity. I will also give a brief overview of successful
ParFUM applications and discuss their performance and scalability.
|
2:00 pm - 2:30 pm |
Talk |
2405 Siebel |
BigSim
Dr. Gengbin Zheng
PetaFLOPS-class computers are currently being developed and even larger computers are being planned. Our BigSim project is aimed at developing tools that allow one to develop, debug and tune/scale/predict the performance of applications before such machines are available so that the applications can be ready when the machine first becomes operational. It also allows easier "offline" experimentation with parallel performance tuning strategies, without using the full parallel computer. To the machine architects, BigSim provides a method for modeling the impact of architectural choices (including the communication network) on actual, full-scale applications. In this talk, we will present our simulation framework, which consists of an emulator and a simulator; we will focus on the recent progress in integrating instruction level simulation with our framework, and out-of-core emulation support.
|
2:30 pm - 3:00 pm |
Talk |
2405 Siebel |
AMPI, Charisma and MSA
Dr. Celso L. Mendes and Pritish Jetley
In this talk, we will cover three different programming paradigms that
leverage the Charm++ programming system. The first one is Adaptive MPI
(AMPI), an implementation of the popular MPI standard. AMPI is based on
Charm++, and implements the traditional MPI tasks with user-level
migratable threads. Thus, AMPI provides advanced features such as
dynamic load balancing and automatic overlap between computation and
communication to traditional MPI codes. Porting legacy MPI codes to AMPI
typically involves no change to the sources. We will review AMPI's
basic features, and discuss its current status. We will also talk about
Charisma, a parallel object orchestration language that explicitly
specifies global flow of control and data. We will proceed to discuss
MSA, which encourages the disciplined use of shared address
spaces. We will also talk about how these languages fit into our
research agenda of creating various incomplete languages that are
targeted towards specific programming paradigms and which are
interoperable over a common ARTS, thereby promoting productivity.
|
3:00 pm - 3:15 pm |
|
Afternoon |
Technical Session (Session Chair - Dr. Gengbin Zheng) |
3:15 pm - 3:45 pm |
Talk |
2405 Siebel |
Debugging Tools for Charm++ Applications
Filippo Gioachin
In this talk I will discuss some of the more recent features added to
CharmDebug, the Charm++ debugger. Emphasis will be given to introspection and
memory debugging.
A new Python module allows the user to upload fragments of Python code into the
application being analyzed. This uploaded code can access the application's data
and perform introspection, for example by checking whether some data is within a
validity range.
The combination of CharmDebug with a memory module inside Charm++ allows the
debugger to collect various information about the status of the memory, such as
Allocation Trees and Allocation Graphs. In addition, some operations can be
performed on the allocated memory, such as detection of leaks and of cross-object
corruption.
|
3:45 pm - 4:15 pm |
Talk |
2405 Siebel |
LuaCharm: Implementing Chares in a High-Level Scripting Language
Thiago Ponte
Pontifícia Universidade Católica do Rio de Janeiro
In recent years, scripting languages have been
gaining importance in many applications. One area in which these
languages have not been much explored is parallel programming.
Parallel programming has always been strongly associated with
scientific usage, but has recently gained new fields of action. With
this change, the development of new paradigms
for parallel programming becomes necessary in order to make
development easier and applications more dynamic. Scripting
languages may come into play here, bringing dynamism, flexibility
and simplicity to applications. In this paper we describe
the integration of the Lua scripting language into the Charm++
framework. With the resulting binding, any distributed object
(chare) can be written either in Lua or in C++.
|
4:15 pm - 4:45 pm |
Talk |
2405 Siebel |
Noise Miner
Isaac Dooley
This talk describes a new scalable stream mining algorithm
called NoiseMiner that analyzes parallel application traces to
detect computational noise, operating system interference,
software interference, or other irregularities in a parallel
application's performance. The algorithm detects these
occurrences of noise during real application runs, whereas
standard techniques for detecting noise use carefully crafted
test programs to detect the problems. This paper concludes by
showing the output of NoiseMiner for a real-world case in which
6 ms delays, caused by a bug in an MPI implementation,
significantly limited the performance of a molecular dynamics
code on a new supercomputer.
|
4:45 pm - 5:25 pm |
Talk |
2405 Siebel |
Works in Progress: Ongoing Research in Converse and Charm++
Chao Mei and Eric Bohm
This talk is split into two sections covering recent work to optimize
Charm++ for new and popular architectures. The first section, delivered by
Chao Mei, will focus on shared memory multicore machines. The second
section, delivered by Eric Bohm, will introduce the new CkDirect API for RDMA, and
discuss preliminary results of its use on Infiniband and Blue Gene/P.
|
|
8:45 am onwards |
|
Morning |
Tutorials (Session Chair - Dr. Celso Mendes) |
9:00 am - 10:30 am |
Tutorial |
2405 Siebel |
ParFUM Tutorial
Dr. Terry L. Wilmarth
This tutorial will describe how to write simulations on unstructured meshes using ParFUM. Topics covered will include:
basic structure of a ParFUM program, manipulating data in ParFUM,
virtualization, building and running ParFUM programs, using mesh adaptivity
and other new features of ParFUM.
|
10:30 am - 12:30 pm |
Tutorial |
2405 Siebel |
BigSim Tutorial
Eric Bohm
This tutorial will describe how to build and use the BigSim
large machine simulator. Topics covered will include:
building applications on BigSim, instrumenting applications
with the BigSim LogAPI, use of the interpolation tool to
integrate results from other simulators, use of BigNetSim,
and the implementation of new interconnects in BigNetSim.
|
|