Pritish Jetley
PhD Student
pjetley2 at illinois.edu
217-333-5827
Profile
I am a doctoral candidate in the Department of Computer Science at the University of Illinois, Urbana-Champaign, where I work with my advisor Prof. Laxmikant V. Kale. I received a B.Tech. in Computer Science and Engineering from the Indian Institute of Technology Guwahati in 2006. Below is a brief description of my research.
Parallel Programming Languages
My thesis centers on specialized languages and frameworks that enable productive parallel programming without sacrificing performance. The first of these is Charisma, a language that expresses parallel programs in terms of global flows of control and data. Recent work has focused on adding the neighbor list abstraction to the language, a feature that allows the succinct expression of arbitrary, explicit flows between parallel objects. Work has also been done on performance optimization and on a compiler strategy that translates the global flows of Charisma into efficient object-local DAGs. I am currently developing incomplete languages and specialized frameworks for tree-based and tree-structured computations. These will be supplemented by a distributed array library that supports efficient bulk operations and provides a performance-oriented, disciplined shared address space abstraction. This kind of distributed array abstraction allows specialized optimizations that reduce the amount of data movement in array-based divide-and-conquer algorithms. In addition, I am writing a framework for expressing lightweight, fine-grained tasks called microchares, with dependencies between them specified through continuations. The underlying runtime system automatically agglomerates these tasks into coarser-grained work units, thereby yielding good performance.
Application-Oriented Research
I also work on performance aspects of scientific applications. As part of the Cosmology group at PPL, I help develop load balancing algorithms, data reuse components and efficient communication algorithms for the ChaNGa N-body simulator. These contributions have enabled the code to scale to 32,000 cores and beyond. I also led the development of an adaptation of ChaNGa to a GPU-cluster environment, on which we achieved good scaling performance on hundreds of GPUs. Work is in progress to obtain bounds on performance and to improve the efficiency of the code even further. I have been involved in performance studies of other applications as well. Chief among these is a Barnes-Hut code for gravity simulation. This work has yielded interesting insights into the load and communication profile of N-body codes. The code also serves as a test bed for new algorithms and approaches to data movement in the application. It currently scales to more than 16,000 cores of the Blue Gene/P supercomputer, and we continue to remove performance bottlenecks. This code was featured in our award-winning entry to the HPC Class II Challenge. My work on kd-tree construction recently received the best paper award at HiPC '11. I am also working on a message-driven implementation of a de Bruijn graph-based genome assembler for short-read, de novo sequencing with paired read information.
High Performance Communication
In addition, I have contributed to the development of high performance communication layers for the Charm++ system. Previously, I helped with the development of a specialized one-sided communication framework called CmiDirect. We are currently developing a more general zero-copy one-sided messaging framework for Charm++. Here is a copy of my resume.
Research Areas
Papers/Talks
11-49  2011  [Paper]  Charm++ for Productivity and Performance: A Submission to the 2011 HPC Class II Challenge [Supercomputing 2011]
11-27  2011  [Paper]  Optimizations for Message Driven Applications on Multicore Architectures [HiPC 2011]
11-25  2011  [Paper]  ParSSSE: An Adaptive Parallel State Space Search Engine [PPL 2011]
11-06  2011  [Paper]  Parallel Combinatorial Search [Encyclopedia of Parallel Computing 2011]
11-05  2011  [Paper]  An Adaptive Framework for Large-scale State Space Search [LSPP 2011]
11-01  2011  [Paper]  Architectural constraints to attain 1 Exaflop/s on three scientific application classes [IPDPS 2011]
10-27  2010  [Talk]  Static Macro Data Flow: Compiling Global Control into Local Control [HIPS 2010]
10-16  2010  [Paper]  Scaling Hierarchical N-Body Simulations on GPU Clusters [Supercomputing 2010]
10-10  2010  [Paper]  Static Macro Data Flow: Compiling Global Control into Local Control [HIPS 2010]
08-22  2008  [Talk]  Massively Parallel Cosmological Simulations with ChaNGa [IPDPS 2008]
08-11  2009  [Paper]  CkDirect: Unsynchronized One-Sided Communication in a Message-Driven Paradigm [P2S2 2009]
08-03  2008  [Paper]  Massively Parallel Cosmological Simulations with ChaNGa [IPDPS 2008]
07-09  2007  [Paper]  Toward Petascale Cosmological Simulations with ChaNGa [PPL Technical Report 2007]









