Gengbin Zheng
4006E NCSA
1205 W Clark St
Urbana, IL 61801
(217) 300-3023 (o)
gzheng@illinois.edu
http://charm.cs.uiuc.edu/people/gzheng

Research Interests
• Parallel computing
• Parallel programming languages and paradigms
• Adaptive parallel runtime systems
• Dynamic load balancing
• Performance prediction of parallel applications
• Fault tolerance
• Molecular dynamics simulation

Education
• Ph.D., Computer Science, December 2005, University of Illinois at Urbana-Champaign, Urbana, IL; GPA 4.0
• M.S., Computer Science, September 1998, Beijing University, China (with honors); GPA 3.8
• B.S., Computer Science, September 1995, Beijing University, China; GPA 3.6

Awards
1. HPC Challenge class 2 Award, SC 2011, Seattle, WA
2. Gordon Bell Award for special accomplishment in NAMD paper, SC 2002, Baltimore, MD
3. GuangHua scholarship, Study scholarship, Beijing University, 1997
4. LianXiang scholarship, Beijing University, 1996
5. LianXiang scholarship, Beijing University, 1995
6. Study scholarship, Beijing University, 1993, 1994

Professional Experience
1. Senior Research Scientist, NCSA, University of Illinois at Urbana-Champaign, 7/2012-now
2. Research Scientist, Computer Science Department, University of Illinois at Urbana-Champaign, 2010-7/2012
3. Research Scientist, Center for Simulation of Advanced Rockets, University of Illinois at Urbana-Champaign, 2008-2010
4. Postdoctoral Research Associate, Center for Simulation of Advanced Rockets, University of Illinois at Urbana-Champaign, 2005-2008
5. Graduate Research Assistant, Computer Science Department, University of Illinois at Urbana-Champaign, 1999-2005
6. Intern, IBM T.J. Watson Research Center, Yorktown Heights, NY, summer 2001 - involved in BlueGene/L (World's fastest machine) project
7. Intern, Silicon Graphics, Inc (SGI), in the compiler group, Mountain View, CA, summer 2000
8. Teaching Assistant, Computer Science Department, University of Arizona, 1998
9.
Research assistant, Computer Science Department, Beijing University, 1995-1998

Publications

Papers in Journals and Book Chapters
1. Harshitha Menon, Lukasz Wesolowski, Gengbin Zheng, Pritish Jetley, Laxmikant Kale, Thomas Quinn and Fabio Governato, "Adaptive Techniques for Clustered N-Body Cosmological Simulations", Astrophysics - Instrumentation and Methods for Astrophysics, D.1.3, J.2, 2014
2. Esteban Meneses, Xiang Ni, Gengbin Zheng, Celso L. Mendes and Laxmikant V. Kale, "Using Migratable Objects to Enhance Fault Tolerance Schemes in Supercomputers", IEEE Transactions on Parallel and Distributed Systems, 2014
3. Yanhua Sun, Gengbin Zheng, Pritish Jetley and Laxmikant V. Kalé, "ParSSSE: An Adaptive Parallel State Space Search Engine", Parallel Processing Letters, 2011
4. Aaron Becker, Gengbin Zheng, and Laxmikant Kale, "Distributed Memory Load Balancing", book chapter in Encyclopedia of Parallel Computing, David Padua, Ed., 2011
5. Gengbin Zheng, Abhinav Bhatele, Esteban Meneses and Laxmikant V. Kale, "Periodic Hierarchical Load Balancing for Large Supercomputers", in International Journal of High Performance Computing, 2010
6. Laxmikant V. Kale and Gengbin Zheng, "Charm++ and AMPI: Adaptive Runtime Strategies via Migratable Objects", book chapter, in Advanced Computational Infrastructures for Parallel and Distributed Applications (Wiley-Interscience), 2009
7. Gengbin Zheng, Michael S. Breitenfeld, Hari Govind, Philippe Geubelle, Laxmikant V. Kale, "Automatic Dynamic Load Balancing for a Crack Propagation Application", submitted to the International Journal of High Performance Computing Applications
8. Gengbin Zheng, Chao Huang, Laxmikant V. Kale, "Performance Evaluation of Automatic Checkpoint-based Fault Tolerance for AMPI and Charm++", SIGOPS Operating System Review Special Issue on HEC OS/Runtimes, 2006
9. Xiangmin Jiao, Gengbin Zheng, Phillip A. Alexander, John Norris, Michael T. Campbell, Andreas Haselbacher, Michael T.
Heath, "A system integration framework for coupled multiphysics simulations", special issue of "Engineering with Computers" on frameworks/integrated software infrastructures for scalable scientific and engineering applications, 2006
10. Orion S. Lawlor, Sayantan Chakravorty, Terry L. Wilmarth, Nilesh Choudhury, Isaac Dooley, Gengbin Zheng and Laxmikant V. Kalé, "ParFUM: A Parallel Framework for Unstructured Meshes for Scalable Dynamic Physics Applications", special issue of "Engineering with Computers" on frameworks/integrated software infrastructures for scalable scientific and engineering applications, 2006
11. Laxmikant V. Kale, Klaus Schulten, Robert D. Skeel, Glenn Martyna, Mark Tuckerman, James C. Phillips, Sameer Kumar, and Gengbin Zheng, "Biomolecular modeling using parallel supercomputers", book chapter, in S. Aluru, editor, Handbook of computational molecular biology, pp. 34.1-34.43, Taylor and Francis, 2005
12. Gengbin Zheng, Terry Wilmarth, Praveen Jagadishprasad, Laxmikant V. Kalé, "Simulation-Based Performance Prediction for Large Parallel Machines", International Journal of Parallel Processing, 2005
13. Laxmikant V. Kale, Eric Bohm, Celso L. Mendes, Terry Wilmarth, Gengbin Zheng, "Programming Petascale Applications with Charm++ and AMPI", in Petascale Computing: Algorithms and Applications, CRC Press, 2004
14. Laxmikant V. Kalé, Gengbin Zheng, Chee Wai Lee, Sameer Kumar, "Scaling Applications to Massively Parallel Machines Using Projections Performance Analysis Tool", Future Generation Computer Systems, Journal, 2004

Papers in Conferences and Workshops
1. Phil Miller, Michael Robson, Bassil El-Masri, Rahul Barman, Gengbin Zheng, Atul Jain and Laxmikant Kale, "Scaling the ISAM Land Surface Model Through Parallelization of Inter-Component Data Transfer", in the 43rd International Conference on Parallel Processing (ICPP), 2014
2.
Emmanuel Jeannot, Esteban Meneses-Rojas, Guillaume Mercier, Francois Tessier and Gengbin Zheng, "Communication and Topology-aware Load Balancing in Charm++ with TreeMatch", in Proceedings IEEE International Conference on Cluster Computing 2013, Indianapolis, IN, 2013
3. Yanhua Sun, Gengbin Zheng, Chao Mei, Eric Bohm, James Phillips, Terry Jones and Laxmikant Kale, "Optimizing Fine-grained Communication in a Biomolecular Simulation Application on Cray XK6", in Proceedings of the 2012 ACM/IEEE conference on Supercomputing, 2012
4. Harshitha Menon, Nikhil Jain, Gengbin Zheng and Laxmikant Kale, "Automated Load Balancing Invocation based on Application Characteristics", in Proceedings IEEE International Conference on Cluster Computing 2012, Beijing, China
5. Gengbin Zheng, Xiang Ni and L. V. Kale, "A Scalable Double In-memory Checkpoint and Restart Scheme towards Exascale", in Proceedings of the 2nd Workshop on Fault-Tolerance for HPC at Extreme Scale (FTXS), Boston, 2012
6. Yanhua Sun, Gengbin Zheng, Ryan Olson, Terry Jones, Laxmikant Kale, "A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect", in IEEE International Parallel and Distributed Processing Symposium (IPDPS), Shanghai, China, 2012
7. Ehsan Totoni, Abhinav Bhatele, Eric Bohm, Nikhil Jain, Celso Mendes, Ryan Mokos, Gengbin Zheng and Laxmikant Kale, "Simulation-based Performance Analysis and Tuning for the Planned Blue Waters System", Proceedings of the 16th International Conference on Parallel and Distributed Systems (ICPADS), 2011
8. Gengbin Zheng, Stas Negara, Celso L. Mendes, Eduardo R. Rodrigues and Laxmikant V. Kale, "Automatic Handling of Global Variables for Multi-threaded MPI Programs", Proceedings of the 16th International Conference on Parallel and Distributed Systems (ICPADS), 2011
9. Chao Mei, Yanhua Sun, Gengbin Zheng, Eric J.
Bohm, Laxmikant V. Kalé, James C. Phillips and Chris Harrison, "Enabling and Scaling Biomolecular Simulations of 100 Million Atoms on Petascale Machines with a Multicore-optimized Message-driven Runtime", Proceedings of the 2011 ACM/IEEE conference on Supercomputing, November, 2011
10. Yanhua Sun, Gengbin Zheng, Pritish Jetley and Laxmikant V. Kale, "An Adaptive Framework for Large-scale State Space Search", Proceedings of Workshop on Large-Scale Parallel Processing (LSPP) in IEEE International Parallel and Distributed Processing Symposium (IPDPS), Anchorage (Alaska), May, 2011
11. Abhishek Gupta, Gengbin Zheng and Laxmikant V. Kale, "A Multi-level Scalable Startup for Parallel Applications", Proceedings of International Workshop on Runtime and Operating Systems for Supercomputers, May, 2011
12. Gengbin Zheng, Gagan Gupta, Eric Bohm, Isaac Dooley, and Laxmikant V. Kale, "Simulating Large Scale Parallel Applications using Statistical Models for Sequential Execution Blocks", in the Proceedings of the 16th International Conference on Parallel and Distributed Systems (ICPADS 2010), Shanghai, China, 2010
13. Filippo Gioachin, Gengbin Zheng, and Laxmikant V. Kale, "Debugging Large Scale Applications in a Virtualized Environment", in the Proceedings of the 23rd International Workshop on Languages and Compilers for Parallel Computing (LCPC2010), Houston, TX, USA, October, 2010
14. Chao Mei, Gengbin Zheng, Filippo Gioachin and Laxmikant V. Kale, "Optimizing a Parallel Runtime System for Multicore Clusters: A Case Study", in Proceedings of TeraGrid'10, Pittsburgh, PA, USA, August, 2010
15. Filippo Gioachin, Gengbin Zheng and Laxmikant V. Kale, "Robust Record-Replay with Processor Extraction", in Proceedings of the Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging (PADTAD - VIII), 2010
16. Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V.
Kale, "Hierarchical Load Balancing for Large Scale Supercomputers", in Proceedings of the Third International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), 2010
17. Abhinav Bhatele, Sameer Kumar, Chao Mei, James C. Phillips, Gengbin Zheng, Laxmikant V. Kale, "Overcoming Scaling Challenges in Biomolecular Simulations across Multiple Platforms", to appear in Proceedings of IEEE International Parallel and Distributed Processing Symposium 2008
18. Sameer Kumar, Chao Huang, Gengbin Zheng, Eric Bohm, Abhinav Bhatele, James C. Phillips, Hao Yu, Laxmikant V. Kale, "Scalable Molecular Dynamics with NAMD on Blue Gene/L", to appear in IBM Journal of Research and Development 2007
19. David Kunzman, Gengbin Zheng, Eric Bohm, Laxmikant V. Kale, "Charm++, Offload API, and the Cell Processor", in PMUP Workshop at PACT'06, September 2006
20. Gengbin Zheng, Orion Sky Lawlor, Laxmikant V. Kale, "Multiple Flows of Control in Migratable Parallel Programs", to appear in The 8th Workshop on High Performance Scientific and Engineering Computing (HPSEC), 2006
21. Laxmikant Kale, Isaac Dooley, and Gengbin Zheng, "Handling OS Interference Via Migratable Message-Driven Objects", minisymposium, SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, CA, 2006
22. Chao Huang, Gengbin Zheng, Sameer Kumar, Laxmikant V. Kale, "Performance Evaluation of Adaptive MPI", ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2006
23. Xiangmin Jiao, Gengbin Zheng, Orion Lawlor, Phil Alexander, Mike Campbell, Michael Heath, Robert Fiedler, "An Integration Framework for Simulations of Solid Rocket Motors", 41st AIAA/ASME/SAE/ASEE Joint Propulsion Conference, July 10-13, 2005, Tucson, Arizona
24. Terry L. Wilmarth, Gengbin Zheng, Eric J. Bohm, Yogesh Mehta, Praveen Jagadishprasad, Laxmikant V.
Kalé, "Performance Prediction using Simulation of Large-scale Interconnection Networks in POSE", 19th ACM/IEEE/SCS Workshop on Principles of Advanced and Distributed Simulation (PADS 2005)
25. Gengbin Zheng, Lixia Shi, Laxmikant V. Kalé, "FTC-Charm++: A Parallel In-Memory Checkpoint-Based Fault Tolerant Runtime for Parallel Systems", Cluster Computing 2004, San Diego, California
26. Gengbin Zheng, Gunavardhan Kakulapati, Laxmikant V. Kalé, "BigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines", in 18th International Parallel and Distributed Processing Symposium (IPDPS), 2004
27. Gengbin Zheng, Terry Wilmarth, Orion Sky Lawlor, Laxmikant V. Kalé, Sarita Adve, David Padua, Philippe Geubelle, "Performance Modeling and Programming Environments for Petaflops Computers and the Blue Gene Machine", Next Generation Systems Program Workshop, 18th International Parallel and Distributed Processing Symposium (IPDPS), 2004
28. Laxmikant V. Kalé, Sameer Kumar, Gengbin Zheng, Chee Wai Lee, "Scaling Molecular Dynamics to 3000 Processors with Projections: A Performance Analysis Case Study", Terascale Performance Analysis Workshop, International Conference on Computational Science (ICCS), 2003
29. James Phillips, Gengbin Zheng, Sameer Kumar, Laxmikant V. Kalé, "NAMD: Biomolecular Simulation on Thousands of Processors", SC2002, Baltimore, MD, Gordon Bell Award winner paper
30. James Phillips, Gengbin Zheng, Laxmikant V. Kalé, "NAMD: Biomolecular Simulation on Thousands of Processors", in Workshop: Scaling to New Heights, 2002, Pittsburgh Supercomputing Center
31. Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant V. Kalé, "A Parallel-Object Programming Model for PetaFLOPS Machines and Blue Gene/Cyclops", in Next Generation Systems Program Workshop, 16th International Parallel and Distributed Processing Symposium (IPDPS), 2002
32.
Zhihui Du, Wenkui Ding, Gengbin Zheng, Xiaoming Li, Zhuoqun Xu, "Research and Implementation of an HPF Compilation System", Ruan Jian Xue Bao/Journal of Software, 10(1), pp. 60-67, 1999
33. Hua Xiang, Gengbin Zheng, Lixia Shi, Jianping Wang, Zhuoqun Xu, "Performance Analysis on DAWN with p_HPF Compiler System", Proceedings of DAWN User's Conference, October, 1998

Thesis
1. Gengbin Zheng, "Achieving High Performance on Extremely Large Parallel Machines: Performance Prediction and Load Balancing", Ph.D. Thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 2005
2. Gengbin Zheng, "The Key Technologies and Optimizations in an Implementation of Data Parallel Language: HPF", M.S. Thesis, Dept. of Computer Science, Beijing University, 1998

Talks and Posters
1. Gengbin Zheng, "Scaling Fault Tolerant Applications using Migratable Objects in Charm++", CHANGES workshop, Beijing, China, 2014
2. Gengbin Zheng, "Parallel Runtimes for Achieving High Performance on Large Parallel Machines", seminar, Brookhaven National Laboratory, Feb 2010
3. David M. Kunzman, Gengbin Zheng, Eric Bohm, James C. Phillips, Laxmikant V. Kale, "Charm++ Simplifies Coding for the Cell Processor", poster, SC 2006, Tampa, FL
4. Hari Govind, Gengbin Zheng, Laxmikant Kale, Michael Breitenfeld, Philippe Geubelle, "Speeding up Parallel Simulation with Automatic Load Balancing", poster, SC 2005, Seattle, WA
5. Gengbin Zheng, "Basic Charm++ and Load Balancing", LACSI Symposium 2005, Santa Fe, NM
6. Gengbin Zheng, "Fault Tolerance in Charm++", LACSI Symposium 2005, Santa Fe, NM
7. Gengbin Zheng, "Advanced Charm++ Tutorial", Charm++ workshop, University of Illinois, 2005, Urbana, IL
8. Gengbin Zheng, "An Integration Framework for Simulations of Solid Rocket", AIAA, 2005, Tucson, AZ
9. Gengbin Zheng, "FTC-Charm++: An In-Memory Checkpoint-Based Fault Tolerant Runtime for Charm++ and MPI", Cluster, 2004, San Diego, CA
10.
Gengbin Zheng, "BigSim Tutorial", Charm++ workshop, University of Illinois, 2004, Urbana, IL
11. Gengbin Zheng, "BigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines", IPDPS 2004, Santa Fe, NM
12. Gengbin Zheng, "Charm++ Load Balancing Framework", Charm++ workshop, University of Illinois, 2003, Urbana, IL
13. L. V. Kale, Gengbin Zheng, Terry Wilmarth, "BigSim Simulator", Poster, BlueGene Workshop, 2003, Reno, NV
14. L. V. Kale, Sayantan Chakravorty, Gengbin Zheng, "Runtime Support for BlueGene", Poster, BlueGene Workshop, 2003, Reno, NV
15. Gengbin Zheng, "Parallelizing FP-growth Frequent Patterns Mining Algorithm Using OpenMP", Intel Corporation, 2002, Urbana, IL
16. Gengbin Zheng, "A Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops", IPDPS, 2002, Fort Lauderdale, FL
17. Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé, "Blue Gene Simulator and Charm++", poster, SC 2002, Baltimore, MD
18. Gengbin Zheng, "Charm++ on Blue Gene/C", Charm++ workshop, University of Illinois, 2002, Urbana, IL
19.
Gengbin Zheng, "Exploiting the I/O processors in Bluelight", Poster, IBM T.J. Watson Research Center, 2001, NY

Participated Projects
• CHANGES Seed Project, Principal Investigator, Scalable and Fault Tolerant Heterogeneous Computing Cosmology Software with Object Based Model
• NSF EAGER Project, co-PI, Using PDE Descriptions to Generate Code Precisely Tailored to Energy-constrained Systems Including Large GPU Accelerated Clusters
• Blue Waters Project: many Charm++-related projects, including porting and scaling Charm++ and the NAMD application on a Cray supercomputer with Gemini interconnect
• NIH: NAMD parallel molecular dynamics simulation code
• NSF Next Generation Software (NGS): BigSim performance prediction for petaflops scale parallel machines
• DOE: parallel rocket simulation code developed at the Center for Simulation of Advanced Rockets (CSAR), funded by the DOE as part of its Advanced Simulation and Computing (ASCI) program

Research and Industrial Projects
• Blue Waters Project, 07 - present
The Blue Waters project aims at delivering a Cray supercomputer capable of sustained performance of 1 petaflop on a range of real-world science and engineering applications. It is expected to be one of the most powerful supercomputers in the world. I have been working on this project since it started. My work includes using the BigSim performance simulator to predict the performance of parallel applications (including NAMD) on the future Blue Waters machine, porting and optimizing the Charm++/AMPI runtime system on the Cray Gemini interconnect using the low-level Cray uGNI communication library, and scaling the NAMD molecular dynamics simulation program on this machine. I have been using supercomputers including Hopper (NERSC), Titan (ORNL), JYC/ESS (NCSA), etc.

• Parallel Programming Laboratory, with Laxmikant Kalé, UIUC, 1/99 - present
I am the lead developer of Charm++, a parallel object-oriented language and run-time system.
My work in general involves improving performance and productivity in high performance computing on supercomputers and workstation clusters with the Charm++ run-time system. My projects involve most aspects of the Charm++ system and its applications, including Adaptive MPI (AMPI) and automatic dynamic load balancing techniques that improve the scalability of parallel applications, especially applications that are challenging to scale on very large parallel machines. I also work on the performance tracing and analysis tools associated with Charm++. My Ph.D. thesis focuses on large scale parallel simulation for predicting the performance of parallel applications on extremely large parallel machines. With the simulation infrastructure, I explore the optimization techniques needed in automatic load balancing to improve parallel performance on these machines. For many years, I have been in charge of Charm++/AMPI software development and maintenance, including nightly Charm++ regression tests and email support for external users. I have been collaborating with several external groups on applications including molecular dynamics simulations such as NAMD and LeanMD (with IBM), a climate simulation application (ISAM), and FEM applications such as Fractography3D (crack propagation simulation) and Rocstar (rocket simulation).

• Center for Simulation of Advanced Rockets, UIUC, 5/05 - 2010
I joined the center as a postdoctoral research associate. My research in the center focused on exploiting Charm++ in the advanced rocket simulation application to improve its parallel performance and portability. I was one of the main developers of a software integration framework for multi-physics interaction and of flexible high-level orchestration modules that ease quick prototyping of coupling schemes in the rocket simulation.

• Theoretical Biophysics Group, Beckman Institute for Advanced Science and Technology, UIUC, 1/99 - 8/04
I was one of the main developers parallelizing the NAMD application developed in the group. NAMD is a parallel, object-oriented molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects and the load balancing framework (which I was working on), NAMD at that time scaled to hundreds of processors on high-end parallel platforms and tens of processors on commodity clusters using gigabit Ethernet. Our work on NAMD won the prestigious Gordon Bell Award at SC2002 for unprecedented speedup on a 3000 processor machine with peak performance of a Teraflop. I was also actively involved in supporting the NAMD user community via the NAMD mailing list. I interacted with a great number of external NAMD users to help them with various portability and performance issues with NAMD and Charm++. Since leaving the group, I have remained involved in various aspects of the NAMD project as I continue to work on Charm++.

• High Performance Computing Technology, development of High Performance Fortran Compiler, with Zhuoqun Xu, Beijing University, China, 9/95 - 7/98
HPF is a data parallel programming language. The aim of this project was to design and implement a practical HPF compiler and runtime system. My work included compiler front-end design and implementation, SPMD source-to-source code translation, and implementation of the communication runtime system. This compiler was installed on DAWN1000, one of the fastest supercomputers built in China at that time.

• Parallel Large Scale Image Processing, in collaboration with the Chinese Academy of Sciences, China, 9/97 - 7/98
In this project, we explored techniques to solve large scale image processing problems using HPF and the compiler we developed. We developed applications in HPF for image processing and improved the HPF compiler to achieve high performance.
To handle very large data sets that cannot be held entirely in main memory, we designed and implemented parallel I/O in the HPF compiler runtime system to perform out-of-core execution.

Synergistic Activities
Member, Association for Computing Machinery (ACM)
Member, IEEE
Program Committee: IEEE International Parallel and Distributed Processing Symposium (IPDPS), Shanghai, China, 2012
Program Committee: NCSA Faculty Fellows, Spring 2014
Reviewer: Journal of Systems and Software, Journal of Parallel and Distributed Computing, International Journal of High Performance Computing, Euro-Par