Efficient Execution of Tightly-Coupled Parallel Applications in Grid Computing Environments
Thesis 2007
Publication Type: PhD Thesis
Grid computing offers a model for solving large-scale scientific problems by uniting computational resources within multiple organizations to form a single cohesive resource for the duration of individual jobs. Such an infrastructure creates a pervasive and dependable pool of computing power that enables computational scientists to develop dramatically new classes of applications. Despite the appeal of Grid computing, developing applications that run efficiently in these environments often involves overcoming significant challenges.
One challenge to deploying Grid applications across geographically distributed resources is overcoming the effects of latency between sites. Certain classes of applications, such as pipeline-style or master-slave-style applications, lend themselves well to running in Grid environments because their communication requirements can be varied readily and because most communication takes place outside the critical path. In contrast, tightly-coupled applications, in which every processor performs the same task and communicates with some subset of all processors in the computation during every iteration, present a significant challenge to deployment in Grid environments.
Another challenge to deploying applications in Grid environments is managing the heterogeneity that is frequently present across resources. Because supercomputing clusters in a Grid environment are often installed and upgraded independently, components such as processors and interconnects can present widely varying capabilities within a single Grid job. Tightly-coupled applications in particular, however, require access to as many computational resources as possible. For example, wasting processing resources by inefficiently mapping work to processors of heterogeneous speeds within a single Grid job is unacceptable. Likewise, intra-cluster communication should take place as much as possible over high-performance cluster interconnects, resorting to lower-performance wide-area protocols only when necessary.
This thesis examines the feasibility of deploying tightly-coupled parallel applications in Grid computing environments. A desired outcome of this work is the capability of delivering application performance in a Grid environment that is on par with the performance within a single cluster while requiring few or no modifications to application software. To that end, the thesis explores techniques that can be deployed effectively at the runtime-system level and applied to a variety of application decomposition styles.
Research Areas