The Faucets client interface.
As scientists write more and more parallel applications that require large
amounts of computational resources, large parallel machines will become more
commonplace. However, these machines are often prohibitively expensive
for any single user and so must be shared by many users.
Furthermore, the machines tend to go through periods of low and high
utilization. During the periods of high utilization, users might not be able to
obtain enough computational resources to produce results in time to meet a deadline.
During the periods of low utilization, the computational power sits idle while
someone across the country may be wishing they had just a few more nodes so
they could run their program.
The Faucets system addresses the problem of uneven utilization by allowing subscribers
to share their resources. In the Faucets system, computational power is
viewed as a utility. A supercomputer produces the utility and the users consume
it. When a user finds the cluster they normally use is not providing enough
of the utility, they can "borrow" some from another subscribing cluster which
is currently being underused. In the future, a user of the second cluster
might run an application on the first when the situation is reversed.
Faucets provides both the technical framework and economic model to
facilitate the discovery, distribution, and pricing of computational
power.
As part of our Faucets project, we present several complementary
ideas and their implementation that improve the efficiency of
parallel servers in the Grid environment:
- Bartering: a system which permits cluster maintainers to exchange
computational power with each other. Jobs are submitted to the Faucets
system with a Quality of Service requirement and subscribing clusters return
bids; the best bid that meets all criteria is selected. When the bidding
cluster successfully runs an application, the cluster is awarded bartering
units, which its users can later trade for the use of resources on other
subscribing clusters.
- Adaptive Jobs: a system that supports adaptive jobs (also known as
malleable and/or evolving jobs) that can shrink and expand to a variable
number of processors at run-time, and an Adaptive Job Scheduler for timeshared
parallel machines that exploits this ability.
- On-Demand Computing: users can get the resources they need when they need
them. Furthermore, they can specify that their jobs need to be
run immediately. High priority jobs can preempt low priority jobs so that
computational resources are used as efficiently as possible.
- Scheduler: we provide a scheduling system for clusters which can either be
used in conjunction with Faucets or as a stand-alone scheduler.
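The bid selection described under Bartering can be sketched roughly as follows. This is a minimal illustration, not the Faucets implementation: the `QoS` and `Bid` fields, the rule that the "best" bid is the cheapest eligible one, and the `settle` function that debits the submitter's home cluster are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class QoS:
    """Hypothetical Quality of Service requirement attached to a job."""
    min_procs: int      # fewest processors the job can run on
    deadline_s: float   # latest acceptable completion time, in seconds
    max_price: float    # most bartering units the submitter will pay


@dataclass(frozen=True)
class Bid:
    """Hypothetical bid returned by a subscribing cluster."""
    cluster: str
    procs: int          # processors the cluster offers
    finish_s: float     # promised completion time, in seconds
    price: float        # bartering units the cluster asks for


def meets(bid: Bid, qos: QoS) -> bool:
    """A bid is eligible only if it satisfies every QoS criterion."""
    return (bid.procs >= qos.min_procs
            and bid.finish_s <= qos.deadline_s
            and bid.price <= qos.max_price)


def select_bid(bids: List[Bid], qos: QoS) -> Optional[Bid]:
    """Pick the cheapest eligible bid (assumed meaning of "best").

    Ties on price are broken by the earlier promised finish time.
    Returns None when no bid meets the requirement.
    """
    eligible = [b for b in bids if meets(b, qos)]
    if not eligible:
        return None
    return min(eligible, key=lambda b: (b.price, b.finish_s))


def settle(balances: Dict[str, float], home: str, bid: Bid) -> None:
    """After a successful run, credit the winning cluster.

    Debiting the submitter's home cluster is an assumption of this sketch.
    """
    balances[home] = balances.get(home, 0.0) - bid.price
    balances[bid.cluster] = balances.get(bid.cluster, 0.0) + bid.price


# Example with made-up clusters and numbers.
bids = [
    Bid("A", procs=64, finish_s=3600, price=50.0),
    Bid("B", procs=32, finish_s=1800, price=30.0),
    Bid("C", procs=64, finish_s=7200, price=10.0),  # misses the deadline
]
qos = QoS(min_procs=32, deadline_s=4000, max_price=60.0)

winner = select_bid(bids, qos)
balances = {"home": 100.0}
if winner is not None:
    settle(balances, "home", winner)
print(winner, balances)
```

Cluster C offers the lowest price but is filtered out because it cannot meet the deadline, which is why eligibility is checked before any comparison on price.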
Adaptive jobs should be written to use our version of MPI (called AMPI, for
Adaptive MPI), Charm++, or another supported language.
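To make the idea of shrinking and expanding jobs concrete, here is a small sketch of one possible allocation policy for malleable jobs. It is an illustration only and not necessarily the policy used by the Faucets scheduler: the `Job` fields, the `allocate` function, and the rule "give each admitted job its minimum, then hand out spare processors in priority order" are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Job:
    """Hypothetical malleable job: it can run on any count in [min, max]."""
    name: str
    min_procs: int
    max_procs: int
    priority: int = 0   # larger value means more urgent


def allocate(jobs: List[Job], total_procs: int) -> Dict[str, int]:
    """Return a processor count for every job that can run right now.

    Jobs are considered from highest to lowest priority. A job is admitted
    if its minimum still fits; jobs that do not fit are left out of the
    result (in a real system they would be queued or suspended).
    """
    ordered = sorted(jobs, key=lambda j: -j.priority)  # stable for ties
    alloc: Dict[str, int] = {}
    admitted: List[Job] = []
    free = total_procs

    # Pass 1: guarantee each admitted job its minimum.
    for job in ordered:
        if job.min_procs <= free:
            alloc[job.name] = job.min_procs
            free -= job.min_procs
            admitted.append(job)

    # Pass 2: expand admitted jobs toward their maximum with what is left.
    for job in admitted:
        extra = min(job.max_procs - alloc[job.name], free)
        alloc[job.name] += extra
        free -= extra

    return alloc


# Example on a made-up 64-processor machine.
a = Job("a", min_procs=16, max_procs=64, priority=0)
b = Job("b", min_procs=32, max_procs=48, priority=1)
c = Job("c", min_procs=32, max_procs=32, priority=2)

alone = allocate([a], 64)            # "a" expands to fill the machine
shared = allocate([a, b], 64)        # "a" shrinks when "b" arrives
urgent = allocate([a, b, c], 64)     # "a" no longer fits and is left out
print(alone, shared, urgent)
```

Re-running `allocate` whenever a job arrives or finishes is what lets a running job shrink to make room for a newcomer and expand again later; the third call also shows how a high-priority arrival can displace a low-priority job entirely.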
Technical Overview of the Faucets System
Cluster Scheduler Manual
- 04-09
L. V. Kale, Sameer Kumar, Jayant DeSouza, Mani Potnuru, and Sindhura Bandhakavi, Faucets: Efficient Resource Allocation on the Computational Grid, Proceedings of the 2004 International Conference on Parallel Processing (ICPP 2004), 15-18 August 2004, Montreal, Quebec, Canada.
- 03-14
Sindhura Bandhakavi, Analyzing Bidding Strategies For Schedulers In A Simulated Multiple-Cluster Market Driven Environment, Master's Thesis, Dept. of Computer Science, University of Illinois, 2003.
- 03-01
L. V. Kale, Sameer Kumar, Jayant DeSouza, Mani Potnuru, and Sindhura Bandhakavi, Faucets: Efficient Resource Allocation on the Computational Grid, PPL Technical Report 03-01, University of Illinois at Urbana-Champaign, Mar 2003.
- 02-01
L. V. Kale, Sameer Kumar, and Jayant DeSouza, A Malleable-Job System for Timeshared Parallel Machines, 2nd IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid 2002), May 21-24, 2002, Berlin, Germany.
- 01-06
Sameer Kumar, An Adaptive Job Scheduler for Timeshared Parallel Machines, Master's Thesis, Dept. of Computer Science, University of Illinois, 2001.
- 00-02
L. V. Kale, Sameer Kumar, and Jayant DeSouza, An Adaptive Job Scheduler for Timeshared Parallel Machines, PPL Technical Report 00-02, University of Illinois at Urbana-Champaign, Sep 2000.