Faucets schedules jobs on a first-come, first-served basis until all nodes are allocated. As nodes become available, the scheduler favors queued jobs belonging to users with no currently running jobs. Research into fairer resource allocation schemes that maintain good utilization is ongoing. Jobs with runtimes longer than 24 hours are limited to one third of the cluster.
A rough approximation of cluster usage can be found at Cluster Viewer.
fsub [program_name] [options] [arguments]

A typical single-processor batch submission line on the architecture cluster would be:
ufsub mysimscript.sh -stdout myoutput.out -time "4:0:0"
program_name is the name of your compiled executable or script; you should NOT provide an mpirun or a charmrun invocation on the command line.
All jobs should be submitted with a predicted completion time, as this facilitates more efficient scheduling of resources. If you do not supply a time, the scheduler assumes 12 hours. Running jobs are terminated when their allotted time expires.
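As a concrete illustration, a batch script such as the mysimscript.sh above might look like the following minimal sketch. The simulation step is a placeholder (there is no real mysim binary here); substitute your actual compiled executable.

```shell
#!/bin/sh
# Minimal sketch of a batch script like mysimscript.sh.
# The middle line is a placeholder -- replace the echo with
# your real simulation command, e.g. ./mysim input.dat
echo "Job started on $(hostname) at $(date)"
echo "simulation running"
echo "Job finished at $(date)"
```

The script itself contains no mpirun or charmrun line; the scheduler handles job launch.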
Options
If a processor range is provided for the job, the number of processors allocated to the job varies between those bounds.
The default email address used is userid@arch-gw.cs.uiuc.edu.
frun [program_name] [options] [arguments]
frun runs the job interactively on the parallel machine (the cool cluster).
program_name is the name of your compiled executable or script; you should NOT provide an mpirun or a charmrun invocation on the command line.
Options
If a processor range is provided for the job, the number of processors allocated to the job varies between those bounds.
The default email address used is userid@cool2.cs.uiuc.edu.
For example:

frun +n2 +ppn2 ./hello -time 1:0:0

runs the Charm++ program hello interactively on 2 nodes and 4 processors for 1 hour.
ufrun /bin/ls

runs /bin/ls on one processor; ufrun should be used for single-processor jobs.
frun +n2 +ppn2 ./hello_mpi -time 1:0:0
runs the MPI program hello_mpi interactively on 2 nodes and 4 processors for 1 hour.
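The -time values in these examples are read as hours:minutes:seconds (so "4:0:0" is 4 hours and "1:0:0" is 1 hour). Under that assumption, a quick shell snippet to sanity-check a predicted runtime in seconds:

```shell
# Convert an H:M:S -time string to total seconds (format assumed
# from the examples above; adjust if your value differs).
t="4:0:0"
IFS=: read h m s <<EOF
$t
EOF
echo $(( h*3600 + m*60 + s ))   # prints 14400
```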
fkill [job_id]

kills the job with the given job id.
The architecture cluster occasionally suffers from transient network faults that make job launching difficult; TSG is studying the problem. While the scheduler filters out the faulty nodes, it may appear unresponsive (i.e., hung). Workarounds to improve scheduler response time under these conditions are under development.