Submitting a Batch Job
Batch jobs are submitted using the command:
[depietri@albert PBS]$ qsub -q [queue] [batchscript.sh]
or, if a number of nodes different from the default is requested, using the command:
[depietri@albert PBS]$ qsub -q [queue] -l nodes=[nnodes:queue] [batchscript.sh]
Here:
[queue] is one of the five configured queues n1, n8, n16, n64 or n88;
[batchscript.sh] is the name of the script file to be submitted;
[nnodes:queue] is the number of processors that should be allocated for the batch job.
The number of processors to be allocated may be declared in the [batchscript.sh] script file; otherwise the queue's default number of nodes is allocated. On albert100 the default number of allocated processors is: 1 for queue n1, 8 for queue n8, 16 for queue n16, 64 for queue n64 and 88 for queue n88.
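The queue defaults listed above can be captured in a small helper, shown here as a sketch. The function names default_procs and qsub_cmd and the script name job.sh are illustrative and not part of the albert100 setup. qsub_cmd only prints the command line it would run, so the sketch can be tried on a machine without PBS.

```shell
#!/bin/sh
# Sketch: map an albert100 queue name to its default processor count.
# default_procs and qsub_cmd are hypothetical helper names.
default_procs() {
    case "$1" in
        n1)  echo 1 ;;
        n8)  echo 8 ;;
        n16) echo 16 ;;
        n64) echo 64 ;;
        n88) echo 88 ;;
        *)   echo "unknown queue: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the qsub command line for a queue and a script.
qsub_cmd() {
    default_procs "$1" > /dev/null || return 1
    echo "qsub -q $1 $2"
}

qsub_cmd n16 job.sh
echo "n16 allocates $(default_procs n16) processors by default"
```

Rejecting unknown queue names before calling qsub gives an immediate local error instead of a rejection from the batch server.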
Main commands:
qstat -q (shows the configured queues)
qsub jobscript (submit the job to the default queue)
qsub -q <qname> jobscript (submit the job to the queue <qname>)
qstat -a (shows the submitted jobs)
qdel jobid (cancels the job with the specified jobid)
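A typical session combines these commands as sketched below. The sketch assumes that qsub prints the new job identifier on standard output in the usual PBS form sequence-number.server. The queue n8, the script name job.sh, the server name in the comment and the helper job_seq are placeholders. The PBS commands are guarded, so on a machine without PBS the sketch only prints a notice.

```shell
#!/bin/sh
# Hypothetical helper: strip the server suffix from a PBS job identifier,
# e.g. "1234.albert" -> "1234".
job_seq() {
    echo "${1%%.*}"
}

if command -v qsub > /dev/null 2>&1; then
    jobid=$(qsub -q n8 job.sh)      # submit and remember the job identifier
    echo "submitted job $(job_seq "$jobid")"
    qstat -a                        # list the submitted jobs
    qdel "$jobid"                   # cancel the job again
else
    echo "qsub not found: PBS commands skipped"
fi
```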
Example of a single node Job Script
#PBS -S /bin/sh
#PBS -m ae
#PBS -M depietri@albert.pr.infn.it
## -------------------- PREAMBLE -----------------------------------
export RUN_HOST=`hostname`
n=`wc -l < $PBS_NODEFILE`
cd $PBS_O_WORKDIR
## -------------------- END PREAMBLE --------------------------------
echo "================================================================================"
echo "."
echo ". Albert100 single processor BATCH JOB"
echo ". ------------------------------------"
echo "."
echo ". Job Running on queue: $PBS_QUEUE"
echo "."
echo ". Running on HOST: $RUN_HOST"
echo ". Allocated processor: $n"
echo ". Working Dir: `pwd`"
echo "."
echo "================================================================================"
echo .
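The preamble line n=`wc -l < $PBS_NODEFILE` counts the allocated processors, on the assumption that PBS writes one line per allocated processor into the file named by $PBS_NODEFILE. The sketch below imitates this outside PBS with a temporary file. The host names node01 and node02 are invented.

```shell
#!/bin/sh
# Build a stand-in for the file PBS would provide (host names are invented).
PBS_NODEFILE=$(mktemp)
printf '%s\n' node01 node01 node02 node02 > "$PBS_NODEFILE"

# Same counting idiom as in the job script: one line per processor.
n=$(wc -l < "$PBS_NODEFILE")
n=$((n))    # normalise: some wc implementations pad the count with blanks

echo "Allocated processor: $n"
rm -f "$PBS_NODEFILE"
```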
Example of a MPICH/Cactus Job Script
#PBS -S /bin/sh
#PBS -m ae
#PBS -M depietri@albert.pr.infn.it
## -------------------- PREAMBLE -----------------------------------
export MFILE="$PBS_O_WORKDIR/mfile.$PBS_QUEUE.$PBS_JOBID"
export MPIRUN="/opt/mpich-1.2.0-smp/bin/mpirun"
export RUN_HOST=`hostname`
n=`wc -l < $PBS_NODEFILE`
cd $PBS_O_WORKDIR
/opt/bin/machinefile2smp < $PBS_NODEFILE > $MFILE
## -------------------- END PREAMBLE --------------------------------
echo "================================================================================"
echo "."
echo ". Albert100 mpich 1.2.0 BATCH JOB"
echo ". -------------------------------"
echo "."
echo ". Job Running on queue: $PBS_QUEUE"
echo "."
echo ". Root host: $RUN_HOST"
echo ". Running on: $n processors"
echo ". Working Dir: `pwd`"
echo ". Using mpirun: $MPIRUN"
echo ". -machinefile: $MFILE"
echo "--------------------------------------------------------------------------------"
cat $MFILE
echo "================================================================================"
echo .
$MPIRUN -np $n -machinefile $MFILE ./BenchADMsmp BenchADM.par
\rm $MFILE
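/opt/bin/machinefile2smp is a site-specific tool and its exact output is not documented here. A plausible equivalent, assuming it collapses the repeated host names from $PBS_NODEFILE into machine file entries of the form host:count, could look like the sketch below. The function name nodefile_to_smp and the host names are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for machinefile2smp (an assumption, not the real tool):
# read one host name per line on stdin and print "host:count" per distinct host.
nodefile_to_smp() {
    sort | uniq -c | awk '{ print $2 ":" $1 }'
}

# Five processor slots spread over two invented hosts.
printf '%s\n' node01 node02 node01 node02 node02 | nodefile_to_smp
```

Sorting first is needed because uniq -c only merges adjacent identical lines.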
Example of a LAM/Cactus Job Script
#PBS -S /bin/sh
#PBS -m ae
#PBS -M depietri@albert.pr.infn.it
## -------------------- PREAMBLE --------------------------------------------------
##
## Here: We set as working directory the one where the qsub command has been launched.
##       We generate the machine file for the requested nodes
##       The appropriate mpirun command is set
##
source /opt/bin/LAM-6.5.6.sh
export MFILE="$PBS_O_WORKDIR/mfile.$PBS_QUEUE.$PBS_JOBID"
export MPIRUN="/opt/lam-6.5.6-usysv.1/bin/mpirun"
export RUN_HOST=`hostname`
n=`wc -l < $PBS_NODEFILE`
cd $PBS_O_WORKDIR
/opt/bin/machinefile2lam < $PBS_NODEFILE > $MFILE
## -------------------- END PREAMBLE ------------------------------------------------
echo "================================================================================"
echo "."
echo ". Albert100 LAM 6.5.6 BATCH JOB"
echo ". -----------------------------"
echo "."
echo ". Job Running on queue: $PBS_QUEUE"
echo "."
echo ". Root host: $RUN_HOST"
echo ". Running on: $n processors"
echo ". Working Dir: `pwd`"
echo ". Using mpirun: $MPIRUN"
echo ". -machinefile: $MFILE"
echo "--------------------------------------------------------------------------------"
cat $MFILE
echo "================================================================================"
echo .
lamboot -v $MFILE
$MPIRUN -np $n ./BenchADM-lam BenchADM.par
lamhalt
# ---------------------------------------------------------
# We can now delete the temporarily generated machine file
# ---------------------------------------------------------
\rm $MFILE
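In the script above the machine file is removed only if execution reaches the last line. One possible variant, not part of the original script, registers the cleanup with trap so that it also runs when an earlier step fails. The sketch below demonstrates the idea in a subshell, with a temporary file standing in for $MFILE and an invented machine file entry.

```shell
#!/bin/sh
# Stand-in for the machine file created in the preamble.
MFILE=$(mktemp)

(
    # Remove the machine file when this subshell exits, whatever the reason.
    trap 'rm -f "$MFILE"' EXIT
    echo "node01 cpu=2" > "$MFILE"      # invented entry, for illustration only
    false || echo "simulated mpirun failure"
)

if [ -e "$MFILE" ]; then
    echo "machine file left behind"
else
    echo "machine file removed"
fi
```

In a real job script the same trap line could be placed right after MFILE is exported, replacing the final \rm.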