SGE to SLURM conversion
The Sun Grid Engine (SGE) and SLURM job schedulers share many concepts. The following lists common SGE commands, environment variables, and job specification flags alongside their SLURM equivalents.
User Commands | SGE | SLURM |
---|---|---|
Interactive login | qlogin | srun --pty bash OR srun -p [partition] --time=4:0:0 --pty bash |
Job submission | qsub [script_file] | sbatch [script_file] |
Job deletion | qdel [job_id] | scancel [job_id] |
Job status by job | qstat -u \* [-j job_id] | squeue -j [job_id] |
Job status by user | qstat [-u user_name] | squeue -u [user_name] |
Job hold | qhold [job_id] | scontrol hold [job_id] |
Job release | qrls [job_id] | scontrol release [job_id] |
Queue list | qconf -sql | scontrol show partition |
List nodes | qhost | sinfo -N OR scontrol show nodes |
Cluster status | qhost -q | sinfo |

Environment Variables | SGE | SLURM |
---|---|---|
Job ID | $JOB_ID | $SLURM_JOBID |
Submit directory | $SGE_O_WORKDIR | $SLURM_SUBMIT_DIR |
Submit host | $SGE_O_HOST | $SLURM_SUBMIT_HOST |
Node list | $PE_HOSTFILE | $SLURM_JOB_NODELIST |
Job Array Index | $SGE_TASK_ID | $SLURM_ARRAY_TASK_ID |

Job Specification | SGE | SLURM |
---|---|---|
Script directive | #$ | #SBATCH |
Queue | -q [queue] | -p [queue] |
Node count | N/A | -N [min[-max]] |
CPU count | -pe [PE] [count] | -n [count] |
Wall clock limit | -l h_rt=[seconds] | -t [min] OR -t [days-hh:mm:ss] |
Standard out file | -o [file_name] | -o [file_name] |
Standard error file | -e [file_name] | -e [file_name] |
Combine STDOUT & STDERR files | -j yes | (use -o without -e) |
Copy environment | -V | --export=[ALL \| NONE \| variables] |
Event notification | -m abe | --mail-type=[events] |
Notification email address | -M [address] | --mail-user=[address] |
Job name | -N [name] | --job-name=[name] |
Restart job | -r [yes\|no] | --requeue OR --no-requeue (NOTE: |
Set working directory | -wd [directory] | --chdir=[dir_name] (--workdir in older SLURM releases) |
Resource sharing | -l exclusive | --exclusive OR --oversubscribe (formerly --share) |
Memory size | -l mem_free=[memory][K\|M\|G] | --mem=[mem][M\|G\|T] OR --mem-per-cpu= |
Charge to an account | -A [account] | --account=[account] |
Tasks per node | (Fixed allocation_rule in PE) | --ntasks-per-node=[count] |
CPUs per task | | --cpus-per-task=[count] |
Job dependency | -hold_jid [job_id \| job_name] | --dependency=[state:job_id] |
Job project | -P [name] | --wckey=[name] |
Job host preference | -q [queue]@[node] OR -q | --nodelist=[nodes] AND/OR --exclude= |
Job arrays | -t [array_spec] | --array=[array_spec] |
Generic resources | -l [resource]=[value] | --gres=[resource_spec] |
Licenses | -l [license]=[count] | --licenses=[license_spec] |
Begin time | -a [YYMMDDhhmm] | --begin=YYYY-MM-DD[THH:MM[:SS]] |
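
The wall clock row above is a common stumbling block when converting scripts: SGE's `h_rt` is often given in seconds, while a bare number passed to SLURM's `-t` is read as minutes. The following is a minimal sketch of converting a seconds value into SLURM's `days-hh:mm:ss` form. The function name `seconds_to_slurm_time` is a hypothetical helper and is not part of either scheduler.

```bash
#!/bin/bash
# Convert an SGE h_rt value in seconds into SLURM's days-hh:mm:ss format.
# seconds_to_slurm_time is a hypothetical helper name used for illustration.
seconds_to_slurm_time() {
    local total=$1
    local days=$(( total / 86400 ))
    local hours=$(( (total % 86400) / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$minutes" "$seconds"
}

# Example: "-l h_rt=18000" (5 hours) becomes "-t 0-05:00:00".
seconds_to_slurm_time 18000
```

The printed value can be pasted after `#SBATCH -t` when rewriting a script by hand.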

SGE for a single-core application:

    #!/bin/bash
    #
    #$ -N test
    #$ -j y
    #$ -o test.output
    #$ -cwd
    #$ -M $USER@stanford.edu
    #$ -m bea
    # Request 5 hours run time
    #$ -l h_rt=5:0:0
    #$ -P your_project_id_here
    #$ -l mem=4G
    #
    <load modules, call your app here>

SLURM for a single-core application:

    #!/bin/bash
    #
    #SBATCH -J test
    #SBATCH -o test.%j.out
    #SBATCH -e test.%j.err
    # No -cwd equivalent is needed: running in the submit directory is the default in SLURM
    # Replace $USER with your user name; variables are not expanded in #SBATCH lines
    #SBATCH --mail-user=$USER@cumc.columbia.edu
    #SBATCH --mail-type=ALL
    # Request 5 hours run time
    #SBATCH -t 5:0:0
    # Request 4 GB of memory (--mem is in megabytes by default)
    #SBATCH --mem=4000
    #SBATCH -p normal
    <load modules, call your app here>
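
During a migration, the body of a job script may need to run under either scheduler. As a rough sketch, the environment variables listed earlier can be read with fallbacks so that the same code works in both cases. The function name `job_info` and the fallback values (`local`, `$PWD`, `1`) are assumptions made for illustration and apply when the script runs outside any scheduler.

```bash
#!/bin/bash
# Sketch: read job metadata from whichever scheduler launched the script.
# job_info is a hypothetical helper; the fallbacks ("local", $PWD, 1) are
# assumptions for running the script outside any scheduler.
job_info() {
    # Prefer the SLURM variable, then the SGE one, then the fallback.
    local job_id="${SLURM_JOBID:-${JOB_ID:-local}}"
    local submit_dir="${SLURM_SUBMIT_DIR:-${SGE_O_WORKDIR:-$PWD}}"
    local task_id="${SLURM_ARRAY_TASK_ID:-${SGE_TASK_ID:-1}}"
    # SGE sets SGE_TASK_ID to the string "undefined" for non-array jobs.
    if [ "$task_id" = "undefined" ]; then
        task_id=1
    fi
    echo "job=$job_id task=$task_id dir=$submit_dir"
}

job_info
```

Because `#$` and `#SBATCH` lines are plain comments to bash, directives for both schedulers can sit in the same file above a body like this one.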