I recently patched 'fsl_sub', which is part of FSL, to allow it to batch-submit tasks to our computer clusters. I would have submitted the patch to the authors of the software, but I couldn't figure out where to send it, and I really didn't need to sign up to another mailing list. So here it is.
--- fsl_sub.orig 2010-06-11 13:03:35.279077000 +0100
+++ fsl_sub 2010-06-11 13:04:33.409821000 +0100
@@ -100,6 +100,10 @@
fi
fi
+if [ "x$SLURM_JOB_ID" != "x" ] ; then
+ METHOD=SLURM
+fi
+
###########################################################################
# The following auto-decides what cluster queue to use. The calling
@@ -123,6 +127,11 @@
queue=verylong.q
fi
#echo "Estimated time was $1 mins: queue name is $queue"
+
+ # if slurm environment is detected use the compute partition, change this to suit
+ if [ "x$METHOD" = "xSLURM" ] ; then
+ queue=compute
+ fi
}
@@ -200,7 +209,7 @@
# change. It also sets up the basic emailing control.
###########################################################################
-queue=long.q
+queue=compute
mailto=`whoami`@fmrib.ox.ac.uk
MailOpts="n"
@@ -364,6 +373,40 @@
;;
###########################################################################
+# SLURM method
+# this is a very naive way of doing things: it simply fires off all
+# the tasks individually to the resource manager
+###########################################################################
+
+ SLURM)
+ if [ $verbose -eq 1 ] ; then
+ echo "Starting Slurm submissions..." >&2
+ fi
+ _SRMRAND=$RANDOM
+ _SRMNAME=$JobName$_SRMRAND
+ echo "========================" >> sbatch.log
+ echo "= Starting submissions =" >> sbatch.log
+ echo "========================" >> sbatch.log
+ date >> sbatch.log
+while read line
+do
+ if [ "x$line" != "x" ] ; then
+sbatch -J $_SRMNAME -o "slurm-log-$_SRMNAME-%j-%N.out" -t 01:00:00 -p $queue -n 1 <<EOF
+#!/bin/sh
+echo
+echo \$SLURM_JOB_NAME
+echo \$SLURM_JOB_ID
+echo \$SLURM_JOB_NODELIST
+echo
+date
+echo
+$line
+EOF
+ fi
+done < "$taskfile" >> sbatch.log 2>&1
+ ;;
+
+###########################################################################
# Don't change the following - this runs the commands directly if a
# cluster is not being used.
###########################################################################
It's not the prettiest thing around: it's quick and dirty, and it spams the queue system pretty heavily. But it has cut a job that used to take 5-6 days down to about 8-9 hours. That is, it used to run on 1 CPU; now it runs on 24-32 CPUs at a time.
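A note on the here-document in the SLURM branch: because the EOF delimiter is unquoted, the submitting shell expands any unescaped $VAR before sbatch ever sees the script, so only variables written as \$VAR are left for the job itself to expand. Since the patch only activates when SLURM_JOB_ID is already set, an unescaped $SLURM_JOB_ID would carry the parent job's ID into every submitted task. The sketch below illustrates this, using cat as a stand-in for sbatch; the task line and the job ID value are made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-ins, not taken from fsl_sub:
#   line         - one task as it would be read from the task file
#   SLURM_JOB_ID - pretend we are already inside parent job 1234
line="echo task-1"
SLURM_JOB_ID=1234

# cat replaces sbatch here so the generated job script can be inspected
# instead of submitted.
generated=$(cat <<EOF
echo \$SLURM_JOB_ID
echo $SLURM_JOB_ID
$line
EOF
)

# Line 1 keeps the variable for the job to expand later,
# line 2 has already been replaced by the parent's job ID,
# line 3 is the task command itself.
printf '%s\n' "$generated"
```

The escaped form is the one to use for anything the job should report about itself, while $line has to stay unescaped so that the task command is written into the script.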