I recently patched 'fsl_sub', which is part of FSL, so that it can batch-submit tasks to our compute clusters through SLURM. I would have sent the patch to the authors of the software, but I couldn't figure out where to send it, and I really didn't need to sign up for another mailing list. So here it is.

--- fsl_sub.orig        2010-06-11 13:03:35.279077000 +0100
+++ fsl_sub     2010-06-11 13:04:33.409821000 +0100
@@ -100,6 +100,10 @@
     fi
 fi

+if [ "x$SLURM_JOB_ID" != "x" ] ; then
+       METHOD=SLURM
+fi
+

 ###########################################################################
 # The following auto-decides what cluster queue to use. The calling
@@ -123,6 +127,11 @@
        queue=verylong.q
     fi
     #echo "Estimated time was $1 mins: queue name is $queue"
+
+    # if slurm environment is detected use the compute partition, change this to suit
+    if [ $METHOD = SLURM ] ; then
+           queue=compute
+    fi
 }


@@ -200,7 +209,7 @@
 # change. It also sets up the basic emailing control.
 ###########################################################################

-queue=long.q
+queue=compute
 mailto=`whoami`@fmrib.ox.ac.uk
 MailOpts="n"

@@ -364,6 +373,40 @@
        ;;

 ###########################################################################
+# SLURM method
+# this is a very naive way of doing things: it simply fires off all
+# the tasks individually to the resource manager
+###########################################################################
+
+       SLURM)
+               if [ $verbose -eq 1 ] ; then
+                       echo "Starting Slurm submissions..." >&2
+               fi
+               _SRMRAND=$RANDOM
+               _SRMNAME=$JobName$_SRMRAND
+               echo "========================" >> sbatch.log
+               echo "= Starting submissions =" >> sbatch.log
+               echo "========================" >> sbatch.log
+               date >> sbatch.log
+while read line
+do
+        if [ "x$line" != "x" ] ; then
+sbatch -J $_SRMNAME -o "slurm-log-$_SRMNAME-%j-%N.out" -t 01:00:00 -p $queue -n 1 <<EOF
+#!/bin/sh
+echo 
+echo \$SLURM_JOB_NAME
+echo \$SLURM_JOB_ID
+echo \$SLURM_JOB_NODELIST
+echo
+date
+echo
+$line
+EOF
+        fi
+done < $taskfile >> sbatch.log 2>&1
+       ;;
+
+###########################################################################
 # Don't change the following - this runs the commands directly if a
 # cluster is not being used.
 ###########################################################################

It's not the prettiest thing around. It's quick and dirty, and it spams the queue system pretty hard, but it has cut a job that used to take 5-6 days down to about 8-9 hours: it used to run on 1 CPU, and now it runs on 24-32 CPUs at a time.
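
If you want to poke at the submission loop outside of fsl_sub, here is a minimal sketch of the same idea as a pair of standalone shell functions. The names submit_tasks, submit_one and DRY_RUN are made up for illustration and are not part of fsl_sub or SLURM; the sbatch options are the ones used in the patch. In this sketch the SLURM_* variables are escaped so that they are expanded by the job itself rather than by the submitting shell, while the task line is expanded at submission time.

```shell
#!/bin/sh
# Sketch only: submit_tasks, submit_one and DRY_RUN are hypothetical names,
# not part of fsl_sub or SLURM.

# submit_one JOBNAME PARTITION
# Reads a job script on stdin and hands it to sbatch, or simply prints it
# when DRY_RUN is set to a non-empty value.
submit_one() {
    if [ -n "${DRY_RUN:-}" ]; then
        cat
    else
        sbatch -J "$1" -t 01:00:00 -p "$2" -n 1
    fi
}

# submit_tasks TASKFILE JOBNAME PARTITION
# Submits every non-empty line of TASKFILE as its own single-task job.
submit_tasks() {
    while IFS= read -r line
    do
        # Skip blank lines in the task file.
        [ -n "$line" ] || continue
        # \$VAR stays literal here and is expanded by the job's own shell;
        # $line is expanded now, by the submitting shell.
        submit_one "$2" "$3" <<EOF
#!/bin/sh
echo "\$SLURM_JOB_NAME \$SLURM_JOB_ID \$SLURM_JOB_NODELIST"
date
$line
EOF
    done < "$1"
}
```

Setting DRY_RUN=1 before calling submit_tasks with a task file, a job name and a partition prints the generated job scripts instead of submitting them, which is a cheap way to check what would hit the queue.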
