We used to use Maui and Torque for all our resource management needs at work. Then we had a brief encounter with Sun Grid Engine as well as SLURM. It mostly worked okay, until we decided to introduce GOLD into the mix for various reasons (funding agencies and accountability). GOLD provides the banking functionality to limit and account for hours spent on projects.

We managed to get SLURM, Maui and GOLD to work together to provide us with a relatively reliable resource management system with accounting/banking. Quite frankly, Maui's documentation is pretty poor, as is GOLD's. On top of that, GOLD tended to be overly complex, slow and clunky. This is probably because the code base is quite old, and it was and still is pretty much the go-to tool for a lot of people doing accounting and banking at HPC centres.

We knew that SLURM had some capabilities for banking and accounting, but it lacked a few features needed to make it a viable alternative to our existing Maui, SLURM and GOLD setup.

After a few emails with the SLURM developers and some experimenting, we came to the conclusion that all that was missing was a few helper scripts and utilities to provide a simple banking system for SLURM.

The end result was slurm-bank, which is just a collection of shell and Perl scripts that wrap some of the existing SLURM functionality, along with a description/design of how SLURM should be configured to provide GOLD-like functionality.
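As a rough illustration of the kind of configuration this relies on, here is a sketch of the relevant slurm.conf settings. It is not copied from slurm-bank itself, and it assumes a site that already runs slurmdbd for accounting; the values are illustrative only.

```conf
# Sketch only: assumes accounting goes through slurmdbd (site-specific).
AccountingStorageType=accounting_storage/slurmdbd

# Enforce associations and their limits, so a job is refused
# once its account has used up its allocation.
AccountingStorageEnforce=associations,limits

# Multifactor priority plugin, which tracks decayed usage per association.
PriorityType=priority/multifactor

# Assumption: with no decay and no periodic reset, recorded usage only
# ever grows, so a CPU-minutes limit behaves like a fixed bank balance.
PriorityDecayHalfLife=0
PriorityUsageResetPeriod=NONE
```

With something like this in place, an account's budget can be expressed as a GrpCPUMins limit set through sacctmgr. The account names and the size of each limit are up to the site.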

The scripts are pretty simple and mostly dumb, but they work well enough at my current workplace, TCHPC. There's a bunch of wanted features that will probably take more time to flesh out, and there will probably be a need for a rewrite at some point when the ideas are more tested and developed. For now we are able to just run SLURM (without Maui and GOLD) at our site. Hopefully the SLURM developers will take on board some of the ideas that we have mashed up and scripted up in a really bad and dodgy way.

To get the code:

git clone git@github.com:jcftang/slurm-bank.git

There are tags in the repo that you can check out; the tags are relatively stable and reliable, as is the stable branch.

