Having helped build a largish InfiniBand cluster a few years ago (it was big at the time) and admined it for a few years, I think I've learnt what not to do when building, setting up or admining such a machine. This is just a rant; perhaps I should also compile a list of the software and hardware that I like for building and running a high performance computer. There are probably many things that I have missed, and you may not agree with my opinions.

Some of the do-not-dos when building, procuring or admining a cluster:

  • Try to avoid tier-two or tier-three vendors for medium to large machines.
  • Try to avoid non-standard Linux distros that deviate too far from RHEL, SLES or Debian-based distros. If you are on a small budget, CentOS or Scientific Linux is a good RHEL substitute. Pick a distro that will have decent support for the hardware and software you intend to run.
  • Try to pick tools and helper applications for running the infrastructure and the cluster that are standard and open source where possible. Don't rely too heavily on vendor tools (ones the vendor has written itself). E.g. why use a commercial job scheduler when you can use a free one whose source code you have access to, and which you can install on different clusters to give your users a uniform interface?
  • Try not to edit files by hand and then copy them out to your compute nodes. It's better to script the changes or package them up so you can roll them out systematically and automatically.
  • Survey the types of applications you intend to run, and get sample datasets and job runs for benchmarking. You should try to benchmark your codes on the machine that the vendor is trying to sell you. It's usually a bad idea to listen to vendors raving about how great performance is going to be. When requesting benchmarks, be reasonable about the job sizes that will get tested, otherwise the vendors will probably ignore you.
  • Sometimes it's also good to ignore your own users' requests. E.g. your users might request more RAM in a node than they actually need; sometimes it's better to make them scale out to more nodes, which forces them to develop more scalable codes or to pick codes that scale better.
  • Use a centralised user management system such as LDAP, and not NIS, which doesn't scale.
  • Make sure the boot and install process for your compute nodes is scalable and repeatable. In theory, network booting and doing an auto install or check on every boot is a good idea, but in practice it doesn't scale unless you have a few boot/install servers running on your network. If you have more than 100 nodes in your system and you can only boot 10 nodes at a time, it can get quite annoying.
  • When procuring a new machine, try to avoid activating licenses for software that you have purchased until the last possible moment, when you are sure that the hardware is functioning within expected parameters.
  • Vendors like simulating hardware failures, network failures and power outages to test redundancy (if you have requested that certain parts of your machine be redundant). Let the vendor do their thing, but don't trust them: physically go in and unplug network cables, or pull out power cables on machines that have a UPS attached, and make sure things fail over or shut down correctly.
  • Make sure you have access to some sort of serial login or virtual console on your compute nodes; that is, make sure IPMI or something similar is available. Otherwise you will be making lots of trips to the machine room to turn things on and off.
  • If your cluster is going to be sufficiently big, try to get a second login node and replicate some of the system services there. This makes management of the cluster much nicer if you need to test things out or reboot the head nodes for security updates etc., or if you want users to compile things on one node only and not the other.
  • Try to budget for support for the storage or file system that you are going to get (especially if you plan on running a parallel file system). Or else budget for tape backup for disaster recovery. The last thing you want is to lose user data that might have taken months to generate because of a file system or storage problem.
  • Try to get a commercial compiler or two for your hardware. GCC is good, but it doesn't always generate the best possible code for the hardware you are running on.
  • Sometimes vendors write software for monitoring compute nodes, and sometimes this software gets too big in terms of memory and CPU usage. Sometimes it's just a good idea to ignore it and use more lightweight apps to do the same job.
  • Don't listen to the sales guy and buy the biggest and fastest machine they have; it's just not worth paying 10-20% more for a less than 10% performance increase. Look at the price vs. performance curve and decide.
  • Whilst GPUs and offload cards are great for certain applications, vendors will sometimes try to sell you some regardless. Don't believe all of the marketing or sales hype. If you can test them out then do so, and if your applications take advantage of them then great. If not, don't spend too much time thinking about it or discussing it with the vendor.
  • Don't upgrade the software stack for the sake of upgrading. When your cluster works, just put a freeze on things. You don't want to break a complex and working machine, assuming it is working.
  • Use modules, or at least adopt the methodology of installing multiple versions of software into shared and versioned directories (see GNU Stow for an example of what I mean).
  • No matter how much documentation you write for your cluster/queuing system, users will almost never customise the example scripts you give them. So make sure you give them good examples, with scripts that output lots of useful debugging info from the queuing system.
  • Make sure the racks you get have wheels!!! Or else make sure your vendor will deal with all the hardware installation at your site etc.
  • Don't assume that your machine room has sufficient cooling and power. Check and make sure your site meets the minimum requirements.
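
The price vs. performance point above comes down to simple arithmetic, so here is a rough sketch of how I would sanity-check a quote. Everything in it is made up for illustration: the option names, the prices and the benchmark scores are hypothetical, and the "performance" number is assumed to come from your own benchmark runs, not from a vendor slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NodeOption:
    name: str
    price: float        # quoted price per node, any currency
    performance: float  # your own benchmark score, higher is better


def perf_per_price(option: NodeOption) -> float:
    """Benchmark score obtained per unit of money spent."""
    return option.performance / option.price


def marginal_gain(base: NodeOption, upgrade: NodeOption) -> Tuple[float, float]:
    """Return (extra cost, extra performance) of the upgrade as fractions of the base."""
    extra_cost = upgrade.price / base.price - 1.0
    extra_perf = upgrade.performance / base.performance - 1.0
    return extra_cost, extra_perf


def worth_upgrading(base: NodeOption, upgrade: NodeOption) -> bool:
    """True only if performance grows at least as fast as price."""
    extra_cost, extra_perf = marginal_gain(base, upgrade)
    return extra_perf >= extra_cost


# Hypothetical quotes: the top-bin part costs 15% more for 8% more speed.
mid = NodeOption("mid-bin", price=1000.0, performance=100.0)
top = NodeOption("top-bin", price=1150.0, performance=108.0)

cost, perf = marginal_gain(mid, top)
print(f"{top.name}: +{cost:.0%} cost for +{perf:.0%} performance")
print("worth it" if worth_upgrading(mid, top) else "not worth it")
```

The "at least as fast as price" rule is just one possible cut-off. If your budget is fixed, comparing perf_per_price across all the options and buying more of the cheaper nodes may tell you more than any single upgrade comparison.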