Someone recently asked me to help them test a computational code called GAMESS. Unfortunately, the code is distributed in binary form only, and there were many different versions for different architectures and MPI implementations. The kicker was that we spent half an hour trying to get it to start up in parallel, only to find that it kept spawning separate single processes on different nodes. The reason was that it was a 32-bit blob, and our MPI libraries were all 64-bit!
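In hindsight, checking the word size of the binary first would have saved that half hour. Running `file` on the executable normally reports it. The sketch below does the same check by hand, reading the class byte from the ELF header. The function name elf_class is my own, and it assumes a Linux ELF binary and a POSIX od:

```shell
# Report whether an ELF file is 32-bit or 64-bit.
# Byte 5 of the ELF header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
# elf_class is a hypothetical helper name, not part of any package.
elf_class() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not ELF"
        return 1
    fi
    # Skip 4 bytes, read 1 byte as an unsigned decimal number.
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Usage (the path is a placeholder):
#   elf_class /path/to/the/binary
```

Running it on both the application binary and the MPI shared libraries shows a mismatch straight away.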
I ended up having to build a 32-bit MPI implementation on the 64-bit machines. The following command was used to configure Open MPI 1.2.8:
./configure --prefix=/misc/shared/apps/openmpi/gcc/32/1.2.8 \
CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 \
--with-wrapper-cflags=-m32 --with-wrapper-cxxflags=-m32 \
--with-wrapper-fflags=-m32 --with-wrapper-fcflags=-m32
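Once the build is installed, it is worth confirming that the wrapper compilers really pass -m32 along. A rough sketch, assuming the wrappers accept Open MPI's --showme option (which prints the underlying compiler command line); has_m32 is a helper name I made up for illustration:

```shell
# Succeed if the given command line contains -m32 as a separate word.
# has_m32 is a hypothetical helper, not an Open MPI tool.
has_m32() {
    case " $1 " in
        *" -m32 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage, assuming the new install's bin directory is first in PATH:
#   has_m32 "$(mpicc --showme)" && echo "mpicc builds 32-bit"
```

The same check can be repeated for mpicxx, mpif77 and mpif90 to cover all four wrapper flags set above.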
This tip was found at http://www.open-mpi.org/community/lists/users/2008/08/6337.php and http://www.open-mpi.org/community/lists/users/2006/04/0992.php