Installing and Testing MPICH

MPICH was selected as the initial communications software package for hrothgar. (Others, such as PVM, might be installed in the future, depending on user demand.) The overall implementation strategy generally follows that described in the Caltech/CACR Beowulf Tutorial, with all the MPICH software installed on a partition of the Master Node which is NFS-mountable by the other nodes.


Obtaining The MPICH Source

The Argonne MPICH site provides the MPICH source via an FTP link. The download includes a gzipped tar file, a README, and other indispensable documentation.

In our case, things were even easier, as the CACR version of the RedHat 5.1 CDROM already included the full MPICH contents from ANL. The relevant files extracted from the CDROM were:

mpich-1.1.0.tar.gz    The distribution source
mpich-install.ps      The MPICH Installation Manual
mpich-userguide.ps    The MPICH User Guide

The installation and user guides were extracted and printed as the obvious first step.


Installing and Configuring MPICH

Combining advice from both the MPICH and CACR Beowulf documentation, the installation procedure for MPICH was as follows:

Put The Source In An Appropriate Place

Since /scratch00 is mounted as /home on all hrothgar nodes, this seemed like an obvious place to put things.

After logging into hrothgar as root and making sure that the CDROM drive was mounted (and contained the RedHat CD), the commands were simply:

cd /home
cp /mnt/cdrom/cacr/comm-libs/mpich-1.1.0.tar.gz .
gunzip -c mpich-1.1.0.tar.gz | tar xvf -
This creates a full MPICH distribution tree in /home/mpich.

Configuring and Making MPICH

From section 4 of the MPICH Installation Guide, an adequate configuration and make sequence is simply:
cd /home/mpich
./configure -device=ch_p4 -arch=LINUX >& config.log
make >& make.log
Configuration took maybe a minute and the entire make was done in under ten minutes. No errors (and only a few apparently inconsequential warnings) were generated in either the configure or the make.
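
Since both steps redirect their output to log files, a quick scan of those logs is one way to confirm the "no errors, few warnings" impression. The following is only a rough sketch, not part of the original procedure: the helper name count_matches is hypothetical, and the text patterns are a crude filter that can also match harmless lines (for example, configure checks that mention "error" in a function name).

```shell
#!/bin/sh
# count_matches PATTERN FILE
# Print the number of lines in FILE that contain PATTERN (case-insensitive).
# grep -c exits non-zero when the count is 0, so the status is ignored.
count_matches() {
    grep -ci -- "$1" "$2" || true
}

# Summarize whichever of the two logs exist in the current directory.
for log in config.log make.log; do
    if [ -f "$log" ]; then
        echo "$log: $(count_matches 'warning' "$log") warning line(s)," \
             "$(count_matches 'error' "$log") error line(s)"
    fi
done
```

Run from /home/mpich after the build, this prints one summary line per log; any non-zero error count is a cue to read the log itself rather than a verdict.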

The machines.LINUX File

MPICH expects to find a list of available machines in the file
../mpich/util/machines/machines.LINUX
The initial form of this file on hrothgar was as follows:
dan01
dan02
dan03
dan00
dan01
dan02
dan03
dan00

Important Remarks:

  1. The simple machine enumeration in machines.LINUX will not work unless the .rhosts and /etc/hosts modifications noted in the Remote Shell Permissions section of the System-Level Configurations document have been made on all nodes.
  2. The "multiple listings" of danNM in the machines.LINUX file allow MPICH runs with more than four "nodes", using multiple processes per PC. The machines.LINUX file must have at least five entries in order to exercise the performance tests included in the MPICH distribution.
  3. Note that the "main" PC, dan00, is listed last within machines.LINUX. This is done so that processes are first assigned to the otherwise idle "secondary PCs" (some of the main PC resources will be used in login sessions, compilations, etc.).
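
The layout described in remarks 2 and 3 (secondary PCs first, main PC last, whole list repeated) can be generated rather than typed by hand. The sketch below is illustrative only: the function name make_machines and the output file name machines.LINUX.new are assumptions, and the result would still need to be copied into place by hand.

```shell
#!/bin/sh
# make_machines REPEATS MASTER SECONDARY...
# Print a machines list, one host per line: all secondary hosts followed by
# the master, with that whole group repeated REPEATS times.
make_machines() {
    repeats=$1
    master=$2
    shift 2
    i=0
    while [ "$i" -lt "$repeats" ]; do
        for host in "$@"; do
            echo "$host"
        done
        echo "$master"
        i=$((i + 1))
    done
}

# Reproduce the hrothgar layout (four PCs, listed twice) in a local file.
make_machines 2 dan00 dan01 dan02 dan03 > machines.LINUX.new
```

Raising the repeat count gives more entries for runs with more processes per PC, while keeping dan00 at the end of each group.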

Making MPI User-Accessible

At this point, MPICH is up and running on the system, and the last task is to make it accessible to users. An easy method adopted for hrothgar was to modify /etc/profile so that MPI compilation and execution commands were in the standard PATH. The relevant new lines from the modified file are as follows:
PATH="$PATH:/usr/X11R6/bin:/scratch00/mpich/bin"
PATH="$PATH:/scratch00/mpich/lib/LINUX/ch_p4"
so that, for example, a user could compile and execute a simple program on three processors with the command sequence:
mpicc -c my_prog.c
mpicc -o MY_PROG my_prog.o
mpirun -np 3 MY_PROG
(Curiously, the on-line help for mpicc suggests that compilation and linking not be combined in a single command.)
With these steps, MPICH was up and running on hrothgar. The installation was tested using the mpich/examples/basic/cpi program and the more extensive test suite in mpich/examples/test.
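
For a first check from an ordinary user account, something smaller than the distributed examples may be convenient. The sketch below is an assumption-laden illustration, not the test actually used on hrothgar: it writes a minimal my_prog.c (a hypothetical "hello" program using only standard MPI calls) and then runs the compile, link, and run sequence shown earlier, but only if mpicc and mpirun are found in PATH.

```shell
#!/bin/sh
# Write a minimal MPI program: each process reports its rank and the
# total number of processes.
cat > my_prog.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile and link in separate steps, then run on three processes.
if command -v mpicc >/dev/null 2>&1 && command -v mpirun >/dev/null 2>&1; then
    { mpicc -c my_prog.c &&
      mpicc -o MY_PROG my_prog.o &&
      mpirun -np 3 MY_PROG; } || echo "compile or run failed" >&2
else
    echo "mpicc/mpirun not found in PATH; check the /etc/profile changes" >&2
fi
```

If the PATH changes are in effect, each of the three processes should print one line; the order of the lines is not guaranteed.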


Random Remarks

  1. The Argonne National Laboratory MPI Home Page is a great place to start with any questions about MPI.
  2. In practice, of course, the actual steps taken in bringing up hrothgar were rather more awkward than suggested above. In particular, the terse response "permission denied" became all too familiar until the proper forms for .rhosts and /etc/hosts were determined.
  3. In this user's opinion, the MPICH source and documentation package is excellent. Especially when compared with the switch wiring fiasco, the installation of MPICH was a dream.