Bug 5525 - Add support for OSC mpiexec (not MPICH2 command!) to PBS adapter
Component: wsrf scheduler interface
Version: 4.0.4
Platform: PC Linux
Importance: P3 enhancement
Target Milestone: ---
Assigned To:
Reported: 2007-09-02 07:43 by
Modified: 2012-09-05 11:43



Description From 2007-09-02 07:43:28
First of all, I am not referring to the mpiexec command of MPICH2 here. If you
enter 'mpiexec' in Google, the following page comes up first, and this is what
we need: http://www.osc.edu/~pw/mpiexec/index.php

mpiexec is the recommended/comfortable way to run MPI jobs within PBS/TORQUE.
It implements many MPI-version-specific protocols for starting MPI tasks (making
it easy to switch from one MPI implementation to another) and supports
correct accounting of CPU time for parallel jobs. Its mode of use is that you
submit a PBS job, specifying your multi-node requirements as usual, and then
within the job script you run mpiexec as shown below:

/opt/mpiexec/bin/mpiexec -np 4 -comm mpich-p4 /path/to/your/mpiapp

To switch from MPICH/Gigabit Ethernet to MVAPICH/InfiniBand, only the -comm
argument changes:

/opt/mpiexec/bin/mpiexec -np 4 -comm mpich-ib /path/to/your/mpiapp

This is a lot simpler than the rsh/mpirun approach used by the current (4.0.5)
PBS adapter. There is no machinefile to care about. Also, you don't need to
enable password-less rsh access from the front machine to the execution nodes
or between nodes, provided that they share a file system with the front machine
(as is required by GRAM's file staging anyway). Finally, the interface exposed
by mpiexec (as can be seen above) is much simpler than the mpirun/rsh
interface, which would simplify the implementation of the PBS adapter.
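
The mode of use described above can be sketched as a small generator for the
job script. This is only an illustration in Python, not code from pbs.pm (which
is Perl): the function name build_job_script, its parameters, and the
"#PBS -l nodes=...:ppn=..." resource request are assumptions, while the mpiexec
path and the -np/-comm arguments are taken from the examples above.

```python
def build_job_script(nodes, ppn, comm, app, mpiexec="/opt/mpiexec/bin/mpiexec"):
    """Return the text of a PBS job script that starts `app` via OSC mpiexec.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive")
    np = nodes * ppn  # assumption: one MPI task per requested processor
    lines = [
        "#!/bin/sh",
        # Multi-node requirement, specified the usual PBS/TORQUE way.
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        # No machinefile and no rsh setup appear anywhere in the script.
        f"{mpiexec} -np {np} -comm {comm} {app}",
    ]
    return "\n".join(lines) + "\n"


script = build_job_script(2, 2, "mpich-p4", "/path/to/your/mpiapp")
print(script)
```

Switching the interconnect would then be a one-argument change, for example
build_job_script(2, 2, "mpich-ib", "/path/to/your/mpiapp").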
------- Comment #1 From 2008-01-30 07:48:31 -------
This request for enhancement should probably be generalized. The main current
weakness is that the mpirun mechanism in pbs.pm is not configurable enough. In
addition to the already mentioned desirable "out-of-the-box" support for OSC's
'mpiexec', there should be a way to choose among multiple MPI versions
installed at the particular site - without the need to hack pbs.pm or to create
a custom jobmanager script based on pbs.pm.

Having multiple MPI versions is not unrealistic today: in our scenario the
application code has to be built with a supported commercial compiler and the
(Open)MPI library providing the runtime support must be compiled with the same
compiler. However, other users of the same site may wish to compile their
programs with gcc and accordingly use the gcc-compiled version of MPI.
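
One way to picture the configurability asked for here is a site-level table of
MPI installations from which each job selects one by name. The sketch below is
in Python rather than Perl and is not taken from pbs.pm; the table MPI_FLAVORS,
its keys, the Open MPI install paths, and the function launch_command are all
hypothetical.

```python
import shlex

# Hypothetical site configuration: one entry per installed MPI build.
# Keys, paths, and argument templates are made up for illustration.
MPI_FLAVORS = {
    "osc-mpiexec-p4": ["/opt/mpiexec/bin/mpiexec", "-np", "{np}", "-comm", "mpich-p4"],
    "osc-mpiexec-ib": ["/opt/mpiexec/bin/mpiexec", "-np", "{np}", "-comm", "mpich-ib"],
    "openmpi-gcc": ["/opt/openmpi-gcc/bin/mpirun", "-np", "{np}"],
    "openmpi-commercial": ["/opt/openmpi-commercial/bin/mpirun", "-np", "{np}"],
}


def launch_command(flavor, np, app, flavors=MPI_FLAVORS):
    """Build the launch line for `app` using the MPI build named `flavor`."""
    try:
        template = flavors[flavor]
    except KeyError:
        raise ValueError(
            f"unknown MPI flavor {flavor!r}; known: {sorted(flavors)}"
        ) from None
    # Fill in the task count, append the application, and quote for the shell.
    argv = [part.format(np=np) for part in template] + [app]
    return " ".join(shlex.quote(arg) for arg in argv)


print(launch_command("openmpi-gcc", 8, "/path/to/your/mpiapp"))
```

With a table like this, adding another compiler-specific MPI build would mean
adding one entry to the site configuration rather than editing the adapter.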
------- Comment #2 From 2012-09-05 11:43:37 -------
Doing some bugzilla cleanup...  Resolving old GRAM3 and GRAM4 issues that are
no longer relevant since we've moved on to GRAM5.  Also, we're now tracking
issues in JIRA.  Any new issues should be added here: