Table of contents:
- Open MPI terminology
- How do I get a copy of the most recent source code?
- Ok, I got a Subversion checkout. Now how do I build it?
- What is the main tree layout of the Open MPI source tree? Are
there directory name conventions?
- Is there more information available?
- More coming...
| 1. Open MPI terminology |
Open MPI is a large project containing many different
sub-systems and a relatively large code base. Let's first cover some
fundamental terminology in order to make the rest of the discussion
easier.
Open MPI has three sections of code:
- OMPI: The MPI API and supporting logic
- ORTE: The Open Run-Time Environment (support for different
back-end run-time systems)
- OPAL: The Open Portable Access Layer (utility and "glue" code
used by OMPI and ORTE)
There are strict abstraction barriers in the code between these
sections. That is, they are compiled into three separate libraries:
libmpi, liborte, and libopal with a strict dependency order:
OMPI depends on ORTE and OPAL, and ORTE depends on OPAL. More
specifically, OMPI executables are linked with:
shell$ mpicc myapp.c -o myapp
# This actually turns into:
shell$ cc myapp.c -o myapp -lmpi -lopen-rte -lopen-pal ...
|
More system-level libraries may be listed after -lopen-pal, but you get the
idea.
Strictly speaking, these are not "layers" in the classic software
engineering sense (even though it is convenient to refer to them as
such). They are listed above in dependency order, but that does not
mean that, for example, the OMPI code must go through the ORTE and
OPAL code in order to reach the operating system or a network
interface.
As such, this code organization reflects abstractions and software
engineering more than a strict hierarchy of functions that must be
traversed in order to reach a lower layer. For example, OMPI can call
OPAL functions directly -- it does not have to go through ORTE.
Indeed, OPAL has a different set of purposes than ORTE, so it wouldn't
even make sense to channel all OPAL access through ORTE. OMPI can
also directly call the operating system as necessary. For example,
many top-level MPI API functions are quite performance sensitive; it
would not make sense to force them to traverse an arbitrarily deep
call stack just to move some bytes across a network.
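The dependency order described above can be pictured with a small C sketch. Everything in it is hypothetical: the sketch_opal_*, sketch_orte_*, and sketch_ompi_* functions are stand-ins invented for illustration and are not real Open MPI symbols. The point is only the call graph: the "ORTE" function uses "OPAL", while the "OMPI" functions may use either one directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three code sections; none of these
 * names exist in Open MPI. */

/* "OPAL": utility code that depends on neither of the other sections.
 * Copies src into dst, truncating if needed; returns bytes copied. */
size_t sketch_opal_copy(char *dst, size_t dst_len, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_len == 0) {
        return 0;
    }
    size_t n = strlen(src);
    if (n >= dst_len) {
        n = dst_len - 1;
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* "ORTE": run-time support code; it depends on OPAL only. */
size_t sketch_orte_describe_job(char *dst, size_t dst_len)
{
    return sketch_opal_copy(dst, dst_len, "job-0");
}

/* "OMPI": start-up style code that goes through ORTE. */
size_t sketch_ompi_init(char *dst, size_t dst_len)
{
    return sketch_orte_describe_job(dst, dst_len);
}

/* "OMPI": performance-sensitive code that calls OPAL directly,
 * with no detour through ORTE. */
size_t sketch_ompi_send(char *wire, size_t wire_len, const char *payload)
{
    return sketch_opal_copy(wire, wire_len, payload);
}
```

Note that nothing forces sketch_ompi_send() through the "ORTE" function; the dependency order only says which sections are allowed to use which.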
Here's a list of terms that are frequently used in discussions about
the Open MPI code base:
- MCA: The Modular Component Architecture (MCA) is the foundation
upon which the entire Open MPI project is built. It provides all the
component architecture services that the rest of the system uses.
Although it is the fundamental heart of the system, its
implementation is actually quite small and lightweight -- it is
nothing like CORBA, COM, JINI, or many other well-known component
architectures. It was designed for HPC -- meaning that it is small,
fast, and reasonably efficient -- and therefore offers few services
other than finding, loading, and unloading components.
- Framework: An MCA framework is a construct that is created
for a single, targeted purpose. It provides a public interface that
is used by external code, but it also has its own internal services. A
list of Open MPI frameworks is available here. An MCA
framework uses the MCA's services to find and load components at run
time -- implementations of the framework's interface. An easy example
framework to discuss is the MPI framework named "btl", or the Byte
Transfer Layer. It is used to send and receive data on different
kinds of networks. Hence, Open MPI has btl components for shared
memory, TCP, InfiniBand, Myrinet, etc.
- Component: An MCA component is an implementation of a
framework's interface. Another common word for component is
"plugin." It is a standalone collection of code that can be bundled
into a plugin that can be inserted into the Open MPI code base
at run-time and/or compile-time.
- Module: An MCA module is an instance of a component (in the
C++ sense of the word "instance"; an MCA component is analogous to a
C++ class). For example, if a node running an Open MPI application has
multiple Ethernet NICs, the Open MPI application will contain one TCP
btl component, but two TCP btl modules. This difference between
components and modules is important because modules have private state;
components do not.
Frameworks, components, and modules can be dynamic or static. That is,
they can be available as plugins or they may be compiled statically
into libraries (e.g., libmpi).
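To make the component/module distinction concrete, here is a minimal C sketch. It is an assumption-laden illustration, not Open MPI's actual MCA data structures: the sketch_module_t and sketch_component_t types, the create_module function pointer, and sketch_send() are all hypothetical names. It shows one stateless "tcp" component for the "btl" framework producing separate modules, each with its own private state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; these are NOT the real
 * MCA structures. */

/* A module is an instance of a component and carries private state. */
typedef struct sketch_module {
    char   nic_name[16];  /* which NIC this instance drives */
    size_t bytes_sent;    /* per-instance counter */
} sketch_module_t;

/* A component implements a framework's interface and has no
 * per-instance state of its own. */
typedef struct sketch_component {
    const char *framework_name;
    const char *component_name;
    sketch_module_t *(*create_module)(const char *nic_name);
} sketch_component_t;

/* Create one module (instance) for the given NIC; caller frees it. */
sketch_module_t *sketch_tcp_create_module(const char *nic_name)
{
    assert(nic_name != NULL);
    sketch_module_t *m = calloc(1, sizeof(*m));
    if (m != NULL) {
        snprintf(m->nic_name, sizeof(m->nic_name), "%s", nic_name);
    }
    return m;
}

/* The single "tcp" component of the "btl" framework in this sketch. */
const sketch_component_t sketch_tcp_component = {
    .framework_name = "btl",
    .component_name = "tcp",
    .create_module  = sketch_tcp_create_module,
};

/* Record a send on one module; only that module's state changes. */
void sketch_send(sketch_module_t *module, size_t nbytes)
{
    assert(module != NULL);
    module->bytes_sent += nbytes;
}
```

On a node with two NICs, a caller would invoke sketch_tcp_component.create_module() twice (say, with the made-up names "eth0" and "eth1"). Sending through one of the resulting modules leaves the other module's bytes_sent counter untouched, which is the "private state" difference noted above.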
| 2. How do I get a copy of the most recent source code? |
See the instructions here.
| 3. Ok, I got a Subversion checkout. Now how do I build it? |
See the instructions here.
| 4. What is the main tree layout of the Open MPI source tree? Are
there directory name conventions? |
There are a few notable top-level directories in the source
tree:
- config/: M4 scripts supporting the top-level
configure script
- etc/: Some miscellaneous text files
- include/: Top-level include files that will be installed (e.g.,
mpi.h)
- ompi/: The Open MPI code base
- orte/: The Open RTE code base
- opal/: The OPAL code base
Each of the three main source directories (ompi/, orte/, and
opal/) generates a top-level library, named libmpi, liborte, and
libopal, respectively. They can be built as either static or shared
libraries. Executables are also produced in subdirectories of some of
the trees.
Each of the sub-project source directories has a similar (but not
identical) directory structure under it:
- class/: C++-like "classes" (using the OPAL class system)
specific to this project
- include/: Top-level include files specific to this project
- mca/: MCA frameworks and components specific to this project
- runtime/: Startup and shutdown of this project at runtime
- tools/: Executables specific to this project (currently none in
OPAL)
- util/: Random utility code
There are other top-level directories in each of the three
sub-projects, each having to do with specific logic and code for that
project. For example, the MPI API implementations can be found under
ompi/mpi/LANGUAGE, where
LANGUAGE is one of c, cxx, f77, or f90.
The layout of the mca/ trees is strictly defined. They are of the
form:
<project>/mca/<framework name>/<component name>/
|
To be explicit: it is forbidden to have a directory under the mca
trees that does not meet this template (with the exception of base
directories, explained below). Hence, only framework and component
code can be in the mca/ trees.
That is, framework and component names must be valid directory names
(and C variables; more on that later). For example, the TCP BTL
component is located in the following directory:
The name base is reserved; there cannot be a framework or component
named "base." Directories named base are reserved for the
implementation of the MCA and frameworks. Here are a few examples:
# Main implementation of the MCA
opal/mca/base
# Implementation of the paffinity framework
opal/mca/paffinity/base
# Implementation of the pls framework
orte/mca/pls/base
# Implementation of the pml framework
ompi/mca/pml/base
|
Under these mandated directories, frameworks and/or components may have
arbitrary directory structures, however.
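The naming rules above can be summarized in a short C sketch. It is illustrative only and is not part of the Open MPI build system: the function names are hypothetical, and the sketch assumes that "valid C variable" means a full C identifier (a letter or underscore followed by letters, digits, or underscores). It builds a directory name from the <project>/mca/<framework name>/<component name>/ template and rejects the reserved name "base".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers; none of these exist in Open MPI. */

/* Return 1 if name could serve as both a directory name and a C
 * identifier (an assumption about what the naming rule requires). */
int sketch_is_valid_mca_name(const char *name)
{
    if (name == NULL || name[0] == '\0') {
        return 0;
    }
    if (!(isalpha((unsigned char)name[0]) || name[0] == '_')) {
        return 0;
    }
    for (const char *p = name + 1; *p != '\0'; ++p) {
        if (!(isalnum((unsigned char)*p) || *p == '_')) {
            return 0;
        }
    }
    return 1;
}

/* Return 1 if project is one of the three sub-project directories. */
int sketch_is_known_project(const char *project)
{
    return project != NULL &&
           (strcmp(project, "ompi") == 0 ||
            strcmp(project, "orte") == 0 ||
            strcmp(project, "opal") == 0);
}

/* Write "<project>/mca/<framework>/<component>/" into buf.
 * Returns 0 on success, or -1 if a name breaks the rules or the
 * result does not fit in buf. */
int sketch_component_dir(char *buf, size_t len, const char *project,
                         const char *framework, const char *component)
{
    assert(buf != NULL);
    if (!sketch_is_known_project(project) ||
        !sketch_is_valid_mca_name(framework) ||
        !sketch_is_valid_mca_name(component)) {
        return -1;
    }
    /* "base" is reserved for the MCA and framework implementations. */
    if (strcmp(framework, "base") == 0 || strcmp(component, "base") == 0) {
        return -1;
    }
    int n = snprintf(buf, len, "%s/mca/%s/%s/", project, framework, component);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With these helpers, a made-up component called "example" in the paffinity framework would map to a directory under opal/mca/paffinity/, while a component called "base" would be refused because that name is reserved.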
| 5. Is there more information available? |
Yes. In early 2006, Cisco hosted an Open MPI workshop where
the Open MPI Team provided several days of intensive
dive-into-the-code tutorials. The slides from these tutorials are available here.
Additionally, an OpenRTE (ORTE) workshop was held for similar purposes
in late 2006. The slides from the ORTE workshop are available here.
There are more questions / answers coming... stay tuned...