
sandialabs/Fenix

 ************************************************************************


            _|_|_|_|  _|_|_|_|  _|      _|  _|_|_|  _|      _|
            _|        _|        _|_|    _|    _|      _|  _|
            _|_|_|    _|_|_|    _|  _|  _|    _|        _|
            _|        _|        _|    _|_|    _|      _|  _|
            _|        _|_|_|_|  _|      _|  _|_|_|  _|      _|


 ************************************************************************

About

Fenix is a software library, compatible with the Message Passing Interface (MPI), that supports fault recovery without application shutdown. Fenix has three components: process, data, and message recovery. Process recovery repairs communicators whose ranks have failed, as detected by the MPI runtime. Data recovery is an optional feature that can be used to implement a high-performance in-memory checkpoint/restart mechanism. Message recovery is another optional feature that can be used to save and replay message logs to enable localized fault tolerance.

See our documentation.

Installation

These instructions assume you are in your home directory.

  1. Check out the Fenix sources.
    • For example: git clone <address of this repo> && cd Fenix
  2. Create a build directory.
  3. Specify the MPI C compiler to use. Open MPI 5 or later is required.
    • Check out the CMake documentation for the best information on how to do this, but in general:
      • Set the CC environment variable to the correct mpicc,
      • Invoke cmake with -DCMAKE_C_COMPILER=mpicc,
      • Add the mpi install directory to CMAKE_PREFIX_PATH.
    • If you experience segmentation faults during simple MPI function calls, this is often caused by accidentally building against multiple versions of MPI. See the FENIX_SYSTEM_INC_FIX CMake option for a potential fix.
  4. Run cmake ../ -DCMAKE_INSTALL_PREFIX=... && make install
  5. Optionally, add the install prefix to your CMAKE_PREFIX_PATH environment variable, to enable find_package(fenix) in your other projects.
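
The steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the variable names FENIX_REPO_URL, MPI_PREFIX, and INSTALL_PREFIX, the default paths under $HOME/opt, and the function name build_fenix are all hypothetical and not part of Fenix. The script does nothing unless FENIX_REPO_URL is set to the address of this repo.

```shell
#!/bin/sh
# Sketch of the installation steps. All variable names and default paths
# below are assumptions for illustration; adjust them to your system.
set -e

FENIX_REPO_URL="${FENIX_REPO_URL:-}"                  # address of this repo (you must supply it)
MPI_PREFIX="${MPI_PREFIX:-$HOME/opt/openmpi}"         # hypothetical Open MPI 5+ install directory
INSTALL_PREFIX="${INSTALL_PREFIX:-$HOME/opt/fenix}"   # hypothetical Fenix install prefix

build_fenix() {
    # Step 1: check out the Fenix sources into the home directory.
    git clone "$FENIX_REPO_URL" "$HOME/Fenix"
    cd "$HOME/Fenix"

    # Step 2: create a build directory.
    mkdir -p build
    cd build

    # Steps 3 and 4: select the MPI C compiler through CC, add the MPI
    # install directory to CMAKE_PREFIX_PATH, then build and install.
    CC="$MPI_PREFIX/bin/mpicc" cmake ../ \
        -DCMAKE_PREFIX_PATH="$MPI_PREFIX" \
        -DCMAKE_INSTALL_PREFIX="$INSTALL_PREFIX"
    make install

    # Step 5 (optional): let find_package(fenix) locate the install.
    # This only persists if the script is sourced rather than executed.
    export CMAKE_PREFIX_PATH="$INSTALL_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
}

# Run the build only when a repository address was supplied.
if [ -n "$FENIX_REPO_URL" ]; then
    build_fenix
fi
```

Setting CC and CMAKE_PREFIX_PATH to the same MPI installation is one way to reduce the chance of building against multiple versions of MPI. If segmentation faults still appear in simple MPI calls, the FENIX_SYSTEM_INC_FIX CMake option mentioned above is the next thing to try.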
// ************************************************************************
//
// Copyright (C) 2016 Rutgers University and Sandia Corporation
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// 1. Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
//
// 3. Neither the name of the Corporation nor the names of the
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY RUTGERS UNIVERSITY AND SANDIA 
// CORPORATION "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, 
// BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND 
// FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL 
// RUTGERS UNIVERSITY, SANDIA CORPORATION OR THE CONTRIBUTORS BE LIABLE 
// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 
// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 
// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY 
// WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY 
// OF SUCH DAMAGE.
//
// Authors Marc Gamell, Matthew Whitlock, Eric Valenzuela, Keita Teranishi, Manish Parashar
//        and Michael Heroux
//
// Questions? Contact Matthew Whitlock ([email protected])
// ************************************************************************

About

Fenix: A Portable, Flexible Fault Tolerance Programming Framework for MPI Applications
