
Achieving 100% Availability in the ERAM Air Traffic Control System

Published: 05 April 2023

Abstract

Fault tolerance is a key requirement for En Route Automation Modernization (ERAM), the FAA's system that manages en route air traffic over the USA. A system failure could lead to hundreds of flights being delayed or cancelled. Based on experience from earlier systems, a set of techniques was built into ERAM at inception, including a hot standby copy of each executable and checkpointing of the latest state to disk files. As the system matured through formal testing and operational experience at the first sites (2010-2015), the goal of 100% availability was not achieved, so additional techniques were added. These included exception safety, runaway process protection, and proactive monitoring of the system to detect defects and often resolve them without the air traffic controllers being aware. With these additional techniques in place, the FAA has measured ERAM as 100% available since October 2016 at all 20 operational sites. Software fault tolerance techniques have been well documented [2]; this extended abstract describes the specific techniques that have led to ERAM achieving continuous 24x7 availability for 6 years.
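
As a rough illustration of how checkpointed state lets a standby take over after a software fault, the following Python sketch may help. It is not ERAM's implementation: the names (Worker, write_checkpoint, read_checkpoint, run_with_standby), the JSON checkpoint format, and the "poison" message are all hypothetical. It is also a simplification, since the replacement worker is created at failover time rather than running as a true hot standby, and everything happens in a single process.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Atomically persist state: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the file in one step, so a reader never sees a partial write.
    os.replace(tmp_path, path)


def read_checkpoint(path):
    """Return the last checkpointed state, or an initial state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"processed": 0}


class Worker:
    """Hypothetical worker that counts the messages it has processed."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path
        self.state = read_checkpoint(checkpoint_path)

    def handle(self, message):
        if message == "poison":
            raise ValueError("simulated software fault")
        self.state["processed"] += 1
        write_checkpoint(self.checkpoint_path, self.state)


def run_with_standby(messages, checkpoint_path):
    """Process messages; on a fault, replace the active worker with a standby."""
    active = Worker(checkpoint_path)
    failovers = 0
    for message in messages:
        try:
            active.handle(message)
        except Exception:
            # The replacement resumes from the last checkpoint; the faulty
            # input is dropped rather than retried.
            active = Worker(checkpoint_path)
            failovers += 1
    return active.state["processed"], failovers
```

In this sketch the checkpoint is written only after a message has been handled successfully, so a replacement worker never inherits state from a half-completed update.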

References

  1. A good description of exception safety is in https://en.wikipedia.org/wiki/Exception_safety
  2. See the section on Process Pairs in https://ntrs.nasa.gov/api/citations/20000120144/downloads/20000120144.pdf
  3. Our implementation of process pairs followed the work of Dr. Flaviu Cristian. See the section on failure masking in server groups in http://csis.pace.edu/~marchese/CS865/Papers/cristian93understanding.pdf
  4. See for example http://www.ganssle.com/blog/blog/on-nversion-programming.html
  5. Fault-Tolerance in the Advanced Automation System (Cristian, Dancey, Dehn).
  6. See the route (item 15) description in https://flightcrewguide.com/wiki/rules-regulations/flight-plan/


  • Published in

    ACM SIGAda Ada Letters, Volume 42, Issue 2
    December 2022
    87 pages
    ISSN: 1094-3641
    DOI: 10.1145/3591335

    Copyright © 2023 Copyright is held by the owner/author(s)

    Publisher

    Association for Computing Machinery

    New York, NY, United States