June 2019

Volume 34 Number 6

[Editor's Note]

A Failure of Process

By Michael Desmond | June 2019

Last month I wrote about the twin crashes of Boeing’s new 737 MAX airliner and the automated Maneuvering Characteristics Augmentation System (MCAS) that precipitated both events (“Flight of Failure,” msdn.com/magazine/mt833436). The investigations that followed point toward significant flaws in MCAS, which repeatedly commanded the aircraft nose down at low altitude as pilots struggled to diagnose the problem.

Serious questions remain about MCAS and its behavior in the 737 MAX. For instance, MCAS was able to command 2.5 degrees of travel per increment on the rear stabilizer, yet the documentation submitted to the FAA showed a value of just 0.6 degrees. Likewise, MCAS was designed to activate based on data from a single angle of attack (AoA) sensor mounted on the fuselage of the plane. Yet, any system deemed a potential threat to controlled flight must activate based on multiple sensors.

What’s concerning is that the integration of software into aircraft systems is strictly managed under guidelines such as DO-178B/DO-178C (“Software Considerations in Airborne Systems and Equipment Certification”) and ARP-4754 (“Guidelines for Development of Civil Aircraft and Systems”). Both impose rigorous documentation and review requirements on manufacturers with an emphasis on safety. DO-178 focuses on software development across planning, development, verification, configuration management and quality assurance. ARP-4754 is a system-level process that addresses the integration of software and hardware systems and subsystems.

I spoke with a longtime aviation engineer. He says the layered levels of documentation, testing, review and validation defined in DO-178 and ARP-4754 should ensure that a subsystem like MCAS is implemented in a way that minimizes risk.

“It’s all phases and gates,” he says of the process. “There are checks and balances forward and backward. If a test fails we can go back up the line and see what the requirement was and see if the code was written badly or if the test case was designed badly. It’s a very prescriptive, methodical process when we do this.”

ARP-4754 informs the DO-178 process to determine the level of rigor required for software development around aircraft systems like MCAS. Failure hazard analysis performed under ARP-4754 enables engineers to define failure rates and risk factors for aircraft systems. This data is then used to calculate the Development Assurance Level (DAL) of systems like MCAS in DO-178. The five-point scale, from Catastrophic to No Effect, classifies the severity of impact should a system fail in flight, and thus sets the required level of robustness in the code driving that system.
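
To make the five-point scale concrete, here is a minimal sketch of how the failure-condition categories line up with software levels A through E in DO-178. The category names and letters follow the standard's published levels; the Python names (FailureCondition, DAL_BY_CONDITION, assurance_level) are hypothetical and exist only for illustration. Nothing here reflects how any manufacturer actually tools this process.

```python
from enum import Enum


class FailureCondition(Enum):
    """Severity categories, ordered from worst to most benign."""
    CATASTROPHIC = "Catastrophic"
    HAZARDOUS = "Hazardous"
    MAJOR = "Major"
    MINOR = "Minor"
    NO_EFFECT = "No Effect"


# Illustrative lookup: the worse the failure condition, the more
# rigorous the software level (A is the most demanding, E the least).
DAL_BY_CONDITION = {
    FailureCondition.CATASTROPHIC: "A",
    FailureCondition.HAZARDOUS: "B",
    FailureCondition.MAJOR: "C",
    FailureCondition.MINOR: "D",
    FailureCondition.NO_EFFECT: "E",
}


def assurance_level(condition: FailureCondition) -> str:
    """Return the software level letter for a failure condition."""
    return DAL_BY_CONDITION[condition]


if __name__ == "__main__":
    for condition in FailureCondition:
        print(f"{condition.value}: Level {assurance_level(condition)}")
```

The point of the lookup is that the classification comes first: if the hazard analysis understates the failure condition, everything downstream inherits a lower level of rigor.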

So what of the disparity between the documented and actual increment of the rear stabilizer? Perhaps field testing showed that MCAS required more authority. However, changing those specs should have kicked off another round of evaluation and testing to confirm MCAS behavior and failure states during flight, and mandated redundant sensor input to protect against inadvertent activation.

One of the emerging lessons of the 737 MAX catastrophe is that a process is only as good as the people and the institutions that execute it. Once undermined, it can lead to dangerous outcomes.


Michael Desmond is the Editor-in-Chief of MSDN Magazine.

