Document Type

Article

Publication Date

May 2015

Publication Title

Reliability Engineering and System Safety

Abstract

Software plays a central role in military systems. It is also an important factor in many recent incidents and accidents. A safety gap is growing between our software-intensive technological capabilities and our understanding of the ways they can fail or lead to accidents. Traditional forms of accident investigation are poorly equipped to trace the sources of software failure, for instance software does not age in the same way that hardware components fail over time. As such, it can be hard to trace the causes of software failure or mechanisms by which it contributed to accidents back into the development and procurement chain to address the deeper, systemic causes of potential accidents. To identify some of these failure mechanisms, we examined the database of the Air Force Accident Investigation Board (AIB) and analyzed mishaps in which software was involved. Although we have chosen to focus on military aviation, many of the insights also apply to civil aviation. Our analysis led to several results and recommendations. Some were specific and related for example to specific shortcomings in the testing and validation of particular avionic subsystems. Others were broader in scope: for instance, we challenged both the investigation process (aspects of) and the findings in several cases, and we provided recommendations, technical and organizational, for improvements. We also identified important safety blind spots in the investigations with respect to software, whose contribution to the escalation of the adverse events was often neglected in the accident reports. These blind spots, we argued, constitute an important missed learning opportunity for improving accident prevention, and it is especially unfortunate at a time when Remotely Piloted Air Systems (RPAS) are being integrated into the National Airspace. Our findings support the growing recognition that the traditional notion of software failure as non-compliance with requirements is too limited to capture the diversity of roles that software plays in military and civil aviation accidents. The identification of several specific mechanisms by which software contributes to accidents can help populate a library of patterns and triggers of software contributions to adverse events, a library which in turn can be used to help guide better software development, better coding, and better testing to avoid or eliminate these particular patterns and triggers. Finally, we strongly argue for the examination of software’s causal role in accident investigations, the inclusion of a section on the subject in the accident reports, and the participation of software experts in accident investigations.

Comments

This is an Author's Accepted Manuscript of an article whose final and definitive form, the Version of Record, has been published in Reliability Engineering and System Safety, 2015 in Volume 137. Find the published version of this article at this link.
SJSU users: use the following link to login and access the article via SJSU databases.

DOI

10.1016/j.ress.2015.01.006

Share

COinS