On Black Swans, Failures-by-Design, and Safety of Automated Driving Systems

Co-authored with Prof. Shai Shalev-Shwartz, CTO of Mobileye, an Intel company

Back in 2014, the Society of Automotive Engineers (SAE) set out six levels of vehicle automation in a document known as J3016, which today is widely accepted as the standard across the industry¹. The classification is based on the division of the driving task between the automatic system and the human driver, rather than on safety requirements. At a time in which Level 3 systems are being introduced to the market, it is imperative that we specify the minimal requirements for a system design which ensures safety when emergency maneuvers might be required. The rapid evolution of self-driving technologies necessitates agreed-upon minimum requirements to avoid putting the entire industry in peril. We would like to propose a methodology for setting those minimal requirements.

We will start with some basic technical definitions and terminology taken from SAE J3016. The operation of driving a car on public roads is referred to as Dynamic Driving Task (DDT) which consists of the perception system (understanding the surrounding environment of the vehicle), a driving policy system (deciding what actions to take next) and vehicle control. The next useful term is Operational Design Domain (ODD) which establishes the envelope of system operation — the scenarios in which the vehicle is able to operate. The ODD may contain any number of usage constraints from the type of roads on which the system can operate (highway, rural, arterial, urban), the kind of maneuvers permitted (straight through junction, protected turns, unprotected turns, etc.), the permitted speed (do not exceed 60km/h for example), limiting constraints by other road users (requirement to have a lead vehicle, for example), weather conditions, and so forth.
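To make the notion of an ODD as an "envelope" concrete, here is a minimal sketch of how such constraints could be written down as data. The class name, field names and string labels are our own illustrative assumptions; J3016 does not prescribe any particular encoding.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class OperationalDesignDomain:
    """Hypothetical container for ODD constraints (names are illustrative)."""
    road_types: FrozenSet[str]                # e.g. {"highway"}
    maneuvers: FrozenSet[str]                 # e.g. {"stay_in_lane"}
    max_speed_kmh: Optional[float] = None     # None means no speed cap
    requires_lead_vehicle: bool = False
    weather: FrozenSet[str] = frozenset({"clear"})

    def permits(self, road_type: str, maneuver: str, speed_kmh: float,
                has_lead_vehicle: bool, weather: str) -> bool:
        """Return True if the described situation lies inside the ODD."""
        if road_type not in self.road_types:
            return False
        if maneuver not in self.maneuvers:
            return False
        if self.max_speed_kmh is not None and speed_kmh > self.max_speed_kmh:
            return False
        if self.requires_lead_vehicle and not has_lead_vehicle:
            return False
        return weather in self.weather


# An assumed, deliberately narrow ODD: stay in lane on highways, up to 60 km/h,
# with a lead vehicle, in clear weather.
narrow_odd = OperationalDesignDomain(
    road_types=frozenset({"highway"}),
    maneuvers=frozenset({"stay_in_lane"}),
    max_speed_kmh=60.0,
    requires_lead_vehicle=True,
)

print(narrow_odd.permits("highway", "stay_in_lane", 50.0, True, "clear"))
print(narrow_odd.permits("highway", "lane_change", 50.0, True, "clear"))
```

The first query lies inside the envelope and the second does not, because a lane change is not among the permitted maneuvers.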

Based on the terminology so far, the SAE J3016 document distinguishes between vehicles with some self-driving capabilities and systems which can perform the entire driving task autonomously:

A system is called Automated Driving System (ADS) if it is capable of performing the entire DDT within its ODD. In particular, Levels 3–5 are ADSs while Levels 0–2 are not.²

This means that in Levels 0–2 the system is not fully capable of performing the DDT and thus is not considered an ADS; hence, the driver must be prepared to take control instantaneously. In contrast, in Levels 3–5 the vehicle is responsible for performing the entire DDT, including autonomously handling situations which call for an immediate response, like emergency braking or steering maneuvers. The only difference between Level 3 and Levels 4–5 is that in the former the system may ask the human driver, with several seconds of notice³, to take control and perform what J3016 refers to as “DDT fallback”, while in Levels 4–5 the system should also be capable of performing the “DDT fallback”. That is to say, for L3 the human driver is essentially the fallback in case the ADS is no longer capable of performing the full DDT, whereas for L4–5 the system must provide its own fallback.
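The division of responsibility described above can be summarized in a few lines of code. This is only a sketch of our reading of the levels; the enum and function names are assumptions made for illustration and are not part of J3016.

```python
from enum import Enum


class Responder(Enum):
    HUMAN = "human driver"
    SYSTEM = "automated system"


def _check(level: int) -> None:
    # J3016 defines exactly six levels, numbered 0 through 5.
    if level not in range(6):
        raise ValueError(f"SAE level must be 0..5, got {level}")


def is_ads(level: int) -> bool:
    """Levels 3-5 are ADSs; Levels 0-2 are not."""
    _check(level)
    return level >= 3


def immediate_response(level: int) -> Responder:
    """Who must handle situations that call for an immediate response."""
    return Responder.SYSTEM if is_ads(level) else Responder.HUMAN


def ddt_fallback(level: int) -> Responder:
    """Who performs the DDT fallback: the system only in Levels 4-5."""
    _check(level)
    return Responder.SYSTEM if level >= 4 else Responder.HUMAN


for level in range(6):
    print(level, is_ads(level),
          immediate_response(level).value, ddt_fallback(level).value)
```

Note that Level 3 is the only row in which the two columns differ: the system owns the immediate response while the human remains the fallback.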

The exact definition of “DDT fallback” can be found in the J3016 document. For the purpose of this document, the important element is not the “DDT fallback” itself, but the definition of an instantaneous reaction to emergency situations. Since in Level 3 the take-over by the human driver is not instantaneous, it follows that from the perspective of required emergency maneuvers, Levels 3–5 are all the same. In other words, the first observation that we make is that from a system design perspective, Level 3 should handle immediate response of the vehicle autonomously at the same performance level as L4–5. The reason is that a take-over request that gives the human driver a few seconds to react does not actually simplify the system design: during those few seconds anything can happen, and the system alone must handle it.

The second observation is that the system design complexity of an ADS is not necessarily tied to the complexity of the ODD. In other words, a simple ODD does not necessarily translate into a simple ADS system design. Many practitioners in the field assume a monotonic relationship between complexity of system design and ODD by viewing Level 3 as a “simpler” problem to solve than Level 4.

The following example is relevant to both observations. Consider an ADS operating in a very simple ODD, say “stay in lane”, which is limited to highway roads and excludes lane-change maneuvers. Clearly, there are emergency maneuvers which necessitate the ADS stepping outside the designated ODD. For example, in the figure below, car c_r (light blue) is following another car c_f (dark blue), and suddenly c_f makes a last-minute swerve to evade a stationary car (red). The car c_r does not have sufficient distance to avoid the crash by braking, and thus needs to make an evasive maneuver by changing lanes, similar to what a human would do.

If c_r is the ADS, then obviously it must maneuver outside the designated ODD. Failing to do so (sticking to the designated ODD) is an example of a “failure by design” on the part of the ADS because a human driver could have avoided the crash while the ADS did not. Moreover, the ADS does not have a grace period of a few seconds to transfer the DDT to the human driver. As mentioned above: in a few seconds anything can happen.
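A back-of-the-envelope kinematic check illustrates why braking alone may not suffice in the scenario above. The sketch uses the textbook stopping-distance formula (distance covered during the reaction time plus distance covered under constant deceleration); the speed, reaction time, deceleration and gap values are assumptions chosen for illustration, not measurements.

```python
def stopping_distance(speed_mps: float, reaction_time_s: float,
                      max_brake_mps2: float) -> float:
    """Distance travelled before standstill: v * rho + v^2 / (2 * a)."""
    if max_brake_mps2 <= 0:
        raise ValueError("max_brake_mps2 must be positive")
    reaction_part = speed_mps * reaction_time_s
    braking_part = speed_mps ** 2 / (2.0 * max_brake_mps2)
    return reaction_part + braking_part


def can_stop_in_lane(gap_m: float, speed_mps: float, reaction_time_s: float,
                     max_brake_mps2: float) -> bool:
    """True if braking alone avoids a stationary obstacle gap_m ahead."""
    return stopping_distance(speed_mps, reaction_time_s, max_brake_mps2) <= gap_m


# Assumed values: c_r travels at 30 m/s (108 km/h), reacts within 0.5 s,
# brakes at 8 m/s^2, and the stationary car is revealed 40 m ahead.
needed = stopping_distance(30.0, 0.5, 8.0)
print(needed)                                   # 71.25
print(can_stop_in_lane(40.0, 30.0, 0.5, 8.0))   # False
```

Under these assumed numbers c_r needs roughly 71 m to stop but has only 40 m, so the only way to avoid the crash is a lateral evasive maneuver, which the stay-in-lane ODD excludes.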

This example highlights the notion that once the driver is not responsible for the immediate response of the vehicle, then the system design must be as complex as an ADS with a full-blown ODD. In other words, from a system design perspective, we should expect the same system complexity for an ODD that stays-in-lane and an ODD that performs lane changes. This is not currently the case in L3 systems on their way to market.

Next, we take the insights above and propose a principled methodology for setting the minimal requirements for system design, as a function of ODD, when the ADS is responsible for the immediate response — Levels 3–5.

The engineering process of an Automated Driving System involves a perception system (understanding the surrounding environment of the vehicle) and a driving policy system (deciding what actions to take next). In recent years we have introduced Mobileye’s Responsibility-Sensitive-Safety (RSS) model⁴ which covers the driving policy system. Hence, in this document we focus on failures due to errors of the perception system. We distinguish between two sources of errors:

  • “Failure by design”: this means that the perception system has a certain ODD, and outside of it the system is likely to fail. For example, consider the system described above, which is designed to keep the car in its current lane without the ability to perform lane changes. Suppose that such a system only contains forward-facing sensors and does not contain surround sensors with sufficient resolution to support a lane change in congested traffic. Although the ODD is technically supported by this sensor configuration, such a system cannot perform an evasive maneuver in an emergency that involves steering to another lane at the proficiency level of a human driver. As a result, failing to perform an emergency maneuver is considered to be a “failure by design”.
  • “Black Swans”: the perception can make a mistake within the boundaries of its ODD. For example, a system which is designed to detect vehicles might miss a vehicle occasionally. The designers of the system cannot know in advance that a specific vehicle would be missed by the system. They can only give statistical guarantees on failure cases. We use the term Mean-Time-Between-Failures (MTBF) for the average time the system can operate without having a “black swan” failure. The MTBF is the inverse of the failure rate, that is, of the probability of failure per unit of operating time (for example, per hour of driving).
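The relationship between MTBF and failure rate can be illustrated with a short sketch. It assumes a constant failure rate (an exponential failure model), which is a simplifying assumption of ours; the function names and the numbers are illustrative only.

```python
import math


def mtbf_from_rate(failures_per_hour: float) -> float:
    """MTBF is the inverse of the failure rate."""
    if failures_per_hour <= 0:
        raise ValueError("failure rate must be positive")
    return 1.0 / failures_per_hour


def estimate_mtbf(total_hours: float, failures: int) -> float:
    """Point estimate of MTBF from observed operation: hours per failure."""
    if failures <= 0:
        raise ValueError("need at least one observed failure")
    return total_hours / failures


def prob_failure_within(hours: float, mtbf_hours: float) -> float:
    """P(at least one failure within `hours`), assuming a constant rate."""
    return 1.0 - math.exp(-hours / mtbf_hours)


# Assumed numbers: 5 black-swan failures observed over 50,000 hours.
mtbf = estimate_mtbf(50_000, 5)
print(mtbf)                              # 10000.0 hours
print(prob_failure_within(1.0, mtbf))    # chance of a failure in one hour
```

For short intervals the probability of failure is approximately the interval length divided by the MTBF, which is why the two quantities are inverses of one another.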

Our proposal is the following: An ADS, as defined by J3016 (Levels 3–5), must autonomously handle situations which call for an immediate response. Therefore, an ADS must be equipped with a sensory configuration and functional validation so that:

  1. While operating within its ODD, it must have an MTBF of “Black Swan” events at, or above, the level implied by human crash statistics⁵, meaning that it must fail no more often than human drivers crash.
  2. Whenever an immediate response is necessary, the system must not have “failures by design” — that is, it must be able to properly respond to safety-critical events, including evasive maneuvers which take the system outside of its ODD, at, or above, human-level capabilities.

For the first requirement, since human crash statistics are known, the system should be validated statistically to have a sufficiently high MTBF. As for validating no failures-by-design, this can be addressed through a crash typology⁶ study or using the formal model of RSS (for example, the proper response for an evasive maneuver which complies with RSS can be found in Definition 12 of the paper⁷).
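To give a sense of what statistical validation of the first requirement involves, the sketch below computes how much failure-free driving would be needed to demonstrate a target MTBF at a given confidence level. It rests on assumptions of ours: failures follow a constant-rate (Poisson) model, the MTBF is measured in miles rather than time, every black-swan failure is counted as if it were a crash, and the test is passed only if no failure is observed. Under that model the probability of seeing no failure over a distance T is exp(-T / MTBF), which yields T >= -MTBF * ln(1 - confidence).

```python
import math


def failure_free_miles_needed(target_mtbf_miles: float,
                              confidence: float) -> float:
    """Failure-free miles needed to claim MTBF >= target at `confidence`.

    Assumes a constant failure rate and a zero-failure test.
    """
    if target_mtbf_miles <= 0:
        raise ValueError("target MTBF must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -target_mtbf_miles * math.log(1.0 - confidence)


# Human benchmark taken from the crash statistic cited in the text:
# roughly one crash every 500,000 miles.
miles = failure_free_miles_needed(500_000, 0.95)
print(round(miles))
```

With a 95% confidence level the requirement comes to roughly three times the target MTBF, or about 1.5 million failure-free miles in this example. A test plan that tolerates some failures, or that targets a higher confidence, would need more.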

In summary, we propose that the coupling between level of automation, ODD, and system design follow the principles below:

  1. From a safety perspective, the level of driver attentiveness is binary: either the human driver has responsibility for the immediate response, or the robotic driver has it. The responsibility “hand-shake” of Level 3 in the J3016 classification should have no real bearing on safety elements of the system design.
  2. The ODD axis is independent of who is responsible for the immediate response.
  3. Regardless of the ODD, the immediate response (of the robotic driver) should handle crash avoidance maneuvers at least at a human level. Failure to do so constitutes a “failure by design”.
  4. Sensor configuration, system design and functional validation should be free of failures-by-design and reach an MTBF of Black Swans of at least the level of human crash statistics.

Without agreed-upon principles like these, we are in danger of deploying systems which do not truly promote safer roads. By following these principles in the design of automated vehicles of all levels, industry practitioners will be building and deploying systems which are safer by definition, and which promote the industry as a whole.

[1] Levels 0–2: no automation, or longitudinal (ACC) and/or lateral (LKA) assistance; the driver is responsible and should be prepared to take control instantaneously. Level 3: the system issues a take-over request with some lead time (say, 10s). Level 4: the driver is never asked to take over, but the operational design domain (ODD) is limited. Level 5: same as Level 4 but with unlimited ODD.

[2] Sections 3.2 and 3.8 in the SAE J3016 document.

[3] Realistically, a human driver with “eyes-off” needs between 5 and 10 seconds to understand the scene and react.

[4] https://www.mobileye.com/responsibility-sensitive-safety/

[5] NHTSA’s recent data indicates human drivers experience a crash once every ~500,000 miles. https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813060

[6] https://www.nhtsa.gov/sites/nhtsa.dot.gov/files/pre-crash_scenario_typology-final_pdf_version_5-2-07.pdf

[7] https://arxiv.org/abs/1708.06374

CEO of Mobileye, SVP at Intel, Co-CEO of OrCam, Chairman of AI21labs & Sachs Prof. of Computer Science at the Hebrew University of Jerusalem