YouTube video summary

Stanford Seminar - Continual Safety Assurances for Learning-Enabled Robotic Systems

Education

07 Dec 2024 · 22 min summary · From Stanford Online

Introduction and Challenges of Safety in AI and Autonomy

The long-term goal is to develop robot algorithms that operate with guaranteed safety and performance in new and uncertain environments, applicable to various fields such as autonomous drones, cars, and space exploration 30s.
Machine learning and AI are becoming increasingly pervasive in autonomy stacks, particularly for perception, trajectory forecasting, planning, and control, due to their ability to capture real-world complexity 56s.
However, the inclusion of machine learning has also introduced new safety challenges, such as the need for safety assurances in systems that operate in uncertain environments 1m21s.
Recent incidents, such as Cruise cars being pulled back from San Francisco and robots crashing into humans on factory floors, have highlighted the severity of these safety challenges 1m28s.
The US government has issued an executive order to prioritize discussion on AI safety, and other governments have followed suit, emphasizing the need for safety assurances in AI and autonomy 1m36s.
There is a tension between enabling systems to leverage machine learning capabilities while maintaining safety, which is the focus of the discussion 1m48s.
Most machine learning systems are designed without specific regard to safety, and safety issues are often addressed with post-hoc solutions, referred to as "safety bandages" 2m14s.
These safety bandages are not scalable, can be conservative and degrade performance, and may not work in new deployment conditions 2m49s.
To overcome these challenges, safety is viewed as a continuous process, formally ingrained in different stages of the learning process, from design and training to deployment and iterative improvement 3m26s.
Algorithms are developed to programmatically incorporate safety requirements in the training process, learning inherently safe and robust controllers and policies for robotic systems, referred to as "design time safety methods" or "design for safety" 3m47s.
The goal is to develop methods for detecting out-of-distribution and anomalous situations in learning-enabled robotic systems, adapting their behavior to maintain safety, and learning from past failures to improve safety over time 4m13s.
A continual safety assurance framework is proposed, where assurances are provided during design time, monitored and adapted during operation time, and continuously improved over the system's life cycle 4m40s.
The framework involves learning provably safe controllers from data, adapting these controllers online under new deployment conditions, and stress-testing policies or controllers to minimize safety-critical failures 5m2s.

Safety Analysis and Reachability Analysis

Safety analysis aims to determine whether and how a robot can prevent its trajectory from entering an undesirable set of states, referred to as the failure set 5m39s.
Two key aspects of safety analysis are quantifying which configurations of the robot are doomed to fail versus which are safe, and how to keep the robot in safe configurations, referred to as the "whether and how of safety" 5m55s.
Control theory provides powerful frameworks for safety analysis, including Hamilton-Jacobi reachability analysis, which is used to mathematically characterize and automatically compute safety requirements 6m20s.
Hamilton-Jacobi reachability analysis assumes a robotic system with dynamics, state, control, and disturbance, and computes the backward reachable tube, a set of initial states from which the robot will be driven to an undesirable set of states despite best control efforts 6m55s.
The backward reachable tube represents the unsafe configurations for the robot and should be avoided, as illustrated by a simple example of a quadrotor moving longitudinally up and down in a room 7m41s.
In this example, there are states from which the quadrotor will eventually crash into the ceiling or floor no matter what control it applies (shown as a light red region), and states from which it can be controlled to stay safe (shown as a blue region) 8m2s.
The blue region, or safe set, is the area where the system has a controller or policy to keep it inside at all times, and reachability analysis can provide both this safe set and the safety controller 8m15s.
Reachability analysis involves defining a failure set implicitly with a function L of x, which is negative inside the failure set and positive outside, representing the safety reward the robot gets at state x 8m41s.
The cumulative reward of a trajectory is given by the minimum safety reward along the trajectory, which is different from typical optimal control and reinforcement learning problems where the cumulative reward is the sum of rewards 9m15s.
The sign of the cumulative reward indicates whether the system trajectory ever entered the failure set, and the problem is formulated as a game between the control and the disturbance, in which the control tries to maximize the safety reward 9m36s.
The value function corresponding to this game captures the closest the system will ever get to the failure set, and if the value function is negative, the system must have entered the failure set at some point 10m18s.
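As a minimal sketch of the min-over-time reward described above, assume a one-dimensional room of height 3 and two hand-written height trajectories. The numbers and the names `safety_reward` and `trajectory_outcome` are illustrative assumptions, not taken from the talk.

```python
def safety_reward(z, height=3.0):
    """l(x): distance to the nearer of floor (z = 0) and ceiling (z = height).

    Negative when z is outside the room, i.e. inside the failure set.
    """
    return min(z, height - z)


def trajectory_outcome(heights, height=3.0):
    """Cumulative safety reward: the minimum of l(x) along the trajectory."""
    return min(safety_reward(z, height) for z in heights)


safe_run = [1.5, 1.2, 0.8, 0.5, 0.9]     # dips low but stays inside the room
crash_run = [1.5, 0.9, 0.3, -0.1, 0.4]   # goes below the floor at one step

print(trajectory_outcome(safe_run))    # positive: never entered the failure set
print(trajectory_outcome(crash_run))   # negative: entered the failure set
```

Because the outcome is a minimum rather than a sum, a single excursion into the failure set makes the whole trajectory negative, even if the system later returns to the room.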
The backward reachable tube, or unsafe set, is the set of states where the value function is less than or equal to zero, and this value function can be computed using the principle of dynamic programming, resulting in a partial differential equation 10m43s.
The partial differential equation, known as the Hamilton-Jacobi-Bellman equation, relates how taking a particular action affects the system's value or distance to the failure set, and solving this equation provides the value function 11m21s.
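For reference, one common way to write the quantities described above is sketched below. The notation is an assumption (the summary does not give formulas): f is the system dynamics, l the safety reward, T the time horizon, and the information pattern of the game is glossed over. With a disturbance present, this equation is often called the Hamilton-Jacobi-Isaacs variational inequality.

```latex
% Outcome of a trajectory \xi starting from x at time t under u(.) and d(.)
J\big(x, t, u(\cdot), d(\cdot)\big) = \min_{s \in [t, T]} l\big(\xi(s)\big)

% Safety value function: control maximizes, disturbance minimizes
V(x, t) = \inf_{d(\cdot)} \sup_{u(\cdot)} J\big(x, t, u(\cdot), d(\cdot)\big)

% Variational inequality obtained from dynamic programming
\min\Big\{ \frac{\partial V}{\partial t}(x, t) + H\big(x, \nabla V(x, t)\big),\;
           l(x) - V(x, t) \Big\} = 0,
\qquad V(x, T) = l(x)

% Hamiltonian
H(x, p) = \max_{u} \min_{d} \; p^{\top} f(x, u, d)
```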
The value function can be visualized, with more red indicating a more negative value (states closer to the ceiling or floor) and more blue indicating a more positive value 11m29s.
The safety value function is more negative near the floor than near the ceiling, even though the failure set includes both, because gravity pushes the system downward, making states close to the floor more unsafe 12m7s.
The safety controller is a function that, at any state x, pushes the system towards higher values, which means safer states, by following the gradient ascent direction of the value function 12m47s.
In summary, Hamilton-Jacobi reachability analysis captures both requirements of safety analysis: it provides a set of safe states and a controller that keeps the system inside them, through a safety value function V whose sign indicates whether a state is safe or unsafe and whose gradient yields a safe controller 13m9s.
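The quadrotor example can be reproduced with a small grid-based dynamic program. This is a rough sketch under stated assumptions, not the speaker's solver: the room height, thrust limit, grid, and time step are made-up numbers, the disturbance is omitted, and a discrete-time backup with bilinear interpolation stands in for a proper PDE solver.

```python
G = 9.8        # gravity (m/s^2)
U_MAX = 12.0   # assumed maximum thrust acceleration (m/s^2)
HEIGHT = 3.0   # assumed room height (m)
DT = 0.05      # time step (s)
Z0, DZ, NZ = -0.5, 0.1, 41   # height grid: -0.5 m to 3.5 m
V0, DV, NV = -4.0, 0.2, 41   # velocity grid: -4 m/s to 4 m/s
# Net vertical acceleration for thrust off, hover, and full thrust.
ACCELS = (-G, 0.0, U_MAX - G)


def l(z):
    """Safety reward: negative below the floor or above the ceiling."""
    return min(z, HEIGHT - z)


def interp(V, z, v):
    """Bilinear interpolation of the grid values, clamped to the grid edges."""
    fz = min(max((z - Z0) / DZ, 0.0), NZ - 1.0)
    fv = min(max((v - V0) / DV, 0.0), NV - 1.0)
    i, j = min(int(fz), NZ - 2), min(int(fv), NV - 2)
    a, b = fz - i, fv - j
    return ((1 - a) * (1 - b) * V[i][j] + a * (1 - b) * V[i + 1][j]
            + (1 - a) * b * V[i][j + 1] + a * b * V[i + 1][j + 1])


def solve(sweeps=60):
    """Iterate V(x) <- min(l(x), max_a V(next state)) starting from V = l."""
    zs = [Z0 + DZ * i for i in range(NZ)]
    vs = [V0 + DV * j for j in range(NV)]
    V = [[l(z) for _ in vs] for z in zs]
    for _ in range(sweeps):
        V = [[min(l(z), max(interp(V, z + v * DT, v + a * DT) for a in ACCELS))
              for v in vs] for z in zs]
    return zs, vs, V


zs, vs, V = solve()
# Same speed, same distance to the wall, but falling vs. rising:
print(V[10][10])   # z = 0.5 m, v = -2 m/s (moving toward the floor)
print(V[30][30])   # z = 2.5 m, v = +2 m/s (moving toward the ceiling)
```

With these assumed numbers the quadrotor can brake an upward motion much harder (gravity helps) than a downward one, so the state heading for the floor lands in the unsafe set while the mirrored state heading for the ceiling does not.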

Neural Approximations of Safety Value Function and Deep Reach

The Hamilton-Jacobi reachability analysis can be applied to general nonlinear autonomous systems, but it has challenges, including scalability, as it is computationally hard to scale these methods beyond even five-dimensional systems, and it is not immediately clear how to interface these methods with real-world data and machine learning models 13m46s.
To address these challenges, neural approximations of the safety value function can be used, which involves learning a neural approximation of the safety value function that takes as input the state and time of the system and outputs the corresponding safety value function 14m43s.
The Deep Reach method is a self-supervised learning method that relies on the fact that the true safety value function must satisfy a partial differential equation, which can be used as a signal to train the safety value function 15m6s.
In the Deep Reach method, the safety value function is trained by randomly sampling some state and time, propagating it through the neural network, computing the value function, and computing the PDE violation error, which is then used to backpropagate and update the neural network 15m27s.
The goal is to optimize the neural network parameters to learn a more accurate representation of the safety value function, which is consistent with a partial differential equation, allowing for the explicit inclusion of safety requirements in the training process and the learning of inherently safe controllers from data 15m42s.
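The PDE violation error used as the training signal can be illustrated without a neural network. The sketch below is an assumption-laden stand-in: a toy system x' = u with |u| <= 1, a safe interval (-1, 1), plain Python functions in place of the network, and finite differences in place of automatic differentiation. It only evaluates the loss on sampled points; it does not train anything.

```python
import random

U_MAX = 1.0   # assumed control bound for the toy system x' = u
T = 1.0       # assumed time horizon


def l(x):
    """Safety reward: positive inside the safe interval (-1, 1)."""
    return 1.0 - abs(x)


def pde_violation(value_fn, x, t, eps=1e-4):
    """Absolute residual of min{dV/dt + H, l - V} = 0 at one sample (x, t)."""
    dV_dt = (value_fn(x, t + eps) - value_fn(x, t - eps)) / (2 * eps)
    dV_dx = (value_fn(x + eps, t) - value_fn(x - eps, t)) / (2 * eps)
    hamiltonian = U_MAX * abs(dV_dx)   # max over u of dV/dx * u
    return abs(min(dV_dt + hamiltonian, l(x) - value_fn(x, t)))


def training_loss(value_fn, n_samples=1000, seed=0):
    """Average PDE violation over randomly sampled states and times."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x, t = rng.uniform(-1.5, 1.5), rng.uniform(0.0, T)
        total += pde_violation(value_fn, x, t)
    return total / n_samples


def exact_value(x, t):
    """True value for this toy system: the control can always hold position."""
    return l(x)


def poor_guess(x, t):
    """A candidate that is needlessly pessimistic away from the final time."""
    return l(x) - 0.5 * (T - t)


print(training_loss(exact_value))   # zero violation
print(training_loss(poor_guess))    # clearly positive violation
```

In a learned setting, the gradient of this loss with respect to the network parameters would drive the update; here the two hand-written candidates only show that the loss separates a consistent value function from an inconsistent one.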
This approach has two key advantages: it can explicitly bake in safety requirements and learn inherently safe controllers, and neural representations are easily scalable to higher dimensional systems, enabling the synthesis of safe controllers for a broader class of autonomous systems 16m1s.
The three-aircraft conflict resolution problem is used as an example, where two evader aircraft must be safeguarded against a pursuer aircraft with uncertain behavior, and the failure set is defined as any configuration where two aircraft are in close proximity 16m29s.
Due to the high dimensionality of the problem, a direct computation of the unsafe set is not scalable, and a common approach is to compute pairwise collision sets between aircraft and take their union as an approximation 17m1s.
However, with Deep Reach, it is possible to directly compute the high-dimensional unsafe set, capturing more configurations than the approximation, including three-way interactions between aircraft that could not be captured earlier 17m39s.
The Deep Reach framework has also been applied to autonomous driving, specifically in the context of urban driving, where an autonomous car needs to navigate around a stranded vehicle in its lane while avoiding oncoming traffic 18m36s.
In this scenario, a learning-based controller was initially used, but it was not able to capture the nuanced intersection point between the two cars, leading to collisions, whereas Deep Reach can bake in safety requirements to prevent such collisions 19m12s.
A safety controller was demonstrated in a car scenario, where it automatically adjusted its behavior to oncoming traffic, making the white vehicle wait behind the stranded vehicle if the orange driver was aggressive, and then crossing the lane once it was safe 19m31s.
This behavior emerged automatically from the learning-based system because the safety requirements were embedded in the learning process.
In the latest work, Deep Reach was applied for learning safe controllers for legged locomotion, a hybrid system with both continuous and discrete controls 20m29s.
The safety controller successfully avoided collisions with obstacles, even when deliberately pushed into a collision by Shuang, and sometimes changed its walking pattern completely to maintain safety 20m53s.

Probabilistic Safety Assurances and Conformal Prediction

Because a neural approximation of the value function does not by itself carry hard safety guarantees, probabilistic safety assurances for Deep Reach were explored 21m19s.
The key idea behind probabilistic assurance is that the learned safety value function induces a candidate safe policy for the robot, and rolling out this policy should ideally achieve the same value that the learned function predicts 22m3s.
The gap between the two value functions can be used to calibrate the learning error, and finding a bound on the maximum learning error is of interest 22m25s.
To provide assurances, computing the bound Delta is necessary, and various methods have been explored, including neural network verification methods and scenario optimization 23m3s.
Conformal prediction is a method that is particularly exciting due to its simplicity and beauty, and it will be discussed further 23m17s.
Conformal prediction provides a probabilistic bound on Delta, which cannot be computed exactly but can be approximated with high confidence, resulting in a high-confidence safe set after correcting the value function 23m42s.
This method is illustrated in the context of a multi-vehicle collision avoidance problem, where the learned backward reachable tube is corrected by conformal prediction to obtain a certified safe set 24m20s.
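A generic split-conformal quantile gives a feel for how a high-confidence bound on the learning error can be obtained from samples. This is a sketch of the textbook procedure, not necessarily the exact scheme used in the talk; the function names and the example error values are hypothetical.

```python
import math


def conformal_bound(calibration_errors, alpha):
    """Split-conformal quantile of the calibration errors.

    Assuming the calibration errors and a new test error are exchangeable,
    the new error is at most the returned value with probability >= 1 - alpha.
    """
    n = len(calibration_errors)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf   # too few samples for this confidence level
    return sorted(calibration_errors)[rank - 1]


def in_corrected_safe_set(learned_value, delta):
    """Keep a state only if its learned value clears the error bound."""
    return learned_value > delta


# Hypothetical learning errors measured on 20 sampled states.
errors = [i / 100 for i in range(1, 21)]
delta = conformal_bound(errors, alpha=0.1)
print(delta)
print(in_corrected_safe_set(0.30, delta), in_corrected_safe_set(0.10, delta))
```

Shrinking the learned safe set by `delta` trades some volume for confidence: states whose learned value is only marginally positive are dropped.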

Adapting Neural Safety Representations and Online Adaptation

Neural safety representations combine traditional safety analysis methods with modern learning capabilities, which gives them two key advantages: safety constraints can be incorporated directly into the learning process, and the representations scale easily to higher-dimensional systems 24m47s.
However, in real-world scenarios, safety constraints, dynamics, control authority, and environment are subject to change, requiring the system to dynamically adapt safety controllers to maintain system safety 25m27s.
Neural safety representations can be adapted online by inputting additional data, such as uncertain system or environment parameters, to learn the safety value function as a function of these parameters 25m58s.
This approach is referred to as parameter-conditioned safety value functions, which can be quickly activated online to maintain safety in response to changing deployment conditions 26m13s.
An example of this approach is demonstrated in a simple drone delivery problem, where the drone must navigate to its goal location while avoiding collisions with obstacles in uncertain wind conditions 26m34s.
By learning a parameter-conditioned value function as a function of the uncertain wind intensity, the safe set corresponding to different wind conditions can be obtained, allowing the drone to adapt its route accordingly 26m55s.
The drone can start with a more direct route to its goal location in low wind conditions but adapt its route in response to changing wind conditions to maintain safety 27m15s.
A robotic system's safe set can change due to external factors such as high wind intensity, and using a parameter-conditioned representation, the safety controller can quickly adapt to take a longer but safer route to its goal location 27m36s.
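The idea of a value function that takes the wind as an extra input can be shown with a toy case that has a closed-form answer. Everything here is an assumption for illustration: a one-dimensional drone with dynamics x' = u + wind, a control bound, an obstacle at a fixed position, and a fixed horizon. In the approach described above, a learned network would play the role of `conditioned_value`.

```python
def conditioned_value(x, wind, x_obs=10.0, u_max=2.0, horizon=5.0):
    """Toy V(x; wind) for x' = u + wind, |u| <= u_max, obstacle at x >= x_obs.

    The safety reward is the gap to the obstacle, l(x) = x_obs - x. If the
    wind exceeds the control authority, the gap shrinks at rate
    (wind - u_max) even under full braking, so the worst gap occurs at the
    end of the horizon. Otherwise the drone can hold its position.
    """
    drift = max(0.0, wind - u_max)
    return (x_obs - x) - drift * horizon


def is_safe(x, wind):
    """A state is in the safe set for this wind if its value is positive."""
    return conditioned_value(x, wind) > 0.0


print(is_safe(7.0, wind=1.0))   # light wind: 3 m from the obstacle is fine
print(is_safe(7.0, wind=3.0))   # strong wind: the same position is unsafe
print(is_safe(2.0, wind=3.0))   # strong wind: safe only further away
```

Querying the same function with a new wind estimate changes the safe set immediately, which is what makes this kind of representation convenient for online adaptation.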
The safety value function can be adapted directly from high-dimensional observations such as RGB images or LiDAR scans, allowing the robot to dynamically construct its safety value function and maintain safety in unknown environments 28m4s.
This approach is called observation-conditioned reachable sets, and it has two key adaptation components: dynamically adapting the safety value function using LiDAR scans to adapt to unknown obstacles, and constantly estimating the uncertainty in the robot's dynamics 28m30s.
The uncertainty estimation component allows the robot to adapt its safety value function to account for factors such as rugged terrains or slippery surfaces, and it can be used as a safety filter for a nominal policy 28m52s.
The framework is agnostic to the underlying nominal policy and can be used with various RL-based policies, MPC-based policies, and in different environments with dynamic obstacles and adversarial humans 29m59s.
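A safety filter of this kind can be sketched as a simple switch: follow the nominal policy while the value is comfortably positive, and otherwise apply the control that climbs the value function. The toy system (x' = u on a safe interval (-1, 1)), the threshold, and all names below are assumptions for illustration, not the implementation from the talk.

```python
def value(x):
    """Toy safety value: distance to the boundary of the safe interval (-1, 1)."""
    return 1.0 - abs(x)


def value_gradient(x):
    """Derivative of the toy value function (pointing back toward x = 0)."""
    return -1.0 if x > 0 else 1.0


def safe_control(x, u_max=1.0):
    """Control that maximizes dV/dx * u, i.e. ascends the value function."""
    return u_max * value_gradient(x)


def safety_filter(x, nominal_u, threshold=0.2):
    """Pass the nominal control through unless the state is close to unsafe."""
    if value(x) > threshold:
        return nominal_u
    return safe_control(x)


def rollout(x0=0.0, steps=200, dt=0.05):
    """Run a nominal policy that always pushes right, through the filter."""
    x, peak = x0, abs(x0)
    for _ in range(steps):
        u = safety_filter(x, nominal_u=1.0)
        x += dt * u
        peak = max(peak, abs(x))
    return peak


print(rollout())   # largest |x| reached; stays inside the safe interval
```

The filter never inspects how the nominal control was produced, which is why the same wrapper can sit on top of an RL policy, an MPC policy, or anything else.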

Stress Testing Learning-Based Policies and Identifying Visual Failures

The neural safety representations can enable adaptation to dynamic obstacles and environment uncertainty, and can be used to stress test learning-based policies to identify data regimes where they might cause safety-critical failures 30m28s.
Stress testing is particularly important for vision-based controllers, which are becoming increasingly ubiquitous in autonomy stacks but are challenging to handle using traditional safety analysis methods due to their high dimensionality and complicated nature 31m21s.
The problem of stress testing learning-based policies can be formalized to identify data regimes where they might cause safety-critical failures 31m35s.
A robotic system with dynamics and a visual sensor is considered, where the sensor provides visual observations such as RGB images or point clouds, which are used as input by a vision-based controller to apply control to the robot 31m37s.
The goal is to find the set of images or point clouds that lead to the failure of the overall closed-loop system, not just the vision-based controller 32m7s.
The failure discovery problem is cast as a reachability problem, and the corresponding backward reachable tube is used to extract visual failures 32m25s.
The robot's sensor function and vision-based controller are cascaded to obtain an equivalent state-based policy for the robot, which simplifies the closed-loop system for stress testing 32m39s.
The backward reachable tube is computed to find the set of all states that will lead to failure despite the best control action, but in this case, a specific controller is used instead of the best control action 33m11s.
The backward reachable tube is used to find the images seen by the robot along the failure states, which gives the visual failures for the system 33m48s.
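The pipeline above (cascade sensor and controller, find the failing states, then look up what the robot saw there) can be sketched on a toy lateral-position problem. All of it is hypothetical: the sensor bug, the gains, and the runway half-width of 1 are invented. Because this toy closed loop is deterministic with no disturbance, its backward reachable tube is simply the set of start states whose rollout leaves the runway, so forward simulation stands in for the reachability computation.

```python
def sensor(x):
    """Hypothetical perception: returns the perceived offset from the centerline.

    Correct (equal to x) except for x in [0.5, 1.0], where a runway marking is
    mistaken for the centerline and the offset is measured from 1.2 instead.
    """
    return x - 1.2 if 0.5 <= x <= 1.0 else x


def controller(perceived_offset, gain=1.0):
    """Steer back toward wherever the centerline is perceived to be."""
    return -gain * perceived_offset


def closed_loop_policy(x):
    """Sensor and controller cascaded into an equivalent state-based policy."""
    return controller(sensor(x))


def fails(x0, dt=0.1, steps=100):
    """True if the closed-loop rollout from x0 ever leaves the runway |x| < 1."""
    x = x0
    for _ in range(steps):
        if abs(x) >= 1.0:
            return True
        x += dt * closed_loop_policy(x)
    return abs(x) >= 1.0


starts = [i / 10 for i in range(-9, 10)]
failure_starts = [x0 for x0 in starts if fails(x0)]
# "Visual failures": the observations produced at the failing start states.
failure_observations = [sensor(x0) for x0 in failure_starts]

print(failure_starts)
```

The failing states cluster exactly where the perception is fooled, which mirrors how inspecting the images at failure states exposes a semantic cause rather than just a list of bad inputs.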
An example of this work is the collaboration with Boeing, which designed a vision-based controller for an autonomous aircraft taxiing on a runway using only RGB images from a camera on the right wing 34m3s.
The goal is to keep the aircraft on the runway, and the failure set is defined as outside the runway boundary 34m27s.
The backward reachable tube is computed for the aircraft under the vision-based controller, and the starting configurations that will lead to failure are shown in red, while the safe configurations are shown in blue 34m37s.
Representative images that will cause failure are shown, and analysis of one such image reveals that the vision-based controller confuses runway markings with the center line of the runway, causing the aircraft to drive towards the marking 34m56s.
The proposed framework identifies semantic failures of vision-based control systems at the system level rather than the component level, as in the example of the aircraft leaving the runway because of a vision-based control failure 35m33s.
Component-level prediction error alone is a poor proxy for safety: high prediction errors do not always lead to system-level failures, and low prediction errors sometimes trigger them 35m53s.
The goal is to target component-level errors that lead to system-level failures, and the framework can be combined with parameter-conditioned reachable sets to obtain failures as a function of different environment latents, such as time of day or cloud conditions 36m31s.
For example, a state that is a failure in the morning because of runway markings may be safe at night, and the framework can likewise find failures as a function of cloud conditions, providing a diverse set of failures 36m41s.
The framework was applied to an autonomous indoor navigation pipeline, which was trained entirely in simulation and worked well in the real world, but had interesting failure modes, such as the vision-based controller learning a correlation between light-colored surfaces and traversability 37m13s.
This correlation led to the controller thinking it could go through light-colored walls, resulting in a collision, and the framework can be used to identify such failures and improve the vision-based controller 38m12s.
The ultimate purpose of identifying failures is to use them to improve the vision-based controller, such as by training an anomaly detector 39m1s.
A detector can be used to estimate the probability of system failure given an image, and if the anomaly detector flags a failure, a fallback controller can be used online to preserve safety 39m10s.
The detector flags a failure input, slowing down the robot, and once the failure is resolved, the system goes back to the learning-based controller to maintain system safety 39m32s.
Designing the anomaly detector and fallback controller are interesting questions, and a catalog of failures can provide a starting point for addressing these questions 39m52s.
Targeted incremental training of the controller on failure data can improve performance, as shown by a significant reduction in unsafe volume after training 40m11s.

Other Safety Research Projects and Imitation Learning

The key goal of the research is to design autonomous systems that can leverage modern machine learning methods while maintaining safety, considering safety in different stages of the learning process 40m33s.
Other projects in the lab include Deep Reach, a paradigm to learn controllers for robotic systems, and exploring other learning paradigms such as imitation learning 41m14s.
Imitation learning suffers from the compounding error problem, and to address this, researchers have been injecting adversarial disturbances during data collection to learn safety-aware imitation learning policies 41m28s.
The adversarial disturbance is computed using reachability analysis and pushes the system towards safety-critical states, allowing the robot to learn corrective actions from those states 42m1s.
The hypothesis is that by visiting more safety-critical states, the robot can learn to correct errors and improve safety during test time 42m25s.
The hypothesis is tested on the same aircraft taxiing problem, with an imitation policy that maps images to actions, and the results show that injecting safety information significantly improves the safety of the learned policy at test time 42m59s.
The demonstration and rollout of the policy for the vanilla method and the safety-guided imitation policy are compared, showing that the vanilla method cannot recover from errors near the boundary of the runway due to lack of data, while the safety-guided method can learn recovery behavior from such states 43m37s.
The data used for both methods are exactly the same, and the safety-guided imitation policy is also applied to a real Crazyflie drone, resulting in better sim-to-real transfer, specifically in the context of safety 44m12s.
The idea of using safety-critical information to guide the exploration of a learning agent is proposed as a way to design more robust policies, and an example is shown using safety information to guide the exploration of a sampling-based planner, such as MPPI 44m48s.
The use of safety information can also help with exploration during the design phase, achieving the same performance with much fewer samples, and this direction is being actively pursued 45m27s.

Language-Based Safety Constraints and Vision-Language Models

The challenge of determining who gives the safety constraints and who tells what is safe or unsafe is discussed, and the use of language as a medium for flexibly defining these constraints is proposed 45m39s.
A vision-language model is used to convert natural language feedback into physical constraints that are more interpretable by traditional safety analysis methods, such as reachability analysis or control barrier functions, to design a safety controller that adheres to this natural language feedback 46m12s.
In an experiment, a vision-language model converts user-given natural language feedback into a safety constraint, and a safety controller is computed that avoids a coffee spill on the floor 46m34s.
The concept of using tools like VLMs and LLMs to define rich, semantic safety constraints for robotic systems is discussed as a starting point for further development 47m6s.

Further Exploration of Safety Value Function Parameterization and Scalability

The parameterization of the value function for safety is explored, with the possibility of using latent parameters instead of explicit physical parameters, allowing for more complex inputs to be processed 48m35s.
The use of encoder-decoder frameworks or simultaneous learning can be employed to preprocess complex inputs and make them more manageable for the safety value function 49m7s.
The scalability of Deep Reach is discussed: the complexity of the computation scales with the dimensionality of the system, but performance can be better if the value function lies on a lower-dimensional manifold 49m31s.
Deep Reach relies on the fact that many systems of interest have a lower-dimensional representation of the value function, allowing it to overcome some of the challenges associated with the curse of dimensionality 50m10s.
The possibility of using smarter gridding methods to improve the performance of vanilla HJB (Hamilton-Jacobi-Bellman) is mentioned, with the idea that using a parametric representation with fewer parameters could be more efficient 50m29s.

Challenges in Incremental Training and Monotonic Improvement

The concept of training a vision-based controller using anomalies detected during training is discussed, with the question of whether such a controller would be transferable to a different environment, such as a different runway 50m46s.
Incremental training of networks can lead to failures in certain regions of the state space where the system was previously fine, due to the lack of monotonic improvement in machine learning methods as more data is added 51m1s.
This issue is significant for safety, as it means that the performance of the system may not be completely monotonic over the previous dataset, even if the overall metric improves 51m41s.
The lack of monotonic improvement in incremental training makes it challenging to ensure safety, as the system may fail in some regions even if it was previously safe 52m12s.
Researchers are exploring ways to achieve monotonic improvement in machine learning models as more data is added, which is an open question in the field 52m36s.

Co-optimizing Safety and Performance and Levels of Safety

Safety and performance are often at odds, and current methods for ensuring safety, such as safety filtering, can be myopic and do not consider the long-term effects of actions on performance 53m17s.
A potential solution is to design controllers that co-optimize safety and performance using dynamic programming, which can optimize both requirements simultaneously 53m36s.
There is a trade-off between computation and the level of safety assurance, and researchers may need to compromise on safety to achieve more scalable computation 54m8s.
Companies often think of safety in terms of levels or tiers, rather than absolute safety, but the methods for computing these levels can be ad hoc or heuristic-based 54m44s.
Levels of safety have a hierarchy, with different severities associated with various incidents, such as collisions, and this hierarchy is considered when evaluating safety levels 54m53s.
There is limited academic work on the hierarchy of safety levels, but it is an interesting area of study 55m11s.

Adversarial Situations and Game Theory in Autonomous Driving

An example of an adversarial situation is when a car is convinced to squeeze into a space, and this situation raises questions about the symmetry of controllers in such scenarios 55m25s.
In a hypothetical situation where a human and an autonomous vehicle (AV) swap controllers, it is unclear whether the system would still work, and this scenario enters the realm of game theory 55m37s.
The assumption of symmetric information is made, assuming that both cars can be controlled and that there is a level of cooperativeness between drivers to avoid collisions 55m55s.
In reality, oncoming traffic cannot be controlled, and the assumption of cooperativeness may not always hold true 56m7s.
The avoidance of adversarial situations by human drivers may be more robust than expected, and there is a distribution of behaviors that can be classified as either adversarial or cooperative 56m33s.
A behavior-level classification of vehicles is often performed by analyzing their past trajectories to determine whether they are behaving cooperatively or adversarially 56m43s.
Based on this classification, different planning strategies are employed to navigate around the vehicle, and this process involves a level of behavior planning 56m54s.
Cars in the city are not necessarily shy and have become more assertive over time 57m10s.
