Imagine an airplane with two pilots, one human and one computer. Both have their "hands" on the controls, but they are always watching for different things. If both are paying attention to the same thing, the human gets to steer. But if the human gets distracted or misses something, the computer quickly takes over. Combining human intuition with machine precision creates a more symbiotic relationship between pilot and aircraft.


With Air-Guardian, a computer program can track where a human pilot is looking (using eye-tracking technology) to better understand what the pilot is paying attention to. This helps the computer make better decisions based on what the pilot is doing or intends to do. Photo credit: Alex Shipps/MIT CSAIL via Midjourney

This is Air-Guardian, a system developed by researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). Modern pilots must deal with large amounts of information from multiple monitors, especially at critical moments. Air-Guardian acts as a proactive co-pilot: a partnership between human and machine that is rooted in understanding attention.

But how exactly does it determine attention? For humans, it uses eye tracking; for the neural system, it relies on "saliency maps," which pinpoint where attention is directed. These maps serve as visual guides that highlight key regions in an image, helping people grasp and interpret the behavior of complex algorithms. Air-Guardian uses these attention markers to identify early signs of potential risk, rather than intervening only when a safety breach occurs, as traditional autopilot systems do.
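To make the idea of comparing human and machine attention concrete, here is a minimal sketch in Python. It is an illustration only, not the method from the paper: it assumes that both the pilot's gaze and the network's saliency map have already been reduced to small grids of non-negative weights, and it scores their agreement with a simple histogram intersection. The names `attention_overlap` and `attention_diverges`, and the 0.5 threshold, are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # hypothetical format: rows of non-negative attention weights


def normalize(grid: Grid) -> Grid:
    """Scale a grid so that its entries sum to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("attention map must contain some positive weight")
    return [[value / total for value in row] for row in grid]


def attention_overlap(human: Grid, machine: Grid) -> float:
    """Histogram intersection of two attention maps.

    Returns a value from 0.0 (looking at entirely different regions)
    to 1.0 (identical attention distributions).
    """
    h, m = normalize(human), normalize(machine)
    return sum(
        min(hv, mv)
        for h_row, m_row in zip(h, m)
        for hv, mv in zip(h_row, m_row)
    )


def attention_diverges(human: Grid, machine: Grid, threshold: float = 0.5) -> bool:
    """Flag a mismatch when overlap drops below an assumed threshold."""
    return attention_overlap(human, machine) < threshold


# Example: the pilot looks at the top-left cell, the network at the bottom-right.
gaze = [[1.0, 0.0], [0.0, 0.0]]
saliency = [[0.0, 0.0], [0.0, 1.0]]
mismatch = attention_diverges(gaze, saliency)
```

A low overlap score in a sketch like this would be the kind of early signal the article describes: the two "pilots" are attending to different things before any safety limit has been crossed.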

The implications of this system reach beyond aviation. Similar cooperative control mechanisms could one day be used in cars, drones, and the wider field of robotics.

Lianhao Yin, a postdoc at MIT CSAIL and lead author of a new paper on Air-Guardian, said: "An exciting feature of our approach is its differentiability. Our cooperative layer and the entire end-to-end process can be trained. We specifically chose the causal continuous-depth neural network model because of its dynamic properties in mapping attention. Another unique feature is adaptability. The Air-Guardian system is not rigid; it can be adjusted to the demands of the situation, ensuring a balanced partnership between human and machine."

Field tests and results

In field tests, both the pilot and the system made decisions based on the same raw images when navigating to a target waypoint. Air-Guardian's success was measured by the cumulative reward earned during flight and by how short a path was taken to the waypoint. The guardian reduced the risk level of flights and increased the success rate of navigating to target points.

Ramin Hasani, a research affiliate at MIT CSAIL and inventor of liquid neural networks, added: "This system represents a human-centered approach to AI-enabled aviation. Our use of liquid neural networks provides a dynamic, adaptive approach, ensuring that AI does not simply replace human judgment but complements it, thereby improving safety and collaboration in the skies."

Technical foundation and future prospects

Air-Guardian's real strength lies in its underlying technology. It uses an optimization-based cooperative layer that draws on the visual attention of both human and machine, together with liquid closed-form continuous-time neural networks (CfC), known for deciphering cause-and-effect relationships, to analyze incoming images for vital information. Complementing this, the VisualBackProp algorithm identifies the system's focal points within an image, ensuring a clear understanding of its attention maps.
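One simple way to picture an optimization-based blend of two controllers is sketched below. This is a toy reading under stated assumptions, not the cooperative layer from the paper: it assumes a single scalar command per pilot and a trust weight between 0 and 1 (for instance, derived from how well the two attention maps agree). Minimizing the cost w(u - u_h)^2 + (1 - w)(u - u_g)^2 over u has the closed-form solution u = w u_h + (1 - w) u_g, which is what the hypothetical `blend_control` function returns.

```python
def blend_control(human_cmd: float, guardian_cmd: float, human_weight: float) -> float:
    """Return the command u that minimizes
        human_weight * (u - human_cmd)**2
        + (1 - human_weight) * (u - guardian_cmd)**2.

    Setting the derivative to zero gives a weighted average of the two
    commands. All names and the scalar-command setup are assumptions made
    for illustration.
    """
    if not 0.0 <= human_weight <= 1.0:
        raise ValueError("human_weight must lie in [0, 1]")
    return human_weight * human_cmd + (1.0 - human_weight) * guardian_cmd


# Example: the pilot commands a roll of +1.0, the guardian prefers -1.0.
# With full trust in the pilot the human command passes through unchanged;
# with no trust the guardian's command is used instead.
pilot_in_charge = blend_control(1.0, -1.0, human_weight=1.0)
guardian_in_charge = blend_control(1.0, -1.0, human_weight=0.0)
shared = blend_control(1.0, -1.0, human_weight=0.5)
```

Because the blend is a smooth function of its inputs, a layer of this general shape can sit inside a network that is trained end to end, which is consistent with the differentiability Yin describes.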

For widespread use in the future, the human-machine interface needs to be refined. Feedback suggests that an indicator, such as a bar, might be a more intuitive way to signal when the guardian system takes control.
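As a rough illustration of what such a bar indicator could look like, the sketch below renders the guardian's share of control as a text bar. It is purely hypothetical: the function name `render_authority_bar`, the 0-to-1 "share" input, and the text format are assumptions, not part of the Air-Guardian interface.

```python
def render_authority_bar(guardian_share: float, width: int = 10) -> str:
    """Draw the guardian's share of control as a fixed-width text bar.

    guardian_share is assumed to run from 0.0 (pilot fully in control)
    to 1.0 (guardian fully in control); values outside are clamped.
    """
    share = max(0.0, min(1.0, guardian_share))
    filled = round(share * width)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {share:.0%}"


# Example: the guardian currently holds 40 percent of control authority.
bar = render_authority_bar(0.4)
```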

Air-Guardian heralds a new era of safer skies, providing a reliable safety net for those moments when human attention wavers.

"The Air-Guardian system highlights the synergy between human expertise and machine learning, furthering the goal of using machine learning to enhance pilot capabilities and reduce operational errors in challenging scenarios," said Daniela Rus, the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT, director of CSAIL, and senior author of the paper.

"One of the most interesting results of using visual attention metrics in this work is the potential for human pilots to intervene earlier and improve interpretability," said Stephanie Gil, assistant professor of computer science at Harvard University. "This shows a great example of how to use artificial intelligence to work with humans, lowering the threshold for achieving trust by leveraging natural communication mechanisms between humans and artificial intelligence systems."