Researchers have discovered a way to take photos using the ambient light sensors found on most mobile devices and laptops. The study has generated some fearmongering, click-worthy headlines. While the findings are intriguing and demonstrate the potential for abuse by bad actors, the technique's feasibility as an attack vector with existing technology is severely limited.
Researchers at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a way to capture images using only the ambient light sensors found on most mobile devices and many laptops. The study, titled "Imaging Privacy Threats from Ambient Light Sensors," highlights a potential security threat because, unlike selfie cameras, the sensor has no setting to turn it off. Applications also do not need user permission to use it.
"People know about selfie cameras on laptops and tablets and sometimes use physical blockers to cover them," said Liu Yang, a co-author of the research article, which was published in January in Science Advances. "But with the ambient light sensor, people don't even know that the application is using the data. And this sensor is always on."
Few applications use the light sensor because it only reports how much light reaches it, which limits its usefulness. Its main function is to supply ambient light data to the operating system for automatic screen brightness adjustment, but it is also exposed through an API that developers can access. For example, an application can use the API to switch on a low-light mode, as the camera app on most devices does.
Capturing an image is much more complicated, as the sensor is essentially a single pixel without a lens that measures brightness at about five "frames" per second. To overcome this shortcoming, the researchers sacrificed temporal resolution for spatial resolution, allowing an image to be reconstructed from minimal data.
This process relies on a physics principle called Helmholtz reciprocity. The principle states that if a ray of light travels the same path in the reverse direction, it undergoes the same reflections, refractions, and absorptions along the way. Simply put, a computer algorithm inverts the sensor data to create an image from the perspective of the light source (the display), showing, for example, the shadow of a hand above the screen.
For this trick to work reliably, the lighting must be specific. Remember, the algorithm uses reverse path tracing from the sensor to the light source (i.e. the display). The researchers therefore had to illuminate specific parts of the screen to obtain a readable image. Since this produces very unusual behavior that users might find suspicious, they also replicated the process using a modified Tom and Jerry cartoon to achieve the required lighting patterns.
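The idea of lighting one part of the screen at a time and recording a single brightness value can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions, not the researchers' algorithm: the sensor is modeled as a noiseless sum of the light from lit screen blocks that is not blocked by a hand, the grid is 8x8 rather than 32x32, and every name here (`sensor_reading`, `raster_patterns`, `reconstruct`, `toy_scene`) is hypothetical.

```python
def sensor_reading(pattern, transmittance):
    """One single-pixel reading: total light from lit blocks that reaches the sensor."""
    return sum(
        p * t
        for pattern_row, trans_row in zip(pattern, transmittance)
        for p, t in zip(pattern_row, trans_row)
    )


def raster_patterns(n):
    """Yield ((row, col), pattern) pairs that light exactly one screen block each."""
    for row in range(n):
        for col in range(n):
            pattern = [[0.0] * n for _ in range(n)]
            pattern[row][col] = 1.0
            yield (row, col), pattern


def reconstruct(transmittance, n):
    """Rebuild an n x n image from n * n single-value readings, one per lit block."""
    image = [[0.0] * n for _ in range(n)]
    readings = 0
    for (row, col), pattern in raster_patterns(n):
        image[row][col] = sensor_reading(pattern, transmittance)
        readings += 1
    return image, readings


def toy_scene(n):
    """Assumed scene: 1.0 where light reaches the sensor, 0.0 where a 'hand' blocks it."""
    return [
        [0.0 if (row >= n // 4 and n // 4 <= col < n // 2) else 1.0 for col in range(n)]
        for row in range(n)
    ]


N = 8
scene = toy_scene(N)
image, readings = reconstruct(scene, N)
for row in image:
    print("".join("#" if value < 0.5 else "." for value in row))
```

In this idealized model the reconstruction matches the scene exactly, but it needs one reading per pixel, which is why the low sensor rate matters so much. Real readings would be noisy, and patterns hidden in video content would require solving a harder inverse problem than this one-block-at-a-time scan.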
The low-resolution images (32x32) produced by this dual photography method are clear enough to show gestures such as two-finger scrolling or three-finger pinching. Because of the extremely low resolution, the technique only works on larger displays such as tablets and laptops. Phone screens are too small.
Its biggest drawback is its slow speed. The sensor can only record one pixel at a time, so generating a 32x32 image requires 1,024 readings at roughly five per second. In practical terms, this means it takes 3.3 minutes to generate an image using a static black-and-white pattern. Using the modified video method, it took 68 minutes.
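The capture time follows from simple arithmetic. As a rough check, the sketch below assumes a rate of exactly five readings per second (a round figure, not a measured one) and uses a hypothetical helper named `capture_minutes`.

```python
def capture_minutes(width, height, readings_per_second):
    """Minutes needed when each pixel costs one sensor reading."""
    return (width * height) / readings_per_second / 60


minutes = capture_minutes(32, 32, 5.0)
print(f"{minutes:.1f} minutes")  # prints "3.4 minutes"
```

That lands close to the reported 3.3 minutes, which would correspond to a slightly higher rate of about 5.2 readings per second.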
The upside is that this level of sluggishness makes the method too "cumbersome" to be an attractive attack vector for hackers. Needing anywhere from three minutes to over an hour to capture a single image is far too inefficient, so for now the method is only a proof of concept. An attacker would need a much faster sensor to obtain useful information this way.
Lukasz Olejnik, an independent security researcher and consultant, told IEEE Spectrum: "Collection times in minutes are too cumbersome to launch simple and widespread privacy attacks at scale. However, I would not rule out targeted collection of information to enable targeted actions against selected targets."
Even so, without a way to continuously capture multiple hand positions in a short period of time, it's impossible to obtain any useful information, such as a PIN or password.