Researchers from MIT, Microsoft, and Adobe collaborated to develop a “visual microphone”, which reconstructs the sounds in a room by measuring the tiny vibrations those sounds cause in everyday objects.
In one of the experiments, the system recovered intelligible human speech from the vibrations of a potato chip bag filmed from 15 feet away through soundproof glass. It also managed to reconstruct clear sounds from a sheet of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant.
“When sound hits an object, it causes the object to vibrate,” says Abe Davis, a graduate student at MIT and a member of the research team. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realize that this information was there.”
The researchers developed an algorithm that processes video of an object frame by frame, measuring tiny fluctuations such as changes in the color of each pixel. From these changes, the system infers how the object moves as it vibrates slightly under the influence of sound waves.
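The frame-by-frame idea can be illustrated with a deliberately simplified sketch. This is not the researchers' algorithm: it assumes grayscale frames stored as plain lists of rows, and it reduces each frame to a single number (its average brightness) so that the sequence of frames becomes a one-dimensional signal. The function names and the tiny sample frames are hypothetical.

```python
from statistics import fmean


def frame_brightness(frame):
    """Average pixel value of one grayscale frame (a list of rows)."""
    return fmean(value for row in frame for value in row)


def brightness_signal(frames):
    """Return one sample per frame: mean brightness minus the overall average."""
    levels = [frame_brightness(frame) for frame in frames]
    baseline = fmean(levels)
    return [level - baseline for level in levels]


# Hypothetical 2x2 frames whose brightness rises and falls slightly over time.
frames = [
    [[10, 10], [10, 10]],
    [[12, 12], [12, 12]],
    [[10, 10], [10, 10]],
    [[8, 8], [8, 8]],
]
signal = brightness_signal(frames)
print(signal)
```

A real system would have to track far subtler, local changes than a whole-frame average, but the sketch shows how a stack of images can be turned into a time series at all.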
Even using the images from an ordinary camera, which captures up to 60 frames per second, the algorithm can reconstruct the sounds accurately enough to reveal basic information, such as the number and gender of speakers.
The performance increases dramatically as the sampling rate, i.e. the number of frames recorded per second (fps), rises. In some of their experiments, the researchers used a high-speed camera operating at 2,000 to 6,000 fps.
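Why the frame rate matters can be shown with a little arithmetic. The sketch below assumes that each frame contributes exactly one sample of the vibration, in which case the sampling theorem limits the recoverable frequencies to half the frame rate. The function name is hypothetical, and the frame rates are the ones mentioned in the article.

```python
def max_recoverable_hz(fps):
    """Nyquist limit: highest frequency representable at one sample per frame."""
    return fps / 2


for fps in (60, 2000, 6000):
    print(f"{fps} fps -> up to {max_recoverable_hz(fps):.0f} Hz")
```

Under that assumption, a 60 fps camera reaches only 30 Hz, well below most of the energy in human speech, while 2,000 to 6,000 fps covers 1,000 to 3,000 Hz.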
The vibrations analyzed by the algorithm are invisible to the eye because their amplitude is around ten micrometers (a micrometer is one thousandth of a millimeter). Even in a close-up image, a displacement of ten micrometers corresponds to much less than one pixel.
The technical details of the algorithm will be presented at the SIGGRAPH conference. As for practical applications of the technique, the researchers admit that the first thing that comes to mind is… spying.
However, they have some other suggestions for the practical use of the algorithm. “We’re recovering sounds from objects. That gives us a lot of information about the sound that’s going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways,” says Davis.