A group of American researchers from Pennsylvania State University has discovered that the weak vibrations created by a phone’s speaker can be used to eavesdrop on conversations remotely.
In a new study, the scientists demonstrated that it is possible to transcribe phone calls using radar data captured at a distance of about 3 meters from the phone. The accuracy of this transcription remains limited: about 60% for a vocabulary of roughly 10,000 words. The study builds on a 2022 project in which the researchers used a radar sensor and voice recognition software to remotely recognize 10 words, letters, and numbers with an accuracy of 83%.
“When we talk on a cell phone, we tend to ignore the vibrations transmitted through the speaker that make the whole phone vibrate. If we pick up on these vibrations with remote radars and apply machine learning to recognize what is being said using contextual clues, we can recognize entire conversations,” explains the study’s first author, Suredai Basak.
Suredai Basak and his supervisor Mahant Gowda, a co-author of the study from the Department of Computer Science and Engineering, used a millimeter-wave radar sensor, the same type of device used in self-driving vehicles, motion sensors, and 5G wireless networks, to study the potential of compact radar-based devices. The researchers explored whether such devices could be made small enough to be built into everyday objects, such as ballpoint pens.
According to the researchers, the setup they developed is intended solely for experimentation and was created with the potential actions of attackers in mind. They adapted Whisper, a large open-source AI speech recognition model, to decode the vibrations into recognizable speech transcriptions.
“Over the past three years, there has been a huge breakthrough in artificial intelligence capabilities and open-source speech recognition models. We can use these models, but they are more focused on pure speech or everyday tasks, so we have to adapt them to recognize low-quality, noisy radar data,” emphasized Suredai Basak.
To move from noisy data to speech recognition without retraining the entire Whisper model, the researchers used a technique known as low-rank adaptation. This allowed them to specialize the model for radar data by retraining only 1% of Whisper’s parameters.
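To give a sense of why low-rank adaptation touches so few parameters, here is a minimal NumPy sketch of the general idea applied to a single weight matrix. It is not the researchers' code: the layer size, rank, and scaling factor are illustrative assumptions, and a real Whisper adaptation would apply this to many layers through a training framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only (not Whisper's real dimensions).
d_out, d_in, rank, alpha = 512, 512, 4, 8.0

W = rng.standard_normal((d_out, d_in))         # pretrained weight, kept frozen
A = rng.standard_normal((rank, d_in)) * 0.01   # small trainable matrix
B = np.zeros((d_out, rank))                    # trainable, zero-initialized so the
                                               # adapter starts as a no-op


def lora_forward(x, W, A, B, alpha, rank):
    """Frozen layer output plus a scaled low-rank correction B @ A."""
    base = x @ W.T                 # original model behavior
    update = (x @ A.T) @ B.T       # low-rank path: d_in -> rank -> d_out
    return base + (alpha / rank) * update


x = rng.standard_normal((2, d_in))             # two dummy input vectors
y = lora_forward(x, W, A, B, alpha, rank)

# Only A and B would be updated during fine-tuning.
trainable = A.size + B.size
fraction = trainable / (W.size + trainable)
print(f"trainable share of parameters: {fraction:.2%}")
```

Because only the two thin matrices are trained, the share of trainable parameters for this layer stays in the low single-digit percent range, which is the same order of magnitude as the 1% figure reported for the study.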
To record the vibrations, the researchers placed a millimeter-wave radar sensor a few meters away from the phone, where it picked up subtle vibrations of the phone’s surface as speech was played through the speaker. They then fed this radar signal into their modified version of the Whisper speech recognition model, which achieved up to 60% accuracy. The researchers emphasize that transcription accuracy can be increased by manual correction based on context, for example by adjusting certain words or phrases when something is already known about the conversation.
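The article does not describe the signal processing involved, but the general principle by which a radar senses tiny surface movements can be sketched. In the simplified model below, a displacement d of the reflecting surface changes the round-trip path by 2d, which shifts the phase of the reflected signal by 4πd/λ. The 77 GHz carrier, the sampling rate, the 200 Hz test tone, and the function name are all assumptions made for illustration, and the sign convention varies between radar systems.

```python
import numpy as np

C = 3e8  # speed of light in m/s


def phase_to_displacement(iq, carrier_hz=77e9):
    """Convert complex radar samples from one range bin into displacement (metres).

    Assumes the simplified model phase = 4*pi*d / wavelength + constant.
    """
    wavelength = C / carrier_hz
    phase = np.unwrap(np.angle(iq))            # remove 2*pi jumps
    return (phase - phase[0]) * wavelength / (4 * np.pi)


# Simulate a surface vibrating with 1 micrometre amplitude at 200 Hz.
fs = 10_000                                    # assumed sampling rate in Hz
t = np.arange(0, 0.05, 1 / fs)
true_disp = 1e-6 * np.sin(2 * np.pi * 200 * t)

wavelength = C / 77e9
iq = np.exp(1j * (4 * np.pi * true_disp / wavelength + 0.5))  # 0.5 rad static offset

recovered = phase_to_displacement(iq)
print("max reconstruction error (m):", np.max(np.abs(recovered - true_disp)))
```

In this noise-free simulation the displacement is recovered almost exactly. Real measurements would be far noisier, which is consistent with the researchers' description of the radar data as low-quality and with the limited accuracy they report.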
The authors of the study compared their method with lip reading. Although lip reading captures only about 30-40% of the words spoken, many lip readers use contextual clues to decipher enough information to take part in a conversation.
“The purpose of our work was to find out whether these tools could potentially be used by attackers to listen to phone conversations from a distance. Our results show that this is technically feasible under certain conditions, and we hope that this will raise public awareness so that people can be more careful during confidential calls,” the researchers conclude.
The results of the study were presented at this year’s ACM conference.
Source: TechXplore