
The future of autonomous cars is in jeopardy — AI models turned out to be sociopaths

Published by Oleksandr Fedotkin

A study conducted by Johns Hopkins University researchers has shown that humans outperform AI in accurately describing and interpreting social interactions in dynamic environments.

It is noted that this ability is crucial for technologies such as self-driving vehicles and robot assistants, which rely heavily on artificial intelligence systems to navigate safely in real-world environments. The authors of the study emphasize that existing AI models struggle to understand the nuances of social dynamics and the non-verbal cues needed to interact effectively with people. The results suggest that these limitations may stem from the very architecture of modern AI models.

«For example, the AI in a self-driving car needs to recognize the intentions, goals, and actions of drivers and pedestrians. You would want it to know which way a pedestrian is going to walk, whether two people are talking, or whether they are about to cross the street. Whenever you want AI to interact with people, you want it to be able to recognize what people are doing. I think this sheds light on the fact that these systems can’t do that right now», — explains the study’s lead author Leyla Isik, assistant professor of cognitive science at Johns Hopkins University.

To find out how close AI models come to human perception, the researchers had study participants watch short, three-second video clips in which people performed various tasks together or independently, demonstrating different aspects of social interaction. Participants were asked to rate the characteristics important for understanding social interaction on a scale from 1 to 5.

The researchers then asked more than 350 AI models, including large language models and generative video and image models, to predict how people would rate the short videos and how they would respond to them. The large language models were additionally asked to evaluate short, human-written captions for the videos.
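The comparison described above can be pictured with a small sketch: average the human ratings for each clip and correlate them with a model's predicted rating on the same 1-to-5 scale. This is only an illustrative outline under assumed data; the clip names, ratings, and the use of Pearson correlation are placeholders, not the study's actual pipeline.

```python
# Illustrative sketch only (not the study's code): compare a model's predicted
# clip ratings with averaged human ratings on the same 1-5 scale.
from statistics import mean

human_ratings = {            # hypothetical ratings from several participants per clip
    "clip_01": [4, 5, 4],
    "clip_02": [2, 1, 2],
    "clip_03": [5, 4, 5],
}
model_ratings = {            # hypothetical predicted rating from one model per clip
    "clip_01": 3.1,
    "clip_02": 2.4,
    "clip_03": 2.9,
}

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

clips = sorted(human_ratings)
human_avg = [mean(human_ratings[c]) for c in clips]
model_pred = [model_ratings[c] for c in clips]
print(f"model-human agreement: r = {pearson(human_avg, model_pred):.2f}")
```

A low correlation in such a comparison would indicate that the model's judgments of the clips diverge from the human consensus, which is the kind of gap the study reports.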

Most of the human participants agreed with one another on all of the questions, but the AI models did not. The video models were unable to accurately describe what the people in the clips were doing. Even the image models, which were given a series of static frames to analyze, could not reliably predict whether the people were talking to each other. The language models were better at predicting human behavior, while the video models were better at predicting neural activity in the brain.

«It’s not enough to just see an image and recognize objects and faces. That was the first step, and it took us a long way in AI development. But real life is not static. We need AI that understands what is happening in front of it. Understanding relationships, context, and the dynamics of social interaction is the next step, and this research suggests there may be a blind spot in the development of AI models», — says study co-author and PhD student Kathy Garcia.

According to the researchers, this is because AI neural networks are modeled on the parts of the human brain that process static images, which differ significantly from the brain regions that handle dynamic social interaction.

The scientists concluded that no current AI model can adequately respond to human behavior in a dynamic social environment. They note that existing AI models lack some fundamental component that allows the human brain to interpret dynamic social interaction quickly and accurately.

Source: SciTechDaily