OpenAI's GPT-4.5 artificial intelligence passes the Turing test «more often than real people», study finds

Researchers from the University of California, San Diego, have demonstrated for the first time that artificial intelligence systems can pass a three-way version of the classic Turing test.

The test is named after British mathematician and computer scientist Alan Turing. Turing suggested that one way to assess a machine's intelligence is to have it engage in a text conversation with a human who is simultaneously corresponding with another person who is out of sight. If the person asking the questions cannot tell which correspondent is the computer and which is the person, the machine can be said to think like a human being.

In particular, the test involved Meta’s Llama 3.1-405B AI model, OpenAI’s GPT-4o and GPT-4.5 models, and ELIZA, one of the world’s first chatbot programs, developed about 60 years ago by Joseph Weizenbaum. Among them, GPT-4.5 was perceived by test subjects as another real person in 73% of cases.

In the three-way version of the classic Turing test, each round of communication involved two people and one artificial intelligence system. The human interrogator had to determine which of the two conversation partners was an AI and which was another real person. Interrogators held two parallel dialogues, one with a human and one with an AI. For 5 minutes they could ask any questions and use any strategies to work out which was which, and then they had to vote.

The researchers tested each model in two basic system-prompt scenarios. In the first case, the AI was openly tasked with misleading the human participant into thinking they were communicating with a real interlocutor. In the second case, the AI was not given specific instructions on what to do, but was simply asked to imitate a young person interested in internet culture who uses slang in conversation.

The second scenario proved the most successful: the artificial intelligence systems performed better in it. In the scenario where AIs were openly tasked with misleading real interlocutors, GPT-4.5’s success rate was only 36%. While posing as a young person, the same model managed to mislead its real interlocutors and achieved a 73% success rate.

[Image caption: OpenAI's GPT-4.5 passes the Turing test «more often than real people», according to the study. Credit: Cameron R. Jones, Benjamin K. Bergen]

Meta’s Llama 3.1-405B achieved 56%, which is a very good result but still close to random guessing (50%). Interestingly, ELIZA’s result (23%) was better than GPT-4o’s (21%).
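To illustrate why 56% counts as close to guessing while 73% does not, here is a minimal Python sketch of an exact two-sided binomial test against a 50% chance baseline. The article does not report how many rounds each model played, so `HYPOTHETICAL_TRIALS = 100` is purely an assumption for illustration, and the function name `binomial_two_sided_p` is made up for this example. This is not the statistical method used by the study's authors.

```python
from math import comb


def binomial_two_sided_p(successes: int, trials: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null hypothesis (success rate p).
    """
    probs = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = probs[successes]
    # Small relative tolerance so that ties are counted despite float rounding.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# ASSUMPTION: the number of rounds is hypothetical; the article gives only percentages.
HYPOTHETICAL_TRIALS = 100

# Rates as reported in the article.
reported_rates = {
    "GPT-4.5": 0.73,
    "Llama 3.1-405B": 0.56,
    "ELIZA": 0.23,
    "GPT-4o": 0.21,
}

for name, rate in reported_rates.items():
    wins = round(rate * HYPOTHETICAL_TRIALS)
    p_value = binomial_two_sided_p(wins, HYPOTHETICAL_TRIALS)
    print(f"{name}: {wins}/{HYPOTHETICAL_TRIALS} judged human, p = {p_value:.4g}")
```

Under this assumed sample size, a 56% rate cannot be distinguished from a coin flip, while 73% lies far above chance and the 21% to 23% rates lie far below it. With a different number of rounds the p-values would change, so the output is only a rough intuition aid.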

«People were unable to distinguish people from GPT-4.5 and LLaMa. And 4.5 was even rated as human much more often than real people!», commented the study's lead author, Cameron Jones, a researcher at the Language and Cognition Laboratory at the University of California, San Diego.

Although the Turing test itself is largely outdated as a real-world measure of the cognitive abilities of modern artificial intelligence systems, this study clearly demonstrates how realistically advanced AI systems, trained on large volumes of human-written text, have learned to imitate us. Even if an artificial intelligence does not understand the essence of a question, it can already produce an answer plausible enough to make us believe we are talking to a human.

«I think the results provide more evidence that AI can replace humans in short-term interactions without anyone noticing. This could potentially lead to job automation, improved social engineering attacks, and more general social upheaval», Cameron Jones believes.


The study has not yet been peer-reviewed.

Source: Futurism
