A new study from a group of researchers has revealed that AI models have officially passed the Turing Test. Alan Turing, the developer of the Turing Test, originally called it an Imitation game, which tests a machine’s ability to exhibit intelligent behaviour equivalent to that of a human. Four AI models, including ELIZA, GPT-4o, LLaMa-3, and GPT-4.5, were tested out. According to the study, Cameron R. Jones and Benjamin K. Bergen from the Department of Cognitive Science at UC San Diego, two out of the four models have officially passed the test. If you are someone who wants to know which of the 4 AIs passed the test, the following article is for you.
AI Models Pass Turing Test: What Is The Turing Test?
Alan Turing developed the Turing Test in 1949, which was originally called the Imitation Game. He designed this test to determine whether a machine language can exhibit human-like behavior. The researchers did both the 2-party Turing Test and the 3-party Turing Test on these AI models. During the 2-party Turing Test, the participants talked to either an AI or a human. They then reveal, based on the conversation of whether they think they talked to AI or a human. During the 3-party Turing Test, the interrogators talk with both AI and a human, with both trying to convince the interrogators that they are not AI. In the end, the interrogators chose who they thought was a human.
PERSONA Vs. NO PERSONA
According to what the research revealed, the researchers conducted the test with two types of prompts for the AI. The first prompt is a basic NO PERSONA prompt, and the other is a PERSONA prompt. During the first prompt, the AI models were prompted to convince the interrogators that they are human in a Turing Test. However, in the second prompt, the researchers prompted the AI models on exactly what kind of persona to adopt during the test. The research papers also revealed that there were 254 participants who underwent 8 rounds of tests and filed an exit survey.
AI Models Pass Turing Test Results
The authors of the study released the results of their study through a detailed research paper. According to the results of the test, when prompted to adopt a human-like persona, GPT-4.5 was judged as a human 73% of the time. This is significantly more times the interrogators actually selected the real human participants. On the other hand, LLaMa-3.1 was termed human by the interrogators 56% of the time. The researchers concluded their detailed study explaining how these results constitute the first empirical evidence that any AI passes a standard three-party Turing test.
What Cameron Jones Said About The Research
Cameron Jones, a postdoc in the Language and Cognition Lab at UC San Diego, raised many important questions regarding the research. He took over his X account to explain how the study offered strong evidence that LLMs did pass the Turing test devised by Alan Turing for AIs reaching human intelligence back in 1950. Cameron mentioned that the Turing Test is pretty vague about exactly how it should be implemented. In another one of his many X posts, he also questioned whether LLMs really pass if they needed a prompt. He added that without a prompt, the LLMs would fail the test for trivial reasons.
Public Reactions
Public reactions flooded the internet once the study was made public. Gary Marcus, a cognitive scientist, stated the paper wasn’t entirely convincing, setting the bar artificially low. He added that the declaration of victory is therefore premature. Other users on the internet were also skeptical about the study and its results. One of the users stated how it would be a lot harder to pass the test if the setting were talking in depth about interests. Another user suggested the AI models should take the Voight-Kampff test. It is basically a fictional tool in the Blade Runner universe. The Blade Runners determine if the person is a human or a replicant in this test. It involves measuring emotional responses to provocative questions.