GPT-4.5 and LLaMa-3.1 Excel in Turing Test Performance

April 02, 2025
Advanced Large Language Models like GPT-4.5 and LLaMa-3.1 demonstrate a remarkable ability to mimic human conversation, with GPT-4.5 achieving a 73% win rate in the Turing Test.

LLMs and the Turing Test: GPT-4.5 and LLaMa-3.1 Performance

A recent study evaluated how Large Language Models (LLMs) perform in the Turing Test, focusing on GPT-4.5 and LLaMa-3.1. The results show how well these models can mimic human conversation.

Key Findings

  • GPT-4.5 with Persona Prompt: Achieved a win rate of 73%, meaning it was judged to be human 73% of the time. This was significantly above chance, and it means interrogators picked GPT-4.5 as the human more often than they picked the real human participant it was paired with.
  • LLaMa-3.1 with Persona Prompt: Achieved a win rate of 56%, which was not significantly different from the human baseline, indicating that interrogators could not reliably distinguish it from a human.
  • Baseline Models (ELIZA and GPT-4o): Performed significantly below chance, with win rates of 23% and 21% respectively, confirming that interrogators could easily identify these models as non-human.
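
To make "significantly above or below chance" concrete, the following is a minimal sketch of an exact two-sided binomial test against a 50% chance level. It is not the study's own analysis: the number of judgments per system is not given in this article, so `HYPOTHETICAL_TRIALS` is a made-up value used only for illustration, and the function and variable names are likewise assumptions of this sketch.

```python
from math import comb


def binomial_two_sided_p(wins: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value under the null win probability p0.

    Sums the probability of every outcome that is no more likely than the
    observed one.
    """
    pmf = [comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[wins]
    # Small tolerance so outcomes tied with the observed one are included.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# ASSUMPTION: the article does not report how many judgments each system
# received; 100 is a hypothetical figure chosen only to show the arithmetic.
HYPOTHETICAL_TRIALS = 100

# Win rates as reported in the Key Findings above.
REPORTED_WIN_RATES = {
    "GPT-4.5 (persona)": 0.73,
    "LLaMa-3.1 (persona)": 0.56,
    "ELIZA": 0.23,
    "GPT-4o": 0.21,
}


def summarize(win_rates: dict, trials: int) -> dict:
    """Map each system to its p-value against the 50% chance level."""
    return {
        name: binomial_two_sided_p(round(rate * trials), trials)
        for name, rate in win_rates.items()
    }


if __name__ == "__main__":
    for name, p in summarize(REPORTED_WIN_RATES, HYPOTHETICAL_TRIALS).items():
        verdict = "differs from chance" if p < 0.05 else "not distinguishable from chance"
        print(f"{name}: p = {p:.4g} ({verdict})")
```

Under this hypothetical sample size, the 73%, 23%, and 21% win rates fall well outside what chance would produce, while 56% does not. The actual p-values depend on the real number of judgments, which should be taken from the study itself.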

Implications

The results suggest that advanced LLMs like GPT-4.5, when prompted to adopt a humanlike persona, can pass the Turing Test. This has profound implications for debates on artificial intelligence, particularly regarding the social and economic impacts of systems that can convincingly mimic human behavior.

Study Details

The study ran two randomized, controlled, and pre-registered Turing Tests on independent populations. In each round, an interrogator held a 5-minute conversation with both a human and an AI system, then judged which of the two was human. The findings highlight the importance of persona-based prompts in making AI responses more human-like.

For more detailed information, see the full study, "Large Language Models Pass the Turing Test" (arXiv), listed under Sources.

Sources

Large Language Models Pass the Turing Test - arXiv We evaluated 4 systems (ELIZA, GPT-4o, LLaMa-3.1-405B, and GPT-4.5) in two randomised, controlled, and pre-registered Turing tests on ...
GPT 4.5 Passes the Turing Test: Study - Analytics India Magazine In a three-party Turing Test, an interrogator converses with both a human and a machine to accurately identify the human. The research tested ...
GPT-4.5 Passes Turing Test with Persona - AIbase GPT-4.5 achieved a 73% "human-passing" rate in both tests, exceeding human success rates (typically 60%-70%), becoming the first AI model to ...