“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time

Anthropic’s Claude 3 Opus large language model (LLM) has surpassed OpenAI’s GPT-4 on Chatbot Arena for the first time. The result drew attention on social media, with software developer Nick Dobos tweeting “RIP GPT-4.”

Chatbot Arena, run by the Large Model Systems Organization (LMSYS ORG), is a platform where users rate the outputs of two unlabeled LLMs. Those ratings are aggregated to determine the “best” models and populate the leaderboard. This matters to researchers, who struggle to measure the performance of AI chatbots because their outputs vary.

Claude 3’s rise has led some users to replace ChatGPT in their daily workflow, which could affect ChatGPT’s market share. Google’s Gemini Advanced is also gaining traction in the AI assistant space, adding to the competition OpenAI faces. Meanwhile, OpenAI is preparing to release a major successor to GPT-4 Turbo, possibly named GPT-4.5 or GPT-5, so further shakeups on the Chatbot Arena leaderboard are likely.
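The summary above does not say how Chatbot Arena turns individual votes into leaderboard positions. As a rough illustration only, here is a minimal Python sketch of one common way to rank contestants from pairwise comparisons, an Elo-style rating update. The function name `elo_leaderboard`, the parameters `k` and `start`, the placeholder model names, and the votes are all assumptions made up for this example. They are not LMSYS data and not necessarily the exact method LMSYS uses.

```python
from collections import defaultdict


def elo_leaderboard(votes, k=32.0, start=1000.0):
    """Turn (model_a, model_b, winner) votes into a ranked list of (model, rating).

    Illustrative Elo-style aggregation; k and start are assumed defaults.
    """
    ratings = defaultdict(lambda: start)
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a under the Elo logistic model
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        # Actual score for model_a: 1 for a win, 0 for a loss, 0.5 for a tie
        if winner == model_a:
            score_a = 1.0
        elif winner == model_b:
            score_a = 0.0
        else:
            score_a = 0.5
        # Move both ratings by the same amount in opposite directions
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    # Highest rating first
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes, invented for illustration only
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-z", "model-x"),
    ("model-x", "model-y", "tie"),
]
leaderboard = elo_leaderboard(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.0f}")
```

In this sketch a single vote moves two ratings by a small amount, so a model only overtakes another once many users have preferred its answers. Beating a higher-rated model earns more points than beating a lower-rated one.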
