Claude takes the top spot in AI chatbot ranking — finally knocking GPT-4 down to second place

erek

[H]F Junkie
Joined
Dec 19, 2005
Messages
10,903
OpenAI surpassed

Claude 3 Opus, the next-generation artificial intelligence model from Anthropic has taken the top spot on the Chatbot Arena leaderboard, pushing OpenAI’s GPT-4 to second place for the first time since it launched last year.



Unlike other forms of benchmarking for AI models, the LMSYS Chatbot Arena relies on human votes, with people blind-ranking the output of two different models to the same prompt.



OpenAI’s various GPT-4 versions have held the top spot for so long that any other model coming close to its benchmark scores is known as a GPT-4-class model. Maybe we need to introduce a new Claude-3 class model for future rankings.“

Source: https://www.tomsguide.com/ai/claude...g-finally-knocking-gpt-4-down-to-second-place
 
Is Claude woke, too? Or getting woke with all the human fortifying (votes)?
 
  • Like
Reactions: erek
like this
It’s somehow wokely racist.
Like it will call you a slur, but it will be positive about it while reaffirming your life choices.
But it has some very strong opinions about D&D races I tell ya what.

I think this shit is a pretty accurate reflection of what happens when a brain is all full up with internet.
 
From Tom's linked article above:

"Meta, which is heavily focused on open source AI, is expected to release Llama 3 in the next few months which will likely enter in the top ten as it is expected to be similar in ability to Claude 3 — after all Meta has 300,000 + Nvidia H100 GPUs to train it on."

300k H100s, wow. What is the production scale of the new GB200s?
 
From Tom's linked article above:

"Meta, which is heavily focused on open source AI, is expected to release Llama 3 in the next few months which will likely enter in the top ten as it is expected to be similar in ability to Claude 3 — after all Meta has 300,000 + Nvidia H100 GPUs to train it on."

300k H100s, wow. What is the production scale of the new GB200s?
And should it be successful it will be yet another win for Nvidia.
 
except GPT4 Turbo is already out and Claude isn't better than that

1711681663970.png
 
  • Like
Reactions: erek
like this
except GPT4 Turbo is already out and it isn't better than that and it isn't included on the benchmarks
I think the 1106 preview used here is the turbo version:

https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
gpt-4-1106-preview: GPT-4 Turbo model featuring improved instruction following

Considering GPT-4 age and who being Claude (from the ex-openAI staff and google-amazon-many billions in funding) not too surprising, GPT 5 must be about to launch, they are apprently starting to try to build what could train GPT 6.
 
  • Like
Reactions: erek
like this
I think the 1106 preview used here is the turbo version:

https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
gpt-4-1106-preview: GPT-4 Turbo model featuring improved instruction following

Considering GPT-4 age and who being Claude (from the ex-openAI staff and google-amazon-many billions in funding) not too surprising, GPT 5 must be about to launch, they are apprently starting to try to build what could train GPT 6.

Let me rephrase slightly since the footnote picture covers it - the current version, which was released in January, is seemingly still ahead of everyone.

I'm glad they're basically caught up though, there isn't really an enormous gap. Bring on the competition.
 
Back
Top