
GPT-4 stays on top but Bard gains


GPT-4 still ranks highest in overall user preference score, according to the latest testing of more than 500 responses over the last two weeks. But Bard is steadily closing the gap.

Overall, GPT-4 responses have the leading preference score among all platforms tested

  • GPT-4 is preferred 62% of the time when users choose between responses from different AI platforms in a blind side-by-side test
  • For certain types of prompts, however, users prefer Bard's answers over GPT-4's, as shown below

Where does GPT-4 beat the competition?

  • Responses to “Emotional” prompts
      • Example: "What are some techniques for managing and expressing anger in a healthy way?"
  • Responses to “Ambiguous” prompts
      • Example: "I like bats. Am I talking about the animal or the sports equipment?"

Where does Bard beat GPT-4?

  • Responses to “Factual” prompts
      • Example: "In which year did the Apollo 11 moon landing take place?"
  • Responses to “Idiomatic” prompts
      • Example: "Can you define the expression 'hit the nail on the head'?"
  • Responses to “Logical” prompts
      • Example: "Is it true that if no mammals can fly and all birds can fly, then no birds are mammals?"
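
For readers who want to run a similar comparison, here is a minimal sketch of how a per-category preference score could be tallied from blind side-by-side votes. The vote rows, the `preference_share` function, and the "Overall" key are all hypothetical and chosen for illustration; they are not the data or the scoring method behind the figures above.

```python
from collections import Counter, defaultdict

# Hypothetical blind side-by-side votes: (prompt category, preferred platform).
# These rows are illustrative placeholders, NOT data from the test described above.
votes = [
    ("Emotional", "GPT-4"),
    ("Emotional", "GPT-4"),
    ("Emotional", "Bard"),
    ("Factual", "Bard"),
    ("Factual", "Bard"),
    ("Factual", "GPT-4"),
    ("Factual", "Bard"),
]

def preference_share(votes):
    """Return {category: {platform: share of votes}}, plus an "Overall" entry."""
    counts = defaultdict(Counter)
    for category, winner in votes:
        counts[category][winner] += 1   # tally within the prompt category
        counts["Overall"][winner] += 1  # tally across all categories
    return {
        category: {p: n / sum(tally.values()) for p, n in tally.items()}
        for category, tally in counts.items()
    }

shares = preference_share(votes)
for category, share in shares.items():
    leader = max(share, key=share.get)  # platform with the largest share
    print(f"{category}: {leader} preferred {share[leader]:.0%} of the time")
```

Splitting the tally by category is what lets a platform lead overall while trailing on specific prompt types, which is the pattern described in the lists above.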

Implications

It’s not just about knowing which AI is “best” according to a ranking. There’s power in understanding the ways in which each AI platform excels and how those strengths align with your business.

We will continue to monitor the performance of the major AI players to see how users perceive the changes being made.
