Skip to content
Sign In Subscribe

Hallucinations: The impact to user experience

Photo by Aedrian / Unsplash

Hallucinations - an industry term used to describe the situation where an AI platform generates responses that are inaccurate or don't make sense but are presented as factual - is an area where platforms and the public may not see the same thing.

  • Only 5% of user interactions rated are seen as containing hallucinations. Most responses are rated as accurate (95%).
    That said, hallucination rates more than double for “experimentation” and “creative” types of prompts:
  • Experimentation: to learn what the AI can do. E.g., tell me what my meal plan should be if I want to grow muscle.
  • Creative: to create something new or unique. E.g., write me a song about fishing in the style of Michael Jackson.
  • In comparing platforms:
  • 17% of users flag GPT-4 responses as possible hallucinations around “creative” prompts.
  • Bard has the fewest reports of hallucinations in creative.

Implication

The gap around a platform’s ability to manage hallucinations will likely gain more attention in three areas:

1) The consumer knows the topic intimately (a plumber would more likely see a hallucination in a response about plumbing vs. aerospace).

2) In topics where higher accuracy is expected such as health, finance, and law.

3) Topics that are often targeted for misinformation campaigns and if any of that is creeping into the training data and responses.

Lastly, another factor impacting hallucination is the recency gap - or new news - as there are time-limits on training data (e.g., GPT-4 leverages training data prior to October 2021, so anything after that would not be included in their results at this time).

Comments

Latest

Closing the Loop: From User Feedback to Mobile App Excellence

Closing the Loop: From User Feedback to Mobile App Excellence

As a mobile app product manager, you know the value of understanding your users in real-time. Imagine having the ability to capture high-quality, in-the-moment insights that truly reflect what your users are experiencing and what they’re looking for in your app—all without the challenges of aligning teams or

Members Public