IQ uses a combination of two methods to understand how users interact with and perceive different AI platforms: in-the-moment user experiences and survey responses.
IQ research participants go through two parts:
- Part 1: Self-Directed (SD) Exercise
- Part 2: Side-by-Side (SBS) Comparison
Part 1 - Self-Directed (SD) Exercise
A user is invited to participate in the study on the Pulse Labs Power Portal. They are assigned one of five platforms to use: ChatGPT4, ChatGPT3.5, Bard, Bing, or Claude.
- Users are given 2-3 minutes to try out one or more prompts on their assigned platform, using the platform's native web experience. The type of prompt and question is up to the user; no suggestions are provided.
- After recording a video and uploading it to the Pulse Labs Power Portal, users answer 6-7 questions about their experience with their assigned platform, covering metrics such as satisfaction, NPS, choice drivers, type of query, and theme of query.
- On completing this, the user is directed to Part 2 of the study, the Side-by-Side Comparison.
Part 2 - Side-by-Side (SBS) Comparison
Here, each user is presented with blind responses from a random pair of the platforms being tested. The responses answer a series of specially chosen questions, and the user is asked to indicate which response they prefer.
- Users are presented with different pairs of responses to the same question
- They are asked to choose which one they liked better
- They are then asked to explain their preference through survey questions
- Each user sees 5 sets of questions across different themes
- The order of the questions and options is randomized to reduce bias
- Each platform is shown an equal number of times across users
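The randomization and balancing described above could be sketched as follows. This is only an illustrative sketch, not the study's actual assignment procedure: the function name `build_schedule`, the fixed seed, and the idea of drawing from a shuffled pool of all platform pairs are assumptions made for the example. It omits the choice of questions and themes.

```python
import itertools
import random
from collections import Counter

# Platform names taken from Part 1; the SBS line-up is assumed to be the same.
PLATFORMS = ["ChatGPT4", "ChatGPT3.5", "Bard", "Bing", "Claude"]

def build_schedule(platforms, n_users, questions_per_user, seed=0):
    """Hypothetical helper: assign platform pairs to users.

    Pairs are drawn from a shuffled pool of all possible pairs, so every
    platform is shown equally often once a whole pool has been used up.
    The left/right position of each response is randomized as well.
    """
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(platforms, 2))
    pool = []
    schedule = []
    for _ in range(n_users):
        user_pairs = []
        for _ in range(questions_per_user):
            if not pool:
                # Refill and reshuffle once every pair has been used.
                pool = all_pairs[:]
                rng.shuffle(pool)
            a, b = pool.pop()
            if rng.random() < 0.5:
                # Randomize which response appears first.
                a, b = b, a
            user_pairs.append((a, b))
        schedule.append(user_pairs)
    return schedule

# 4 users x 5 questions = 20 comparisons = two full pools of 10 pairs.
schedule = build_schedule(PLATFORMS, n_users=4, questions_per_user=5)
shown = Counter(p for user in schedule for pair in user for p in pair)
```

With five platforms there are ten possible pairs, so exposure is exactly balanced whenever the total number of comparisons is a multiple of ten; otherwise it is balanced only approximately.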
Preference Score Calculation on SBS
Based on the choice a user makes, a preference score is calculated for each platform. This is calculated as a percentage:
Preference score = (No. of times a platform was preferred) / (No. of times a platform was shown) x 100
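As a minimal sketch of this calculation, the snippet below tallies how often each platform was shown and preferred, then applies the formula. The record layout `(platform_a, platform_b, preferred)`, the function name `preference_scores`, and the sample records are assumptions made for illustration; they are not data from the study.

```python
from collections import Counter

def preference_scores(comparisons):
    """Return the preference score (in percent) for each platform.

    `comparisons` is an iterable of (platform_a, platform_b, preferred)
    tuples, one per side-by-side choice (hypothetical record layout).
    """
    shown = Counter()
    preferred = Counter()
    for a, b, winner in comparisons:
        if winner not in (a, b):
            raise ValueError(f"{winner!r} was not one of the options shown")
        shown[a] += 1
        shown[b] += 1
        preferred[winner] += 1
    # (times preferred) / (times shown) x 100
    return {p: 100.0 * preferred[p] / shown[p] for p in shown}

# Made-up records, for illustration only.
comparisons = [
    ("Bard", "Claude", "Claude"),
    ("Bard", "Bing", "Bard"),
    ("Claude", "Bing", "Claude"),
    ("Bard", "Claude", "Bard"),
]
scores = preference_scores(comparisons)
```

In this made-up sample, Bard and Claude are each shown three times and preferred twice, and Bing is shown twice and never preferred.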
- The preference score was calculated for each platform in the comparison. The scores were then analyzed by demographics, technographics, and type of question to look for trends and patterns.
- Significance tests were run on the data at a 95% confidence level.
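The text does not say which significance test was used. One common choice for comparing two percentages is a two-proportion z-test, sketched below under that assumption. The function name and the counts in the example are hypothetical, and the test treats the two samples as independent, which is only an approximation when the platforms were compared head to head.

```python
import math

def two_proportion_z_test(pref_a, shown_a, pref_b, shown_b):
    """Two-sided z-test for a difference between two preference rates.

    Assumed technique: the source does not name the test it used.
    Returns (z statistic, two-sided p-value).
    """
    p_a = pref_a / shown_a
    p_b = pref_b / shown_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (pref_a + pref_b) / (shown_a + shown_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    if se == 0:
        # Both rates are 0% or both are 100%: no detectable difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: preferred 120 of 200 times vs. 90 of 200 times.
z, p_value = two_proportion_z_test(120, 200, 90, 200)
significant = p_value < 0.05  # 95% confidence level
```

A p-value below 0.05 corresponds to a difference that is significant at the 95% confidence level.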