The Statistical "Which Character" Personality Quiz has been built out well beyond what the quiz itself requires, in order to produce an interesting and bespoke dataset. The hope is that this dataset will be useful for teaching statistics and data science using a topic that students should already be familiar with and find interesting: fictional characters.
The dataset is based around 2,000 objects (fictional characters) and 500 features (bipolar adjective pairs). Since 2019, millions of people have rated the fictional characters, themselves, their relationship to the characters, and the relationships between characters in a number of different formats across a number of different surveys.
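To make the object-by-feature layout concrete, here is a minimal Python sketch of a ratings table with one entry per character and adjective pair. Everything in it is an assumption made for illustration: the character names, the adjective pairs, the 0 to 100 scale, and the random values are placeholders, not taken from the actual data files or codebooks.

```python
import random

# Hypothetical placeholder names. The real dataset is described as having
# about 2,000 characters and 500 bipolar adjective pairs.
characters = ["character_A", "character_B", "character_C"]
adjective_pairs = [("shy", "bold"), ("messy", "neat")]

rng = random.Random(0)  # fixed seed so the synthetic values are reproducible

# ratings[character][pair] holds a synthetic rating on an ASSUMED 0-100 scale,
# where 0 stands for the left adjective and 100 for the right adjective.
ratings = {
    c: {pair: rng.uniform(0, 100) for pair in adjective_pairs}
    for c in characters
}

def most_extreme(pair):
    """Return the characters with the lowest and highest rating on one pair."""
    ordered = sorted(characters, key=lambda c: ratings[c][pair])
    return ordered[0], ordered[-1]

low, high = most_extreme(("shy", "bold"))
print(f"closest to 'shy': {low}; closest to 'bold': {high}")
```

With the real data, the same structure would simply be much larger, and a data-frame library would likely be a better fit than nested dictionaries.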
All of these datasets were collected with the same workflow. Users take the personality quiz, and at the end they are asked whether they would be willing to answer a research survey before viewing their results (about 40% do). Taking the quiz yourself is a good way to get a grasp of how this works.
There are several different datasets, each corresponding to a different supplemental survey. Each dataset includes the user's self-reports from the quiz (somewhat degraded for user protection), their responses to the supplemental survey, a few demographic questions that are more or less standard across surveys, and technical information about how that user interacted with the survey. Each dataset has a detailed codebook. Here is a general overview of the premise of each survey: