Ideation

Back in November of last year, I began work on the Paradox Engine. I was a couple months into my software engineering career, and it took getting a job for me to learn how to structure Python projects outside of Jupyter notebooks. Armed with this newfound power, I went out into the world to begin work on what has, so far, been the most fulfilling project I've worked on in my spare time.

I love Homestuck. I love that damn webcomic with those four kids and their weird internet friends and their magic video game. I love it enough that I've been on a Homestuck roleplaying server for literally the past decade. They let me run the place two years in. It's a setting that allows you to insert yourself pretty easily, even though one of the story's interpretations is "you really don't want to be here or see yourself as any of these people". Andrew Hussie got a degree in computer science and uses it as a "source of raw material for absurd kinds of exploitation", which shows up a lot through the early parts of the comic. I got into it when I was 11, right around when I started deleting META-INF and going into the appdata folder to install Minecraft mods, so it was only a short road to majoring in computer science down the line.

This first part doesn't really have anything to do with computer science. It's more like psychology. Everyone likes psychology! Imagine getting to know more about yourself, and other people in the most general sense. Imagine putting people in boxes, and then putting yourself in a box, and then saying the box you're in is better than the boxes other people are in. This is probably a base human drive stemming back thousands of years. Astrology, MBTI, Enneagram, classpecting. It's the natural progression.

Classpecting is the typology system present in Homestuck. Every player of SBURB, the aforementioned "magic video game" that creates a universe, gets a Title, or "classpect", that determines what powers the characters have and what role in the story they play. "Classpect" is a portmanteau of "class" and "aspect", where "class" describes the impact the character has on their aspect, and "aspect" describes a section of reality. For example, John Egbert, the main character of Homestuck, has the title "Heir of Breath", where "Heir" is the class and "Breath" is the aspect.

We don't know much about classpecting from the comic, and Andrew Hussie has since walked back explanations and definitions that did appear in it. We don't even know how many classes or aspects there are! It turns out making a system that can place every person neatly into 144 (or more) boxes is hard.

Hey, you know who does feel qualified to neatly categorize every single person into a box?

Scraping uQuiz

I'm an engineer. That means I solve problems.

My good friend made "which class are you" and "which aspect are you" quizzes. There are a great many of these, as any search engine will show, but I'm inclined to trust hers the most, considering she is the inspiration for this bot. She had already made a bot for alchemy by this point, but nothing for classpecting. These quizzes are on uQuiz, and asking her to give me the raw data for her quizzes is a big ask, so I decided to learn something new and scrape a quiz site that looks like it came straight from 2009.

If something is on the internet, it can be scraped. APIs exist specifically to satiate insane people like myself and deter us from taking down sites by accident while gathering data. Now that companies are using all that scraped data to build word-predictors that are surprisingly good at making money themselves, site-runners forgot this fact, and scrapers forgot how to build well-designed scrapers that don't crash sites, but scraping will happen anyway without active deterrence measures. There's one on this site. You probably saw an anime girl for half a second before reading this post. That's not really a moral statement, I just think you should see a cute anime girl and mine fake Bitcoin[1] before reading my post because it's funny.

The uQuiz API is an undocumented mess that mixes JSON and HTML. There's an API which gives limited info about a quiz, and then a static API that actually delivers quiz info. You get the quiz ID from the quiz link, then call /api/quizstatus/{quiz_id} to get the specific quiz version ID and total question count. https://uquiz.com/static/Quiz/sc/{quiz_id}/{quiz_version_id} returns JSON that lists every possible result ID you can get, and you can parse the HTML of https://uquiz.com/Result/static/lite/{quiz_id}/{quiz_version_id}/personality/{result_id} to see every result page and get the result's real name, for ease of reading and processing later.

Now you have the result names, and a mapping that turns result IDs into result names. https://uquiz.com/static/Quiz/qs/{quiz_id}/{quiz_version_id}/1/{total_question_count} returns JSON that contains HTML, one of the most disgusting API return structures I have ever seen or dealt with, that can be reconstructed into a list of questions that add "points" to results behind the scenes. Whichever result has the most points, based on the answers the user gives, wins. That's how uQuiz works.
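Here's a minimal sketch of that scoring rule, with no scraping involved. It assumes the quiz has already been flattened into a JSON-like structure; the field names (`results`, `questions`, `answers`, `points`) and the toy questions are made up for illustration and are not uQuiz's actual format. Ties go to whichever result picked up points first, which is also an assumption, since how uQuiz really breaks ties isn't covered here.

```python
from collections import Counter

# Hypothetical, hand-written stand-in for a locally saved quiz file.
QUIZ = {
    "results": {"r1": "Heir", "r2": "Thief", "r3": "Seer"},
    "questions": [
        {"text": "Question one", "answers": [
            {"text": "Option A", "points": {"r1": 2}},
            {"text": "Option B", "points": {"r2": 2}},
        ]},
        {"text": "Question two", "answers": [
            {"text": "Option A", "points": {"r1": 1, "r3": 1}},
            {"text": "Option B", "points": {"r2": 1}},
        ]},
        {"text": "Question three", "answers": [
            {"text": "Option A", "points": {"r3": 1}},
            {"text": "Option B", "points": {"r2": 2}},
        ]},
    ],
}


def score_quiz(quiz, choices):
    """Return the winning result name for one answer index per question."""
    if len(choices) != len(quiz["questions"]):
        raise ValueError("need exactly one choice per question")
    totals = Counter()
    for question, choice in zip(quiz["questions"], choices):
        # Each chosen answer adds its points to one or more result IDs.
        totals.update(question["answers"][choice]["points"])
    # max() keeps the first result that reached the top score (assumed tie rule).
    winner_id = max(totals, key=totals.get)
    return quiz["results"][winner_id]


heir_run = score_quiz(QUIZ, [0, 0, 0])   # always picks the first option
thief_run = score_quiz(QUIZ, [1, 1, 1])  # always picks the second option
```

With this toy data, always picking the first option lands on "Heir" and always picking the second lands on "Thief".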

To save uQuiz's servers (and my own ass), I'm not giving away the code for this, but I just told you how to do it and what the API endpoints are, and this took a night of reverse-engineering. Not an impossible project. At the end of this, I had two JSON files that allowed me to run the class or aspect quizzes on my own computer, and a Python file that lets me hotswap those quizzes with any of the myriad on uQuiz when I want different answers.

LLM (Lol, Literal Magic)

I've wanted to do classpecting ever since I found out you could generate coherent text with algorithms in 2018. Back then, my imaginary mechanism looked more like fine-tuning some model on 100 to 1000 crowd-sampled examples or desperately trying to train one from scratch, but now I'm going to hack one together from the large language models they let just about anyone have access to for some reason.

How do you use a quiz to give a character a classpect? Simple, just make an LLM pretend to be the character and then take the quiz. Close enough. I already have the quiz JSON, I just have to import it and have the LLM pick an option for each question. LLMs aren't very good at answering definitively, which is why OpenAI (and other model providers) made "structured output", which forces the LLM to pick from a set of preexisting options or follow a structure specified using JSON. To use this easily, you need Pydantic types. I need to specify these at runtime because that's when I import the JSON file to turn it into something usable. You're not supposed to specify models at runtime.

Guess what I did.
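For the curious, one way to build models at runtime looks roughly like this. It's a sketch, not the Paradox Engine's actual code: it assumes Pydantic is installed, the quiz layout and the names `QUIZ`, `build_answer_model`, and `QuizAnswers` are hypothetical, and the step where the model gets handed to an LLM provider is left out entirely. The idea is one required field per question, each typed as a `Literal` of that question's options, so anything outside the list fails validation.

```python
from typing import Literal

from pydantic import ValidationError, create_model

# Hypothetical stand-in for the imported quiz JSON; the real layout may differ.
QUIZ = [
    {"question": "Pick a weapon.", "options": ["Hammer", "Needles", "Sword"]},
    {"question": "Pick a time of day.", "options": ["Dawn", "Midnight"]},
]


def build_answer_model(quiz):
    """Create a Pydantic model at runtime with one Literal field per question."""
    fields = {}
    for index, item in enumerate(quiz, start=1):
        # Subscripting Literal with a tuple expands to Literal[opt1, opt2, ...].
        allowed = Literal[tuple(item["options"])]
        fields[f"q{index}"] = (allowed, ...)  # "..." marks the field as required
    return create_model("QuizAnswers", **fields)


QuizAnswers = build_answer_model(QUIZ)

# An answer set that only uses listed options validates fine.
good = QuizAnswers(q1="Sword", q2="Midnight")

# An option that is not in the quiz is rejected.
try:
    QuizAnswers(q1="Sword", q2="Noon")
    rejected = False
except ValidationError:
    rejected = True
```

A model built this way can then be passed wherever a structured-output API expects a Pydantic type, though the exact call depends on the provider and isn't shown here.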

The Engine

The Paradox Engine is a distributed computer that performs arithmetic on souls.

That's the lore, at least. Really, it's a Discord bot that retroactively justifies an answer given by a uQuiz.

This is how personality quizzes work. You pick all the answers to the questions, get your result, and then say "wow, that's so me!" and make up why you got to that result after the fact. In astrology, you only have to ask one question ("when were you born?") and you can get all sorts of numbers and charts that explain why you, or your jilted lover, act exactly the way you or they do. The size of the input is entirely uncorrelated to the size of the output. Personality justification is in the mind of the beholder. This is what LLMs are perfect for: generating plausible bullshit, much to everyone else's chagrin. Me? I love plausible bullshit, bring it on! I wrote a personality for a "robot on acid" and had it give a justification for why "it chose" the title it did for the inputted character, based on imported data about the class and aspect chosen. I took that info from another friend, Tamago, written back when we were both into classpecting way more than we should have been. With her permission, I edited it down for clarity and used it as the knowledge-base for this bot. Now I can put "RAG" on my resume![2]

Then I used a Python package for the Discord API and a janky nohup and disown command to run it in the background on some Linode server. This is still the best way to run a Python bot in the background, for some reason.

Reflections

The Paradox Engine was a hit, until my friend finally made her own classpect bot, which is quite frankly better than mine. Mine was biased toward a couple classes (Thief/Rogue) and aspects (Blood/Doom/Time) for unknown reasons; it could be the LLM, or it could be the quizzes themselves, whose point distribution would favor some results even if the LLM were picking answers uniformly at random. I don't feel like doing statistics to find out, so you'll have to bear with a lot of duplicates. At least you'll have justification for why these answers are 100% correct; after all, they were given to you by a distributed computer that performs arithmetic on souls. Supposedly.
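If someone did feel like doing the statistics, a cheap first step might be to answer the quiz uniformly at random a few thousand times and count which result wins. If that baseline is already lopsided, the quiz is at least partly to blame; if it's flat, suspicion shifts to the LLM. This is only a sketch: the toy quiz, the tie rule (first result to reach the top score wins), and the names `random_winner` and `baseline_distribution` are all hypothetical, and none of it has been run against the real quizzes.

```python
import random
from collections import Counter

# Hypothetical toy quiz: one list per question, one points dict per answer.
QUIZ = [
    [{"Thief": 1}, {"Heir": 1}],
    [{"Thief": 1}, {"Seer": 1}],
    [{"Thief": 1, "Heir": 1}, {"Seer": 1}],
]


def random_winner(quiz, rng):
    """Answer every question uniformly at random and return the winning result."""
    totals = Counter()
    for answers in quiz:
        totals.update(rng.choice(answers))
    # Assumed tie rule: the first result to reach the top score wins.
    return max(totals, key=totals.get)


def baseline_distribution(quiz, trials=10_000, seed=0):
    """Count how often each result wins when answers are picked at random."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    return Counter(random_winner(quiz, rng) for _ in range(trials))


baseline = baseline_distribution(QUIZ)
for result, wins in baseline.most_common():
    print(f"{result}: {wins / sum(baseline.values()):.1%}")
```

Comparing those percentages against how often the bot actually hands out each title would show whether the skew comes from the quiz or from the model.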

[2] "It's not RAG unless it uses a vector store!" Shut up, I did that too for something else.