Note: I wrote this article myself. Since English is not my first language, I ran it through ChatGPT to scan for grammar corrections and paraphrasing. I reviewed the output and made my own adjustments before publishing.
If you play Uma Musume: Pretty Derby, you know how deep the rabbit hole goes. Between tracking inheritance metas, calculating skill triggers, and preparing for the next Champions Meeting, the data overhead is insane.
As a developer, I had a rare weekend of free time and decided to build an AI tool to solve this. My first idea was a web app to analyze screenshots and predict win rates. But let’s be honest: manually uploading screenshots after every single run is painful, and building user interfaces is my day job, I didn't want to look at another button on my day off.
So, I built a zero-UI AI Discord bot instead. And I modeled her after Agnes Tachyon.
The Problem With "Free" AI
Instead of rigid slash commands, I wanted it to feel like you were actually talking to Tachyon. I set up the bot to listen for mentions, used Gemini to craft a system prompt that nailed her eccentric personality, and tried to host the whole thing for free on my own machine using Ollama and Podman.
That’s when things went sideways.
Running a local LLM in containers with default settings meant response times were agonizingly slow. It was completely unusable for a real-time chat. To fix it, I had to ditch the local setup and switch to the paid Gemini API, marking the first time I’ve ever actually paid out of pocket for AI tokens.
Watch the Full Build and See It in Action
In my latest video, I break down the exact technical journey, show you how I integrated structured Uma Musume data using AI tools, and show off how the bot actually performs in real-time.
If you want to see how I taught an AI to act like a mad-scientist horse girl (and why my local server almost melted trying to do it), check out the full video below:

