Why We're Building Driftora
Day 1 of a public research log on the KBO betting market.
I’ve played and followed football for more than twenty years. I know the game the way you only can after that long — the runs, the spacing, the moment a match tilts. And here is the embarrassing part: my predictions are almost always wrong.
Two decades of watching, and I still can’t reliably tell you who wins on Saturday. That bothered me far more than it should have.
So a while ago I tried to fix it the only way that made sense to me. I started collecting football data — building my own little datasets, trying to turn what I “knew” into something I could actually measure. I’m not a quant or an engineer; my background is in design. I’m just someone endlessly curious about how things work, drawn to problems where the reasoning is clear, where you can point at why something is true. I assumed football would be one of those problems.
It isn’t.
Why football broke my model
Here’s the wall I kept hitting. Imagine a striker rated 99 for pace. Now put a defender rated 99 for pace right in front of him. On paper, it’s a wash. But whether that striker actually gets through depends on the pass he receives, the link-up play around him, the angle of his shot, the defender’s positioning, the weather, how either of them slept last night — a stack of hidden variables sitting underneath two identical numbers.
The stats describe the players. They don’t explain the outcome.
For someone who likes things with a clear, defensible logic, football is almost cruel. The signal is real, but it’s buried under so many interacting variables that I could never isolate it. Twenty years of intuition, and a model I couldn’t make honest.
Why baseball, and why this log
So I changed the question. Instead of starting with the hardest possible sport, what if I started with one where the variables are more contained?
Baseball is that sport. It’s a sequence of discrete, mostly one-on-one events — pitcher versus batter, repeated thousands of times. The data is cleaner. The causes are easier to separate. It is, frankly, a far better place to learn whether you can model a sport at all before you go back and try the messy ones.
And rather than the leagues everyone already picks apart, I went to one closer to home and less exhaustively analyzed: the Korea Baseball Organization (KBO).
For a while now I’ve been building this quietly: a database, a way to turn bookmaker odds into clean probabilities, and a method for measuring how the market behaves. Driftora is where I stop building in private and start showing the work, in public, as it happens.
This is a research log. It will read like one.
What this is — and what it is not
Let me be clear about the thing I am not doing, because the internet is full of it.
I’m not here to sell picks. No “lock of the day,” no five-star plays, no screenshots of wins with the losses quietly deleted. That genre runs on confidence without evidence, and I find it exhausting.
What I’m actually interested in is narrower and, I think, more honest: how well does the KBO betting market price the games it offers? Bookmakers set odds. Those odds imply probabilities. The question is whether those implied probabilities are accurate — and if they drift from reality, whether the gaps are large enough, and stable enough, to mean anything.
That’s it. Before I ever claim to beat a market, I want to prove — out loud, with numbers — that I understand how it already behaves.
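To make "odds imply probabilities" concrete, here is a minimal sketch, not necessarily the method this log will end up using. It assumes decimal odds and a simple proportional normalization, which is only one of several ways to strip out the bookmaker's margin. The function name and the prices are hypothetical, chosen for illustration.

```python
def implied_probabilities(decimal_odds):
    """Turn decimal odds for one game into probabilities that sum to 1.

    Returns (probabilities, margin), where margin is how far the raw
    inverse odds exceed 1 (the bookmaker's built-in edge).
    """
    if not decimal_odds or any(o <= 1.0 for o in decimal_odds):
        raise ValueError("each decimal price must be greater than 1.0")
    raw = [1.0 / o for o in decimal_odds]  # raw implied probabilities
    total = sum(raw)                       # exceeds 1 when a margin is priced in
    probs = [p / total for p in raw]       # proportional normalization (an assumption)
    return probs, total - 1.0


# Hypothetical prices for a two-way moneyline, not real KBO odds.
home_odds, away_odds = 1.80, 2.10
probs, margin = implied_probabilities([home_odds, away_odds])
print(f"home {probs[0]:.3f}, away {probs[1]:.3f}, margin {margin:.3%}")
```

With these made-up prices, the raw inverse odds add up to a little over 1. After normalization the home side comes out near 54% and the away side near 46%, with a margin of roughly 3%.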
The one principle underneath all of it: data integrity first. Before there’s a model, there has to be data you can trust. Before there’s an edge, there has to be a baseline you can measure against. When something is unverified, I’ll say it’s unverified. When a sample is too small to conclude anything, I’ll say so. And when something fails, the failure becomes the next entry — because for an outsider learning in public, the failures are the most useful thing I can offer.
Where this goes
Over the coming weeks, I’ll work through it in order, showing the work at each step:
How accurately the market actually predicted its own outcomes — the bar to beat.
Whether the market is well-calibrated: when it says 60%, do those teams really win about 60% of the time?
Where it drifts — favorites, underdogs, upsets.
What features might carry real signal, tested one at a time.
And whether any of it survives honest, out-of-sample validation — the step where most models quietly die.
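The calibration question in the list above ("when it says 60%, do those teams really win about 60% of the time?") can be sketched as a simple binning exercise. This is a rough illustration under stated assumptions: equal-width probability bins, one row per team-game, and toy numbers invented for the example. The function name and data are hypothetical, and nothing here is a result.

```python
from collections import defaultdict


def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins and compare the average
    predicted probability with the observed win rate in each bin.

    Returns rows of (bin_low, bin_high, mean_prob, win_rate, n).
    """
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    bins = defaultdict(lambda: [0.0, 0, 0])  # [sum of probs, wins, count]
    for p, won in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the top bin
        bins[idx][0] += p
        bins[idx][1] += int(won)
        bins[idx][2] += 1
    rows = []
    for idx in sorted(bins):
        prob_sum, wins, count = bins[idx]
        rows.append((idx / n_bins, (idx + 1) / n_bins,
                     prob_sum / count, wins / count, count))
    return rows


# Toy data for illustration only: market probabilities and whether the team won.
market_probs = [0.62, 0.61, 0.65, 0.38, 0.35]
results = [1, 1, 0, 0, 1]
rows = calibration_table(market_probs, results)
for low, high, mean_prob, win_rate, n in rows:
    print(f"{low:.1f}-{high:.1f}: predicted {mean_prob:.3f}, observed {win_rate:.3f}, n={n}")
```

In a well-calibrated market the predicted and observed columns sit close together in every bin, provided each bin holds enough games. With five toy rows, the gaps mean nothing, which is exactly the small-sample caveat this log keeps returning to.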
I don’t know yet whether there’s an edge in this market. That’s the point. I’m going to find out in the open, and write down exactly what I find.
Signal over noise. Let’s begin.
This is the first entry in an ongoing research log. Next: how I built a research-grade database of KBO odds — and why clean data is harder than it sounds.

