The person and the philosophy behind the model
My name is Thomas. I studied mathematics, physics, and computer science, and that combination of disciplines shaped the way I look at pretty much everything — through numbers first, narrative second. A few years ago I found my way into women’s sports, initially out of curiosity, and quickly realized how much untapped analytical depth existed in leagues that receive far less modeling attention than the NFL or the NBA. Since then I’ve gotten involved in nearly every professional women’s league where a sportsbook will take your action.
I make my living betting on sports. That sentence can sound different depending on how you read it, so let me be clear about what it means in practice: I build statistical models, compare the probabilities they produce to the lines offered by sportsbooks, and place wagers when the numbers suggest an edge. There is no gut feeling involved, no “lock of the day,” and no chasing losses. It is a disciplined, math-driven process — the same kind of quantitative thinking you’d find in any data science role, just applied to a different market.
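To make the comparison between a model probability and a sportsbook line concrete, here is a minimal sketch in Python. It assumes American-style odds and uses hypothetical function names (`implied_probability`, `expected_value`); it is an illustration of the general idea, not the code that runs behind this site. Note that the implied probability still contains the bookmaker's margin.

```python
def implied_probability(american_odds: int) -> float:
    """Convert American odds into the break-even win probability.

    The result still includes the sportsbook's margin (the "vig").
    """
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def expected_value(model_prob: float, american_odds: int, stake: float = 1.0) -> float:
    """Expected profit of a bet, given the model's win probability."""
    if american_odds < 0:
        profit_if_win = stake * 100 / -american_odds
    else:
        profit_if_win = stake * american_odds / 100
    return model_prob * profit_if_win - (1.0 - model_prob) * stake


if __name__ == "__main__":
    odds = -150          # hypothetical line
    model_prob = 0.65    # hypothetical model output
    print(implied_probability(odds))
    print(expected_value(model_prob, odds))
```

A bet is only interesting when the model probability exceeds the implied probability, which is the same thing as the expected value being positive.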
Of all the sports I model, tennis holds a special place, largely because no other sport offers as many matches. The WTA and Challenger tours run nearly twelve months a year, which means there is almost always a live dataset to work with, a prediction to evaluate, and a feedback loop to learn from. That constant stream of data is invaluable for anyone trying to build and refine a predictive model: you don’t have to wait a week between sample points the way you would in football.
Tennis also happens to be one of the few sports where sportsbooks still allow meaningful bet sizes on the lines I’m interested in. Higher limits mean the edges I find can actually be realized at scale, rather than being capped at trivial amounts. Combine that with the individual nature of the sport — no lineup changes, no rotation decisions, fewer confounding variables — and you get a uniquely clean environment for statistical modeling.
At its core, DataDrivenPicks is built on a point-level Markov chain framework. Rather than trying to predict the final score directly, the model estimates the probability that each player wins a point on serve, then simulates the match forward through games, sets, and tiebreaks to arrive at match-level win probabilities, expected game totals, and spread lines. These serve probabilities are adjusted for surface, opponent strength (using a custom Elo system), and recent form.
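As a rough illustration of the point-level idea, the sketch below computes the probability that the server wins a single game from the probability `p` of winning a point on serve. It assumes points are independent and identically distributed, which is the textbook simplification; the function name `game_win_probability` is hypothetical, and the real model layers surface, Elo, and form adjustments on top before extending the same logic to sets, tiebreaks, and matches.

```python
from functools import lru_cache


def game_win_probability(p: float) -> float:
    """Probability the server wins a game, given point-win probability p.

    Assumes every point is independent with the same probability p.
    """
    q = 1.0 - p
    # From deuce the server must win two points in a row before the
    # returner does, which gives p^2 / (p^2 + q^2).
    deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win_from(server_pts: int, returner_pts: int) -> float:
        # 3-3 (40-40) is deuce; handle it in closed form.
        if server_pts == 3 and returner_pts == 3:
            return deuce
        if server_pts == 4:
            return 1.0
        if returner_pts == 4:
            return 0.0
        # One step of the Markov chain: the server wins or loses the point.
        return p * win_from(server_pts + 1, returner_pts) + q * win_from(
            server_pts, returner_pts + 1
        )

    return win_from(0, 0)


if __name__ == "__main__":
    for p in (0.50, 0.55, 0.60, 0.65):
        print(p, round(game_win_probability(p), 4))
```

The useful property is that small differences in point-level probability compound: a server who wins 60% of points wins noticeably more than 60% of games.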
Everything feeds from a database of point-by-point match data that I’ve built and maintain. The pipeline is fully automated: matches are scraped, parsed, and loaded; Elo ratings are updated; predictions are generated; and once a match finishes, the results are settled and the performance metrics you see on this site are recalculated. The goal is to remove as much human intervention as possible so the model can be evaluated on its own merits.
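For readers unfamiliar with Elo, the rating-update step in a pipeline like this can be sketched with the standard Elo formula. This is the generic version, not the custom system used here: the K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one match.

    k is an assumed K-factor; a production system would likely tune it.
    """
    exp_win = expected_score(winner, loser)
    delta = k * (1.0 - exp_win)
    return winner + delta, loser - delta


if __name__ == "__main__":
    print(update_elo(1500.0, 1500.0))
    print(update_elo(1400.0, 1600.0))  # an upset moves ratings further
```

Because the update is zero-sum, the points the winner gains are exactly the points the loser gives up.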
DataDrivenPicks exists to make this work transparent. You can see every prediction the model generates, track its accuracy over time, explore the Elo rankings that drive the inputs, and dig into the data behind the numbers. I’m not selling picks or running a subscription service. This is a window into a real, working model — the same one I use to bet with every day. If the numbers are good, they’ll speak for themselves. If they’re not, that will be visible too.
If you want to follow along or have questions, you can find me on X.