Recent
Why I stopped using logistic regression for churn
A few years ago, I built a churn model for a B2B SaaS product. Logistic regression, binary label, 30-day prediction window. It performed fine. The business used it. I moved on.
What bothered me was a question the model couldn’t answer: how long does a customer actually stay?
The Data Analyst's Survival Guide to the Agentic Era
·3 mins
I need to say something that makes some data analysts uncomfortable: the job is changing. Not disappearing — changing. And the analysts who understand the change will thrive. The ones who don’t will spend the next five years fighting it.
From Monolith to Modular: Rebuilding a Billing Data Pipeline From Scratch
Earlier this year I shipped a pipeline rewrite I’m genuinely proud of. It replaced a 2,200-line SQL monolith — one of those files that everyone’s afraid to touch — with a clean layered architecture that handles 14 products, runs daily, and can be extended by adding a handful of config files.
Open Source Won the AI Agent War — Here's What That Means for Data Teams
·2 mins
In January 2024, Hugging Face published a benchmark that most people in the data world missed. They compared open-source LLMs against GPT-3.5 and GPT-4 on agent tasks — using a dataset that requires web search and calculator use, the fundamentals of any analytics agent.
The Berkeley AI Lab Figured Out Why Analytics Agents Work (And It's Not About AI)
·3 mins
In February 2024, the Berkeley AI Research Lab published a paper that quietly explained everything. Not “how to build AI” — but why the move from single LLM calls to multi-component systems is inevitable. And once you read it, you see analytics differently.