A few years ago, I built a churn model for a B2B SaaS product. Logistic regression, binary label, 30-day prediction window. It performed fine. The business used it. I moved on.
What bothered me was a question the model couldn’t answer: how long does a customer actually stay?
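"How long does a customer stay" is a question about time-to-event, and the standard tool for it is a survival curve rather than a binary label. To make that concrete (this is my framing, not something the logistic model did), here is a from-scratch Kaplan-Meier estimator over toy tenure data — note how it keeps the still-active, right-censored customers that a 30-day binary label simply throws away:

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier estimate of S(t) = P(customer still active after t months).

    durations[i]: months customer i has been observed.
    churned[i]:   True if they left (event observed), False if still
                  active at the end of the data (right-censored).
    """
    curve, s = [], 1.0
    for t in sorted(set(durations)):
        n = sum(1 for dur in durations if dur >= t)   # customers at risk at t
        d = sum(1 for dur, ev in zip(durations, churned) if dur == t and ev)
        if d:
            s *= 1 - d / n                            # KM product-limit step
        curve.append((t, s))
    return curve
```

The toy data here is illustrative; in practice a library like lifelines handles ties, confidence intervals, and plotting.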
I need to say something that makes some data analysts uncomfortable: the job is changing. Not disappearing — changing. And the analysts who understand the change will thrive. The ones who don’t will spend the next five years fighting it.
In January 2024, Hugging Face published a benchmark that most people in the data world missed. They compared open-source LLMs against GPT-3.5 and GPT-4 on agent tasks — using a dataset that requires web search and calculator use, the fundamentals of any analytics agent.
In February 2024, the Berkeley AI Research Lab published a paper that quietly explained everything. Not “how to build AI” — but why the move from single LLM calls to multi-component systems is inevitable. And once you read it, you see analytics differently.
Monitoring availability metrics at scale creates a familiar problem: you have a time series, you need to know when it drops, and you need to know this automatically — without someone staring at a dashboard.
This post walks through a statistical algorithm I built to do exactly that. It detects dips in any continuous metric (availability, reachability, error rate) and returns precise start and end timestamps for each event. No ML required — just a modified z-score, two rolling windows, and a few transition rules.
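Before the details, here is the shape of the idea in code. This is a simplified sketch, not the full algorithm: it compresses the two rolling windows into a single trailing baseline window, and the window size, threshold, and minimum-length rule are illustrative placeholders.

```python
import statistics

def modified_z(value, window):
    """Modified z-score: distance of `value` from the window's median,
    in units of MAD. Robust to the very outliers we want to detect."""
    med = statistics.median(window)
    mad = statistics.median([abs(x - med) for x in window])
    if mad == 0:
        return 0.0  # flat window carries no signal; real code might fall back to mean absolute deviation
    return 0.6745 * (value - med) / mad

def detect_dips(series, baseline=30, threshold=-3.5, min_len=2):
    """Scan (timestamp, value) points; return one (start_ts, end_ts) per dip.

    Transition rules: a dip opens when the modified z-score of the current
    point, measured against a trailing `baseline`-sized window, falls below
    `threshold`; it closes when the score recovers, and is kept only if it
    lasted at least `min_len` points.
    """
    events, start_idx = [], None
    for i in range(baseline, len(series)):
        value = series[i][1]
        window = [v for _, v in series[i - baseline:i]]
        z = modified_z(value, window)
        if z < threshold and start_idx is None:
            start_idx = i                           # transition: normal -> dip
        elif z >= threshold and start_idx is not None:
            if i - start_idx >= min_len:            # drop one-point blips
                events.append((series[start_idx][0], series[i - 1][0]))
            start_idx = None
    if start_idx is not None:                       # dip still open at end of data
        events.append((series[start_idx][0], series[-1][0]))
    return events
```

The median/MAD pair is what makes this workable without ML: a dip cannot drag its own baseline down the way it would with a mean and standard deviation.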
Customer segmentation is one of those problems that sounds straightforward until you actually sit down with the data. In this post I’ll walk through an approach I built for segmenting customers based on their HTTP traffic patterns — the kind of traffic data that tells you not just how much a customer uses a service, but how they use it.
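The "how, not how much" distinction comes down to feature engineering before any clustering happens. As a minimal sketch (the bucketing rule and names are my illustration, not the post's actual features): turn each customer's raw request log into a traffic-mix profile, so two customers with the same usage shape land at the same point regardless of volume.

```python
from collections import Counter

def traffic_mix(requests):
    """Profile a customer's HTTP usage as shares per (method, path-prefix)
    bucket. Dividing by total volume separates usage *shape* from usage
    *size* — the "how" rather than the "how much"."""
    counts = Counter(
        (method, path.split("/")[1] if "/" in path else path)
        for method, path in requests
    )
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}
```

Vectors like these can then go into any off-the-shelf clustering step; the point of the sketch is only that the normalization, not the clusterer, is what encodes "how they use it."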