In August 2025, Meta published an engineering blog post that changed how I think about analytics agents. It’s called “Creating AI Agent Solutions for Warehouse Data Access and Security,” and it describes a multi-agent system they built for their internal data warehouse.
Most coverage focused on the security angle. I want to focus on something more fundamental: the cookbook they built for agents to follow.
## The Problem: Tribal Knowledge at Scale
Meta has thousands of data assets, hundreds of teams, and strict privacy requirements. The challenge is that knowing what data to use requires expertise that lives in people’s heads. Which tables are restricted? What are the approved alternatives? What does this data actually measure?
Traditionally, this knowledge was tribal — you asked your teammate. At Meta’s scale, that doesn’t work.
Their solution: encode tribal knowledge as text resources that agents can read. Every table has a summary. Every team’s data practices are documented as SOPs. Every access pattern is a retrievable resource. The LLM becomes a lookup mechanism for knowledge that was previously locked in human memory.
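A minimal sketch of what "tribal knowledge as text resources" could look like. The schema and table names here are illustrative, not Meta's; in practice the rendered summaries would be embedded and indexed in a vector store for semantic retrieval.

```python
from dataclasses import dataclass, field

@dataclass
class TableResource:
    """One retrievable text resource describing a warehouse table."""
    name: str
    summary: str                              # what the data actually measures
    restricted: bool                          # whether access is gated
    alternatives: list[str] = field(default_factory=list)  # approved substitutes

# Tiny in-memory "index" standing in for a vector store.
RESOURCES = {
    "user_events_raw": TableResource(
        name="user_events_raw",
        summary="Raw event log; contains user identifiers.",
        restricted=True,
        alternatives=["user_events_daily_agg"],
    ),
    "user_events_daily_agg": TableResource(
        name="user_events_daily_agg",
        summary="Daily aggregates, de-identified.",
        restricted=False,
    ),
}

def lookup(table: str) -> str:
    """Render the resource as text an agent can read in its context window."""
    r = RESOURCES[table]
    alts = ", ".join(r.alternatives) or "none"
    return (f"{r.name}: {r.summary} "
            f"Restricted: {r.restricted}. Approved alternatives: {alts}.")

print(lookup("user_events_raw"))
```

The point is that the agent never needs a teammate: the restriction status and the approved alternative travel with the table description itself.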
## The Architecture: Don’t Monolith Your Agent
The clearest insight from Meta’s design: they did not build one analytics agent. They built a multi-agent system with clear separation of concerns.
- Triage agent: understands the user’s intent
- Alternative-finder agent: knows all the non-restricted alternatives to the table you want
- Partial-preview agent: safely provides a small sample for exploration
- Access-request agent: drafts the formal permission request
- Owner agents: independently handle each team’s access approval workflow
Each agent is small, specialized, and testable. The complexity is in the coordination, not in any individual agent.
This maps directly to analytics: instead of one big “answer my analytics question” agent, build:
- A routing agent: classifies questions into security, performance, traffic, cost
- A security analyst agent: specializes in WAF, bot, DDoS data
- A traffic analyst agent: specializes in request volume, geography, path analysis
- An orchestrator: combines outputs
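The routing layer above can be sketched as follows. The keyword heuristic is a deliberately simple stand-in for an LLM classifier; the four categories come straight from the list.

```python
# Keyword-based router standing in for an LLM intent classifier.
# Categories mirror the list above: security, performance, traffic, cost.
ROUTES = {
    "security": ["waf", "bot", "ddos", "attack", "block"],
    "performance": ["latency", "ttfb", "cache", "slow"],
    "traffic": ["requests", "geography", "path", "volume"],
    "cost": ["bill", "egress", "spend", "cost"],
}

def route(question: str) -> str:
    """Classify a question into a sub-agent category."""
    q = question.lower()
    for category, keywords in ROUTES.items():
        if any(k in q for k in keywords):
            return category
    return "orchestrator"  # no clear match: let the orchestrator decide

print(route("Why did WAF blocks spike yesterday?"))  # security
```

Because each sub-agent only ever sees questions in its own domain, its prompt, tools, and evaluation set stay small, which is exactly the testability property the Meta design is after.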
## The Evaluation Flywheel
Meta’s most important engineering decision: they built the evaluation first.
Before shipping, they curated a dataset of real requests with verified outcomes. Evaluation runs daily, every agent decision is logged, and analysts can review decisions and provide feedback, which in turn updates the evaluation set.
This is a flywheel: more usage → more feedback → better evaluations → better agent → more trust → more usage.
The implication for your data team: if you skip evaluation, you’re not building a production analytics agent. You’re building a demo.
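A minimal evaluation harness in that spirit. The cases, the scoring rule (substring match), and the stand-in agent are all illustrative; the structure is what matters: a curated set of real questions with verified answers, run on every change, with failures logged for human review.

```python
# Curated (question, verified answer) pairs act as the regression set.
EVAL_SET = [
    {"question": "Which table holds de-identified daily events?",
     "expected": "user_events_daily_agg"},
    {"question": "Is user_events_raw restricted?",
     "expected": "yes"},
]

def fake_agent(question: str) -> str:
    """Stand-in for the real agent under test."""
    if "daily" in question:
        return "user_events_daily_agg"
    return "yes"

def run_eval(agent) -> float:
    """Run the full set; return the pass rate and log failures for review."""
    passed = 0
    for case in EVAL_SET:
        answer = agent(case["question"])
        ok = case["expected"].lower() in answer.lower()
        passed += ok
        if not ok:
            print(f"FAIL: {case['question']!r} -> {answer!r}")
    return passed / len(EVAL_SET)

print(f"pass rate: {run_eval(fake_agent):.0%}")
```

Every piece of analyst feedback becomes a new entry in `EVAL_SET`, which is what turns the flywheel: the evaluation gets harder exactly where the agent was weakest.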
## The Guardrail Architecture
Meta is explicit: LLMs cannot be trusted for risk decisions. They use:
- Rule-based risk computation (not LLM-based) as the final gate
- LLM for suggestion, rules for decision
- Audit logs for every agent action
- Budget limits (caps on how much data a single query can access per day)
For analytics agents, this translates to:
- LLM can suggest a WAF rule; human must approve it
- LLM can suggest a query; a guardrail must validate it (no `COUNT(*)` on sampled data)
- Every agent query is logged with its reasoning chain
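A rule-based query guardrail in that spirit. The specific rules (rejecting `COUNT(*)` over sampled tables, allowing only read-only statements) and the table names are illustrative; the key property is that deterministic rules, never the LLM, make the final call.

```python
import re

SAMPLED_TABLES = {"requests_sampled"}  # tables known to contain sampled data

def validate_query(sql: str) -> tuple[bool, str]:
    """Final gate on an LLM-suggested query: pure rules, no model involved."""
    lowered = sql.lower()
    # Rule 1: COUNT(*) over sampled data produces misleading totals.
    if re.search(r"count\(\s*\*\s*\)", lowered):
        for table in SAMPLED_TABLES:
            if table in lowered:
                return False, f"COUNT(*) not allowed on sampled table {table}"
    # Rule 2: only read-only statements pass the gate.
    if not lowered.lstrip().startswith("select"):
        return False, "only SELECT statements are permitted"
    return True, "ok"

ok, reason = validate_query("SELECT COUNT(*) FROM requests_sampled")
print(ok, reason)  # False, with the reason string
```

The rejection reason is returned rather than just a boolean so it can go into the audit log alongside the query and the agent's reasoning chain.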
## The Practical Takeaway
Meta’s recipe, adapted for your data team:
- Represent your data as text resources — table summaries, field descriptions, usage patterns — and index them in a vector store
- Build specialized sub-agents rather than one monolithic agent
- Never trust the LLM for risk decisions — add rule-based guardrails
- Build evaluation before shipping — curate real questions with verified answers
- Log everything — every agent decision should be reviewable
This is not experimental. Meta shipped it to 70,000+ employees. The patterns work.