Meta’s Hyperscale Efficiency: How AI Agents Drive Performance at Scale
Meta’s AI agent platform automates performance optimization at hyperscale, recovering hundreds of megawatts by encoding engineering expertise into agents that automatically detect and resolve regressions and act on optimization opportunities.
Meta’s Capacity Efficiency Program tackles the immense challenge of keeping performance high while minimizing power consumption across a hyperscale infrastructure serving billions. At its core is a unified AI agent platform that encodes the expertise of senior efficiency engineers into reusable, composable skills. These agents automate both the detection and resolution of performance issues, recovering hundreds of megawatts of power and shrinking what once took hours of manual investigation into mere minutes. This Q&A explores how the program works, the role of AI on both offense and defense, and where it’s heading.
What is Meta’s Capacity Efficiency Program?
The Capacity Efficiency Program is Meta’s systematic approach to optimizing performance across its global infrastructure. It aims to recover wasted power and compute resources by proactively finding opportunities for improvement (offense) and quickly catching and fixing regressions that degrade efficiency in production (defense). The program has already reclaimed hundreds of megawatts of power—enough to supply hundreds of thousands of homes for a year. To scale these efforts without proportionally increasing the engineering team, Meta built a unified AI agent platform that automates much of the investigative and remedial work, encoding deep domain knowledge into reusable, composable skills.

How do AI agents automate finding and fixing performance issues?
Meta’s AI agents are built on a platform that standardizes tool interfaces and encodes the expertise of senior efficiency engineers. These agents can autonomously diagnose performance regressions—compressing about 10 hours of manual investigation into roughly 30 minutes. They also fully automate the path from identifying an efficiency opportunity to generating a ready-to-review pull request. By composing reusable skills, the agents handle a growing volume of optimization wins that engineers would never have time to address manually. This automation lets the program scale megawatt delivery across more product areas without needing to proportionally increase headcount.
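The article does not publish the platform's actual interfaces, but the idea of standardized tools and composable skills can be sketched in miniature. In this hypothetical Python sketch, each "skill" wraps one diagnostic step behind a uniform interface, and an agent composes skills into an investigation; all names (`Skill`, `Agent`, the toy skills) are illustrative assumptions, not Meta's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Skill:
    """One encoded piece of expertise behind a uniform interface.

    Hypothetical: stands in for steps like pulling profiles,
    diffing flame graphs, or bisecting commits.
    """
    name: str
    run: Callable[[dict], dict]  # takes and returns a shared context dict

@dataclass
class Agent:
    skills: list = field(default_factory=list)

    def investigate(self, context: dict) -> dict:
        # Run each skill in order, accumulating findings in the
        # shared context so later skills can build on earlier ones.
        for skill in self.skills:
            context = skill.run(context)
        return context

# Toy skills standing in for real diagnostics.
fetch_profiles = Skill("fetch_profiles",
                       lambda ctx: {**ctx, "profiles": ["before", "after"]})
diff_profiles = Skill("diff_profiles",
                      lambda ctx: {**ctx, "suspect": "serialize_payload"})

agent = Agent(skills=[fetch_profiles, diff_profiles])
result = agent.investigate({"service": "web", "regression_pct": 3.2})
print(result["suspect"])  # the function the profile diff flagged
```

The uniform `dict -> dict` contract is what makes skills reusable across agents: any skill can be dropped into any investigation without bespoke glue.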
What is FBDetect and how does it help on the defense side?
FBDetect is Meta’s in-house regression detection tool that monitors production systems for performance drops. It catches thousands of regressions every week—each one a potential drain on power and efficiency. On the defense side, AI agents work with FBDetect to automate the root-cause analysis and mitigation of these regressions. Faster automated resolution keeps wasted megawatts from compounding across the fleet. Without AI, engineers would have to manually investigate each regression, which becomes unsustainable at hyperscale. FBDetect combined with AI agents forms a crucial part of the self-sustaining efficiency engine Meta is building.
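FBDetect's internals aren't described here, but the defense loop it feeds can be sketched: a detector emits regression events, and an agent triages each one, auto-proposing a fix for routine cases and escalating the rest. Everything below (`Regression`, `triage`, the threshold, the diff IDs) is an illustrative assumption, not FBDetect's real schema.

```python
from dataclasses import dataclass

@dataclass
class Regression:
    """Hypothetical regression event from a detector like FBDetect."""
    service: str
    cpu_delta_pct: float   # CPU increase versus the pre-regression baseline
    suspect_diff: str      # commit the detector correlates with the drop

def triage(reg: Regression, auto_fix_threshold: float = 5.0) -> str:
    # Small, well-attributed regressions get an automatic fix proposal;
    # larger or ambiguous ones go to an engineer with the agent's
    # analysis attached.
    if reg.suspect_diff and reg.cpu_delta_pct < auto_fix_threshold:
        return f"auto-fix: propose revert of {reg.suspect_diff}"
    return f"escalate: {reg.service} +{reg.cpu_delta_pct}% CPU"

print(triage(Regression("feed", 2.1, "D123")))   # routine case, handled by the agent
print(triage(Regression("ads", 9.4, "D456")))    # large regression, routed to a human
```

The point of the split is the one the article makes: at thousands of regressions per week, only the escalated tail should ever consume engineer hours.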
How do offense and defense strategies work together in this program?
Efficiency at hyperscale requires both proactive optimization (offense) and reactive regression management (defense). Offense involves searching for opportunities to make existing systems more efficient—like rethinking algorithms or reducing unnecessary computations—and deploying those changes. Defense uses tools like FBDetect to monitor resource usage in production and catch regressions that slip through. AI accelerates both: agents autonomously explore and implement offensive wins, and they quickly diagnose and fix defensive regressions. Together, these approaches ensure that Meta can continuously recover power and performance without needing engineers to spend hours on each investigation. The goal is a self-sustaining loop where AI handles the long tail of efficiency tasks.
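The offense/defense pairing described above amounts to one continuous loop: proactively scan for optimization wins while draining the queue of detected regressions. A minimal Python sketch, with stand-in functions that are assumptions rather than Meta's actual pipeline:

```python
def offense_scan() -> list[str]:
    # Stand-in for agents searching for optimization opportunities
    # and emitting ready-to-review pull requests.
    return ["PR: cache recomputed feature", "PR: remove dead code path"]

def defense_drain(regression_queue: list[str]) -> list[str]:
    # Stand-in for agents root-causing and mitigating detected
    # regressions (the FBDetect-fed defense side).
    return [f"mitigated: {r}" for r in regression_queue]

def efficiency_cycle(regression_queue: list[str]) -> list[str]:
    # One pass of the self-sustaining loop: both sides run every
    # cycle, so neither proactive wins nor regressions pile up.
    return offense_scan() + defense_drain(regression_queue)

print(efficiency_cycle(["feed +2% CPU"]))
```

Humans enter only when an item in either stream exceeds what the agents can handle, which is how the loop scales without proportional headcount.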

What concrete impact has the program achieved so far?
The Capacity Efficiency Program has recovered hundreds of megawatts of power—equivalent to the annual electricity consumption of hundreds of thousands of American homes. AI agents compress about 10 hours of manual regression investigation into 30 minutes, and they fully automate the creation of ready-to-review pull requests for efficiency opportunities. This automation allows the program to deliver increasing amounts of power savings across a growing number of product areas without proportionally scaling the engineering team. The platform now handles a high volume of optimization wins that engineers would never have gotten to manually, significantly extending the reach of the efficiency team.
What is the future vision for AI-driven efficiency at Meta?
The end goal is a self-sustaining efficiency engine where AI agents handle the long tail of performance issues autonomously. This means continuously detecting, diagnosing, and fixing both regressions and optimization opportunities without human intervention for routine cases. Engineers will be freed to focus on innovation rather than manual investigation. The platform is already expanding to more product areas every half (Meta’s six-month planning cycle), and as the agents become more capable, they will take on increasingly complex efficiency challenges. Ultimately, Meta aims to keep growing its infrastructure efficiently while minimizing both power usage and the need to scale the efficiency team’s headcount.