The Culture of Compute: Why AI Labs Waste Resources and How to Fix It

The AI landscape is booming, with numerous labs boasting ample funding and computational power. Yet, many struggle to deliver tangible results, leading to talent drain and a fraying organizational culture. Anshumita, CEO and founder of Amp, argues that the root cause is a misalignment between stated missions and actual actions, a problem that extends to the very infrastructure powering AI development.

Why AI Compute Is Being Wasted

A key indicator of inefficiency in AI compute lies in its utilization metrics. Node utilization, which measures the percentage of GPUs in a data center that are actively used, should ideally be at or above 95%. At Google, for instance, 95% utilization is considered an outage, with 96% being the standard. Many single-tenant clusters, however, fall short of this benchmark.

Beyond node utilization, MFU (Model FLOPs Utilization), the share of a cluster's theoretical peak throughput that a model actually achieves, should ideally be between 60% and 70% for best-in-class performance. The gap between these targets and what clusters actually achieve often stems from a leadership and alignment issue. When the entities funding and deploying compute clusters are not in sync with those managing and measuring their output, wastage compounds rapidly. This is exacerbated by the rapid scaling demanded in the AI space, where iterative, common-sense approaches are often abandoned for speed.
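As a rough illustration of how these two metrics are computed, here is a minimal sketch. The function names, the common approximation of about 6 FLOPs per parameter per token for training a dense transformer, and every number in the example are assumptions for illustration, not figures from the conversation.

```python
def node_utilization(busy_gpus: int, total_gpus: int) -> float:
    """Fraction of GPUs in the cluster that are actively running work."""
    if total_gpus <= 0:
        raise ValueError("total_gpus must be positive")
    return busy_gpus / total_gpus


def mfu(tokens_per_second: float, params: float,
        num_gpus: int, peak_flops_per_gpu: float) -> float:
    """Model FLOPs Utilization: achieved model FLOP/s over theoretical peak FLOP/s.

    Assumes roughly 6 * params FLOPs per token for dense transformer training
    (forward plus backward pass), ignoring attention FLOPs.
    """
    achieved_flops_per_second = 6.0 * params * tokens_per_second
    peak_flops_per_second = num_gpus * peak_flops_per_gpu
    return achieved_flops_per_second / peak_flops_per_second


if __name__ == "__main__":
    # Hypothetical cluster: 1,024 GPUs, of which 973 are busy.
    print(f"node utilization: {node_utilization(973, 1024):.1%}")
    # Hypothetical job: 7e9 parameters, 1e6 tokens/s, 256 GPUs at 400e12 peak FLOP/s.
    print(f"MFU: {mfu(1.0e6, 7e9, 256, 400e12):.1%}")
```

The two numbers answer different questions: node utilization says whether the hardware is occupied at all, while MFU says how much useful model computation the occupied hardware is doing. A cluster can score well on the first and poorly on the second.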

"AI scaling doesn't change the [...]. In fact, if anything, AI scaling should be putting a premium on the value of common sense and infrastructure because the margin of error now is so much lower and the cost of wastage is so much higher," Anshumita emphasizes. While AI capabilities are indeed novel, this doesn't excuse a disregard for fundamental infrastructure principles. The "move fast, break things" mantra of early tech companies needs to evolve into "move fast with responsible infrastructure."

Responsible Infrastructure and Data Center Backlash

The rapid expansion of data centers, crucial for AI compute, is increasingly facing community backlash. Scott Nolan, founder of General Matter, proposed a solution: allocate a portion of the marginal cost of compute per hour to the local community. For example, a $4/hour compute cost could be increased to $4.50, with the extra 50 cents directly benefiting the community. This approach would foster clear public benefit and enhance the perceived reliability of the compute.
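The arithmetic behind this proposal is simple enough to sketch. The $4.00 and $4.50 hourly rates come from the example above; the function name and the figure of one million compute hours are hypothetical, chosen only to show how the per-hour surcharge adds up.

```python
from decimal import Decimal


def community_share(base_rate: Decimal, surcharged_rate: Decimal,
                    compute_hours: int) -> dict:
    """Split a surcharged hourly compute price into its community portion."""
    per_hour = surcharged_rate - base_rate
    return {
        "community_per_hour": per_hour,
        "community_fraction": per_hour / surcharged_rate,
        "community_total": per_hour * compute_hours,
    }


if __name__ == "__main__":
    # Rates from the example in the text; the hour count is hypothetical.
    result = community_share(Decimal("4.00"), Decimal("4.50"), 1_000_000)
    print(f"community share per hour: ${result['community_per_hour']}")
    print(f"share of the hourly price: {result['community_fraction']:.1%}")
    print(f"total over all hours: ${result['community_total']:,}")
```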

Currently, up to 20% of data centers in the US are at risk of not securing community support, not just due to concerns about jobs, but also about power grid impact, environmental considerations, and permitting. A shift towards offering tangible benefits, such as reduced electricity costs for the community, could transform data centers from potential liabilities into welcomed partners. Failure to address these concerns could lead to regulatory scrutiny for those prioritizing speed over responsible development.

Anshumita advocates for partnering with established, reliable data center providers with long-term track records, rather than solely relying on newer "neo-clouds." These seasoned providers understand infrastructure management and have weathered economic cycles, offering a stability that is crucial for the long-term health of the AI ecosystem.

Amp Grid: Making FLOPs Flow Like Megawatts

Amp is building a "compute grid" with the ambitious goal of making FLOPs (floating-point operations) flow as seamlessly as megawatts. This involves a pooling and utilization layer across multiple clouds and silicon providers, aiming to address the current fragmentation and lack of fungibility in compute resources.

"We see ourselves as what's called an independent system operator," Anshumita explains, drawing a parallel to the historical development of the electric grid. Just as independent system operators (ISOs) coordinated power generation and transmission without owning the assets themselves, Amp aims to be a neutral coordinator for compute resources. This model has historically proven to be the most enduring, relying on long-term demand anchors from various industries to ensure base load capacity and flexible utilization.

The technical implementation began at the scheduling layer, leveraging expertise from Google's Borg cluster scheduler. The core principles revolve around abstraction and composition, bundling and unbundling, and horizontalization. Amp's grid aims to provide a guaranteed base load for its partners while allowing for flexible, on-demand spikes in compute capacity.

This approach mirrors the concept of interruptible demand, where jobs can be dynamically prioritized based on a credit system or bidding mechanism. While this system has proven effective internally at Google, Anshumita notes that its limitations might have contributed to Google missing out on opportunities like GPT.
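To make the idea of a guaranteed base load plus interruptible, bid-prioritized demand concrete, here is a minimal sketch of a single scheduling round. It is not Amp's or Google's implementation; the Job fields, the greedy placement rule, and the job names and numbers are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    gpus: int
    bid: float               # hypothetical credits offered per GPU-hour
    base_load: bool = False  # True for guaranteed, non-interruptible capacity


def schedule(jobs: list[Job], capacity: int) -> tuple[list[Job], list[Job]]:
    """Return (running, interrupted) for one scheduling round.

    Base-load jobs are placed first. Remaining capacity goes to interruptible
    jobs in descending bid order. Any job that does not fit is interrupted.
    """
    ordered = sorted(jobs, key=lambda j: (not j.base_load, -j.bid))
    running: list[Job] = []
    interrupted: list[Job] = []
    free = capacity
    for job in ordered:
        if job.gpus <= free:
            running.append(job)
            free -= job.gpus
        else:
            interrupted.append(job)
    return running, interrupted


if __name__ == "__main__":
    jobs = [
        Job("partner-pretrain", gpus=512, bid=0.0, base_load=True),
        Job("eval-sweep", gpus=256, bid=1.5),
        Job("ablation", gpus=256, bid=3.0),
        Job("finetune", gpus=128, bid=2.0),
    ]
    running, interrupted = schedule(jobs, capacity=1024)
    print("running:", [j.name for j in running])
    print("interrupted:", [j.name for j in interrupted])
```

In this toy round the base-load job is always placed, the two highest bidders fill most of the remaining capacity, and the lowest bidder is interrupted until capacity frees up. A production scheduler would also need to size the base load so that guaranteed jobs always fit, and to handle checkpointing and fairness, which this sketch ignores.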

Foundry, Frontier Labs, and Research Hoarding

Amp operates as a holding company with an infrastructure business (Amp Grid) and a capital business called Foundry. Foundry incubates and invests in new frontier AI labs, recognizing that significant research progress often occurs within existing large organizations. However, Anshumita points out a critical issue: the hoarding of research.

"So much of that has never seen the light of day," she laments regarding research from labs like DeepMind. Even when papers are published, there can be significant embargo periods, leading to an "adverse selection problem" where only less impactful research is publicly shared. This hoarding of knowledge has negative externalities, representing a market failure that needs to be addressed.

Gigawatt-Scale Compute and End-of-Life Prediction

Amp's ambition extends to securing a base load pool of 1.3 gigawatts of compute capacity, with an estimated need for 6 gigawatts over the next four years to support the advancement of frontier research. This compute is crucial for a wide range of applications, including scientific discovery and, notably, end-of-life prediction in healthcare.

Anshumita's personal journey into this field began during her graduate studies in bioinformatics at Stanford. She was struck by the cultural differences in approaching death, particularly the Western tendency to view it as a terminal event to be delayed at all costs. This often leads to aggressive, costly, and quality-of-life-diminishing end-of-life care, consuming a significant portion of healthcare spending.

The core problem, she explains, is the lack of precise prognostication. Physicians, fearing medical malpractice, often provide broad timeframes, leaving patients in uncertainty. An AI system capable of making more precise end-of-life predictions could empower patients to make informed decisions about their remaining time, leading to better palliative care and reduced healthcare costs.

While the technology for such AI systems exists, regulatory hurdles remain. The challenge lies in shifting the burden of clinical diagnosis from physicians to AI. Anshumita is now focused on incubating solutions to this problem, advocating for patient empowerment through AI-driven end-of-life prediction. This mission, alongside the development of "net positive data centers," forms the core of Amp's public benefit corporation structure.

Frontier Systems, Output Maxing, and Alignment

The discipline Anshumita is exploring, variously termed "frontier systems" or "frontier labs," can be encapsulated as "output maxing." It's about maximizing the effective use of available resources, whether it's GPUs or healthcare spending. This philosophy extends to AI development, where efficiency and nuanced approaches are paramount, even with the advent of powerful models like GPT.

"I think there's a philosophy I think we all owe it to ourselves to do output maxing with a new capability called AI on a global level," she states. From an engineering perspective, this translates to a focus on alignment. Full-stack alignment within an organization or system is incredibly difficult but yields immense benefits.

The challenge lies in scaling without losing this alignment. This can be achieved through standardization of protocols and API specifications that allow for lossless communication, or by developing entirely new capabilities that unlock such abundance that standardization becomes less critical.

Compute Markets, SF Compute, and Non-NVIDIA Chips

The burgeoning demand for compute has led to efforts like SF Compute, which aims to standardize futures contracts for compute. Anshumita sees such initiatives as crucial for accelerating the development of compute markets by injecting demand or supply shocks. Amp's grid aims to be a two-way protocol, allowing entities like SF Compute to seamlessly connect for either demand or supply.

The current compute market is characterized by explosive demand, with excess capacity quickly disappearing. This scarcity is driving innovation in alternative chip architectures beyond NVIDIA. Companies like Matrox are designing chips that adhere to NVIDIA's reference architecture, allowing them to plug into existing infrastructure and focus their innovation on systems co-design.

"The compute demand is so high. I think NVIDIA's not able to meet the demand of production. So we just need more chips," Anshumita observes. This collaborative approach, where companies leverage existing standards to innovate in specific areas, is seen as healthy for unlocking new bottlenecks.

Trust Boundaries, Co-Design, and Researcher CEOs

For chip teams to excel in co-design, they require early visibility into future model generations, as chip development takes years. Inside a large organization like Google, this visibility is possible because the chip and model teams sit within the same trust boundary. A startup that leaves that boundary loses this visibility, which introduces significant risk. Anshumita's work involves helping chip teams build comparable trusted relationships within the independent ecosystem.

She also champions the idea of "researcher CEOs," individuals who possess both deep scientific expertise and the leadership capabilities to drive innovation. She argues that the rigorous discipline required to excel in scientific research translates directly to the performance needed for great leadership. "Being a good CEO is hard. Being a great CEO actually requires a level of performance that scientists who have already published at the top of their field have accomplished," she asserts.

Rishi Valley, Singapore, and Money as a Measure

Anshumita's personal journey, from a spartan boarding school in India to a scholarship program in Singapore, has instilled in her a deep appreciation for resourcefulness and a nuanced view of money. Her upbringing at Rishi Valley, a school emphasizing simplicity and discipline, and her time in a cramped dorm in Singapore, have shaped her perspective.

"I don't need much to be happy in life," she reflects. For her, money is not an end in itself but a tool for pursuing meaningful missions. This perspective contrasts with the "mercenary" aspects of Silicon Valley, where large sums of money can sometimes lead individuals to lose sight of their core principles.

Closing Thoughts

The conversation highlights the critical importance of culture, alignment, and first-principles thinking in the AI era. As the field rapidly advances, the ability to manage resources efficiently, foster strong organizational cultures, and maintain a clear mission will be paramount for success. The challenges of compute scarcity, research hoarding, and the ethical implications of AI demand a thoughtful and disciplined approach, moving beyond mere competition to a focus on genuine leadership and impact.