Keynote Speaker
Esha Choukse
Principal Researcher, Azure Research — Systems (AzRS), Microsoft
From Models to Systems: Scaling Agentic and Multimodal AI through Cross-Stack Co-Optimization
Abstract
Recent advances in AI agents and multimodal foundation models are reshaping how intelligent systems perceive, reason, and act. Yet translating these capabilities into practical deployments, particularly in real-time, interactive, and resource-constrained environments, requires rethinking the traditional boundaries between machine learning models and computer systems. In this talk, I will present a cross-layer perspective on building next-generation AI systems, drawing on our recent work spanning scalable and reliable agentic workflows as well as real-time multimodal generation. Across these efforts, a recurring theme emerges: system performance is no longer determined solely by hardware efficiency or model quality in isolation, but by joint optimization across models, runtimes, compilers, and hardware platforms.
I argue that the next frontier in systems design lies in treating accuracy itself as a first-class systems knob, one that can be dynamically traded for latency, throughput, cost, and energy in principled ways. This shift requires new abstractions for reasoning about uncertainty, adaptive execution strategies, and scheduling mechanisms that co-optimize model behavior and system resources. I will highlight key challenges and opportunities at this intersection for the systems community.
Bio
Esha Choukse is a Principal Researcher in the Azure Research — Systems (AzRS) group at Microsoft. Her research focuses on efficient and sustainable AI across the computing stack, spanning AI platforms, hardware, and datacenter-scale infrastructure. She is a recipient of the ACM SIGMICRO Early Career Award for foundational contributions to hardware memory compression and to sustainable and efficient datacenter systems. Her papers have received three IEEE Micro Top Picks and an HPCA Best Paper Award. Several of her projects, including Splitwise and power stabilization in AI training datacenters, have had far-reaching impact on the research community and are deployed broadly across industry. Esha received her Ph.D. from The University of Texas at Austin in 2019 and has published extensively in leading venues including ISCA, ASPLOS, MICRO, HPCA, NSDI, and SC.
Schedule TBD
The detailed workshop schedule will be announced after paper notifications.
The workshop will feature:
- Invited keynote presentations
- Contributed paper talks (15 min presentation + 5 min Q&A)
Post-Workshop
A post-workshop summary paper capturing the insights and challenges discussed during the event will be released publicly, with open access, on arXiv or the ACM Digital Library.