Suvinay Subramanian
Planned Diffusion
Daniel Israel, Tian Jin, Ellie Cheng, Guy Van den Broeck, Aditya Grover, Suvinay Subramanian, Michael Carbin
ICLR 2026
Web site | X (Twitter) Thread
Characterizing VLA Models: Identifying the Action Generation Bottleneck for Edge AI Architectures
Manoj Vishwanathan, Suvinay Subramanian, Anand Raghunathan
CoDAIM Workshop, ASPLOS 2026
Spark Transformer: Reactivating Sparsity in FFN and Attention
Chong You, Kan Wu, Zhipeng Jia, Lin Chen, Srinadh Bhojanapalli, Jiaxian Guo, Utku Evci, Jan Wassenberg, Praneeth Netrapalli, Jeremiah J. Willcock, Suvinay Subramanian, Felix Chern, Alek Andreev, Shreya Pathak, Felix Yu, Prateek Jain, David E. Culler, Henry M. Levy, Sanjiv Kumar
NeurIPS 2025
FG-Attn: Leveraging Fine-Grained Sparsity In Diffusion Transformers
Sankeerth Durvasula, Kavya Sreedhar, Zain Moustafa, Suraj Kothawade, Ashish Gondimalla, Suvinay Subramanian, Narges Shahidi, Nandita Vijaykumar
arXiv, September 2025
Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding
Tian Jin, Ellie Y. Cheng, Zack Ankner, Nikunj Saunshi, Blake M. Elias, Amir Yazdanbakhsh, Jonathan Ragan-Kelley, Suvinay Subramanian, Michael Carbin
ICML 2025
[X (Twitter) Thread]
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving
Wenqi Jiang, Suvinay Subramanian, Cat Graves, Gustavo Alonso, Amir Yazdanbakhsh, Vidushi Dadu
ISCA 2025
Leveraging LLMs to Improve Hardware-Software Co-Design Workflow Productivity and Accessibility
Kavya Sreedhar, Josh Ogbonda, Pengqi Yin, Narges Shahidi, Kanthi Nagaraj, Zhijie Deng, Rami Cohen, Ton Kalker, Sameer Kumar, Amir Yazdanbakhsh, Suvinay Subramanian
ML for Computer Architecture and Systems (MLArchSys) Workshop, ISCA 2025
MIST: A Co-Design Framework for Heterogeneous, Multi-Stage LLM Inference
Abhimanyu Rajeshkumar Bambhaniya, Hanjiang Wu, Suvinay Subramanian, Sudarshan Srinivasan, Souvik Kundu, Amir Yazdanbakhsh, Midhilesh Elavazhagan, Madhu Kumar, Minlan Yu, Arijit Raychowdhury, Tushar Krishna
arXiv
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, Babak Falsafi, Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh
ICLR 2025 (Spotlight)
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Tian Jin, Ahmed Imtiaz Humayun, Utku Evci, Suvinay Subramanian, Amir Yazdanbakhsh, Dan Alistarh, Gintare Karolina Dziugaite
ICLR 2025
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, Tushar Krishna
CPAL 2025
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM Models
Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Suvinay Subramanian, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna
arXiv, June 2024
TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings
Norman P. Jouppi, George Kurian, Sheng Li, Peter Ma, Rahul Nagarajan, Lifeng Nai, Nishant Patil, Suvinay Subramanian, Andy Swing, Brian Towles, Cliff Young, Xiang Zhou, Zongwei Zhou, David Patterson
ISCA 2023 (Industry Track)
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
ASPLOS 2023
STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition
Yucheng Lu, Shivani Agrawal, Suvinay Subramanian, Oleg Rybakov, Christopher De Sa, Amir Yazdanbakhsh
ICML 2023
Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask
Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci, Tushar Krishna
SNN 2022
Harmonizing Speculative and Non-Speculative Execution in Architectures for Ordered Parallelism
Mark C. Jeffrey, Suvinay Subramanian, Victor A. Ying, Hyun Ryong Lee, Joel Emer, Daniel Sanchez
MICRO 2018
SAM: Optimizing Multithreaded Cores for Speculative Parallelism
Maleen Abeydeera, Suvinay Subramanian, Mark C. Jeffrey, Joel Emer, Daniel Sanchez
PACT 2017
Fractal: An Execution Model for Fine-Grain Nested Speculative Parallelism
Suvinay Subramanian, Mark C. Jeffrey, Maleen Abeydeera, Hyun Ryong Lee, Victor A. Ying, Joel Emer, Daniel Sanchez
ISCA 2017
[Slides]
MIT News Article, Hacker News Discussion
Data-Centric Execution of Speculative Parallel Programs
Mark C. Jeffrey, Suvinay Subramanian, Maleen Abeydeera, Joel Emer, Daniel Sanchez
MICRO 2016
Programmable Packet Scheduling at Line Rate
Anirudh Sivaraman, Suvinay Subramanian, Anurag Agrawal, Sharad Chole, Shang-Tse Chuang, Tom Edsall, Mohammad Alizadeh, Sachin Katti, Nick McKeown, Hari Balakrishnan
SIGCOMM 2016
Web site
Unlocking Ordered Parallelism with the Swarm Architecture
Mark C. Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, Daniel Sanchez
IEEE Micro 2016
Top Picks from the Computer Architecture Conferences
MIT News Article, EE Journal Article
A Scalable Architecture for Ordered Parallelism
Mark C. Jeffrey, Suvinay Subramanian, Cong Yan, Joel Emer, Daniel Sanchez
MICRO 2015
Selected for IEEE Micro's Top Picks special issue of "most significant papers in computer architecture based on novelty and long-term impact" from 2015
Towards Programmable Packet Scheduling
Anirudh Sivaraman, Suvinay Subramanian, Anurag Agrawal, Sharad Chole, Shang-Tse Chuang, Tom Edsall, Mohammad Alizadeh, Sachin Katti, Nick McKeown, Hari Balakrishnan
HotNets 2015
SCORPIO: A 36-core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering
Chia-Hsin Owen Chen, Sunghyun Park, Suvinay Subramanian, Tushar Krishna, Bhavya K. Daya, Woo-Cheol Kwon, Brett Wilkerson, John Arends, Anantha P. Chandrakasan, Li-Shiuan Peh
HotChips 2014
SCORPIO: A 36-core Research Chip Demonstrating Snoopy Coherence on a Scalable Mesh NoC with In-Network Ordering
Bhavya K. Daya, Chia-Hsin Owen Chen, Suvinay Subramanian, Woo-Cheol Kwon, Sunghyun Park, Tushar Krishna, Anantha P. Chandrakasan, Li-Shiuan Peh
ISCA 2014
MIT News Article
No Silver Bullet: Extending SDN to the Data Plane
Anirudh Sivaraman, Keith Winstein, Suvinay Subramanian, Hari Balakrishnan
HotNets 2013
[Slides]
Selected for the final round of the Qualcomm Innovation Fellowship
Web site
Single-Cycle Multihop Asynchronous Repeated Traversal: A SMART Future for Reconfigurable On-Chip Networks
Tushar Krishna, Chia-Hsin Owen Chen, Sunghyun Park, Woo-Cheol Kwon, Suvinay Subramanian, Anantha P. Chandrakasan, Li-Shiuan Peh
IEEE Computer, October 2013
SMART: A Single-Cycle Reconfigurable NoC for SoC Applications
Chia-Hsin Owen Chen, Sunghyun Park, Tushar Krishna, Suvinay Subramanian, Anantha P. Chandrakasan, Li-Shiuan Peh
DATE 2013
Theses