Subham Sahoo @ssahoo_
Pioneering Diffusion LLMs | Team Lead @mbzuai - IFM | PhD @cornell s-sahoo.com San Francisco, CA Joined June 2010
800 Tweets 4K Followers 81 Following 2K Likes
@iScienceLuvr Lol, same! I failed mine twice and got it on my third attempt only recently
📢June 29 (Mon): Nemotron-Labs-Diffusion: Unifying AR, Diffusion, and Self-Speculation 🌟Nemotron-Labs-Diffusion is a tri-mode language model (LM) that unifies AR, diffusion, and self-speculation decoding within a single architecture. 💪Trained with a joint AR-diffusion objective, Nemotron-Labs-Diffusion can switch modes to sustain high throughput across deployment settings and concurrency levels. Their study shows that 1️⃣AR and diffusion objectives are complementary: diffusion improves lookahead planning, while AR provides left-to-right linguistic priors. 2️⃣In self-speculation mode, diffusion drafts while AR verifies, outperforming multi-token prediction (MTP) methods in both acceptance rate and real-device efficiency. 3️⃣A speed-of-light analysis further demonstrates diffusion’s long-term potential, with up to 76.5% more tokens per forward pass than self-speculation under an optimal sampler. 📈Scaling to 3B, 8B, and 14B parameters, the Nemotron-Labs-Diffusion family, including base, instruct, and vision-language models, consistently outperforms state-of-the-art open-source AR and diffusion LMs in both accuracy and speed. ⚡For example, Nemotron-Labs-Diffusion-8B decodes 6× more tokens per forward pass than Qwen3-8B with comparable accuracy, translating to 4× higher throughput on SPEED-Bench with SGLang on a GB200 GPU. This Monday, Yonggan Fu (@YongganFu) from NVIDIA Research will present Nemotron-Labs-Diffusion: Unifying AR, Diffusion, and Self-Speculation.
Join us for three days of demos, talks, and surprises at ICML 2026 in Seoul! The IFM team is bringing an interactive experience with live K2‑series demos and Pan‑series world model previews to the expo floor. The Institute of Foundation Models is a global research lab, bringing our work to Seoul from our labs in Silicon Valley, Paris, and Abu Dhabi. Want to work with us? Scan the QR code to explore career opportunities. See you at booth B402! #IFM #ICML2026 #WorldModels #FoundationModels
@jxmnop @srush_nlp Normies like me would say “graduated with a PhD” but you do you
It’s a great team!
Now, imagine diffu..🤫🤐
what if you could instantly clone any object? Gemma 4 is now on @cerebras Inference, running up to 10x faster than GPUs (1,500 tokens/sec). Multimodal generations you can iterate on in real time :)
“Indian Software Engineer” 👹 intheweights.com
A technical dive inside our new "Midjourney Scanner"
@linqi_zhou I didn’t realize until now that you are at OpenAI, congrats to you too!
📢 June 15 (Mon): Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation 🤔 Discrete diffusion models are often trained through clean-data prediction, but the prediction can be used in different ways to define the reverse dynamics. In Masked Diffusion Models (MDM) these choices largely coincide, whereas in Uniform Diffusion Models (UDM) they do not. 💡 The authors show that the standard plug-in bridge parameterization for UDM is not optimized by the denoising posterior, but by a leave-one-out posterior that predicts each clean token without using its own noisy observation. This identifies a mismatch between the plug-in ELBO and the usual cross-entropy denoising objective. 🔧 The authors characterize the leave-one-out target and derive exact conversions between the denoiser, the leave-one-out posterior, and the score. These conversions allow them to disentangle parameterization and the training objective. 📈 Their results also lead to inference improvements without any additional training through an informed predictor-corrector sampler and improved temperature sampling based on the leave-one-out predictor. 🔧 The authors further introduce an absorbing-state reformulation of uniform diffusion that preserves the UDM joint law while decomposing it into masked-diffusion-like sampling operations, with simpler denoising posteriors, carry-over unmasking, and a natural remasking mechanism. 📈 On language modeling, leave-one-out parameterizations consistently improve UDM generation, while the absorbing construction matches or surpasses masked diffusion. These results suggest that the empirical gap between masked and uniform diffusion is driven less by the choice of marginals themselves than by parameterization and sampling design. 
This Monday, Samson Gourevitch (@samsongvch, samsongourevitch.github.io), Yazid Janati (@yjelid, yazidjanati.github.io), and Dario Shariatian (@dario_sha, darioshar.github.io) will present their paper "Uniform Diffusion Models Revisited: Leave-One-Out Denoiser and Absorbing State Reformulation".
Great work by @samsongvch @yjelid @dario_sha on Uniform-state Diffusion! Join us on Monday at 10 am PT
You have to be humble even when pursuing excellence. I think the arrogance with which Anthropic has pursued the latest release has universally landed poorly.
After DiffusionGemma dropped, views spiked on our @diffusion_llms Reading Group video: “The Diffusion Duality.” Video: youtu.be/FCO-nnqHOqQ?si… Join the Discord + mailing list: d-llms.com The diffusion duality: s-sahoo.com/duo/
Meet DiffusionGemma! An experimental open model that explores a fast approach to text generation, released under an Apache 2.0 license. Moving beyond sequential, token-by-token processes to generate entire blocks of text simultaneously. Here’s what’s new with DiffusionGemma: 👇
Want to understand what powers DiffusionGemma? Start with these two tutorials on its core building blocks: 1. Uniform-state Diffusion: youtu.be/FCO-nnqHOqQ?si… 2. Block diffusion: iclr.cc/virtual/2025/o…
Curious what’s under the hood of DiffusionGemma? Begin with these two tutorials covering its foundational building blocks: 1. Uniform-state Diffusion: youtu.be/FCO-nnqHOqQ?si… 2. Block diffusion: iclr.cc/virtual/2025/o…
Taero Kim @Gold_Milkyway
32 Followers 122 Following Ph.D student @ Yonsei University Research Interest: OOD Generalization, Causality, Frontier Architecture of LLM, Efficient LLM
Marcel B. @marcel_butucea
2K Followers 2K Following System Design & Architecture ⛷️ #iRresearcher #agoraphobic #theWorldisYourOyster
这是啥 @lxyrougher
0 Followers 28 Following
この世界はARG(... @ARG092532437977
207 Followers 3K Following
Agustino Thadeus @agustinothadeus
4 Followers 136 Following
Toni Sagayaraj @tonis_a_gayaraj
298 Followers 783 Following Systems Bio PhD student at @MoAlQuraishi @Columbia. 🏳️⚧️
John Doe @JohnDoeyvjs
1 Followers 286 Following
Weining Lin @WeiningLin011
7 Followers 85 Following PhD candidate of Computational Biology & Bioinformatics in UCL@CATH group. Deep learning for Protein design.
autodidac @autodidaclzfm
1 Followers 5K Following
astonishing_wolf @AstonishingWolf
4 Followers 256 Following
Ciência dos Dados @CinciadosDados1
38 Followers 2K Following AI Pro Expert - AI Specialist Training
Abhineet gupta @abhineetgupta24
15 Followers 282 Following
Natarajan Vaidhyanat @NVaidhyanat
5 Followers 381 Following
Niranjan C @niranjan_ai
8 Followers 383 Following
Georgios Batzolis @GBatz97
29 Followers 62 Following Postdoc at Cambridge working on diffusion models.
Benjamin Rozonoyer @rozonoyer96703
1 Followers 40 Following PhD student in ML & NLP @ UMass Amherst. Working on discrete diffusion models.
Gwangho Kim @Gimgwangho49306
1 Followers 37 Following
Peter Liang @HC_Liang
23 Followers 1K Following
shermineh @sherminehGhs
56 Followers 877 Following Final-year PhD student in Computer Science, focused on AI @TMU | Intern @Microsoft | LLMs Reasoning, RLVR, Agentic AI
Tanay @pursuitcurves
133 Followers 370 Following trying to max out all my life stats. xp shared with @Escapeplace__ blog: https://t.co/dT5TLub5r5
Luis Manrique @lluismanrique
431 Followers 2K Following a friend to autoregressive models. building https://t.co/myoYFiEmEC (YC S26)- prev member of technical staff @gumloop, AI/ML @instacart, @video_amp, @google
mos kim @kwanyoung_0
1 Followers 26 Following
Nehal Amin @NehalAmin03
10 Followers 1K Following Open-source AI & robotics enthusiast exploring the future of intelligence, space tech, internet systems, mobile computing, databases, and digital infrastructure
Andrew Bempah @KrumDonDada
25 Followers 4K Following Proud Kotobabian by way of Chicago & Stanford University
Lex Whalen @lxawha
10 Followers 64 Following Model Opt lead at SB Intuitions, Prev @NVIDIA, Georgia Tech
Christian T White @ChristianTWhit1
19 Followers 311 Following
Avihay Bar @AvihayBar
256 Followers 3K Following Software, Tech & Computer Graphics addict. Fluent in Python, Hand-waving and Shaders. Great people skills and a poor sense of humor. Views are my own.
Kevin 🇺🇸 Armstr... @armstrong_k
355 Followers 4K Following Husband, father, builder, and technologist. Synthesizing patterns and knowledge to solve interesting challenges.
Ryukijano 👾 @gyanateet
229 Followers 3K Following Likes data, structures , intelligence , vision and compute. Started as a gamer enjoying ML and RL, did some QC, moved to QC with ML and RL Grad@UniversityLeeds
Q Lee @qlee80
10 Followers 791 Following
Mohammad Niaz @Mohammad_Niaz94
106 Followers 1K Following PhD candidate w/ Clara Sanchez & @erikjbekkers. Interested in GenAI & Geometric Deep learning. 🇧🇩 (He/Him)
ѪՑխᶘⱮ @curieuseneus
108 Followers 7K Following
Goodnight @Mohammedarbi77
101 Followers 2K Following ML Nerd | @qdrant_engine Star | @Google DSC lead '23 Open to research & job opportunities
RB @theRachitBhatia
132 Followers 4K Following
Hoai @Hoai290401
25 Followers 1K Following
Peter Holderrieth @peholderrieth
5K Followers 561 Following CS PhD student at @MIT • Generative Modeling and AI4Science • Prev: Stats/Neuro @OxfordUni• Math at @UniBonn • Former: @AIatMeta
Sansa Gong @sansa19739319
602 Followers 365 Following Text Diffusion Models; PhD @hkunlp2020 Prev. @sjtu1896
Patrick Pynadath @PatrickPyn35903
203 Followers 279 Following Phd Student @purdue cs. working on making continuous gradients discrete
Dimitri von Rütte @dvruette
3K Followers 361 Following AI/ML research. prev. PhD @ETH_en, ML engineer @DeepJudgeAI
Julia Turc @juliarturc
24K Followers 705 Following Explaining AI on YouTube • YC S24 Founder • Ex-Google Research • Eastern-European nihilist & American optimist
Ravid Shwartz Ziv @ziv_ravid
12K Followers 4K Following AI researcher | Meta | NYU. Working on compression, representation learning, and memory. I have an AI podcast! https://t.co/Bzzp2OpwME
Xuezhe Ma (Max) @MaxMa1987
2K Followers 431 Following Research Lead @USC_ISI and Research Assistant Professor @CSatUSC PhD at CMU ML/NLP @LTIatCMU @CarnegieMellon
Sophia Tang @_sophia_tang_
2K Followers 130 Following CS + Stats @Penn M&T / Research in AI4Science and Generative Modeling / Writing at https://t.co/Z3ZLVrZeI6
Logan Kilpatrick @OfficialLoganK
338K Followers 3K Following Member of technical staff, working on Gemini, @GoogleAIStudio, the Gemini API, & Kaggle. My views!
Chieh-Hsin (Jesse) La... @JCJesseLai
3K Followers 489 Following 🇯🇵 Researcher @Sony | 🇹🇼 Visiting Asst. Prof. Applied Math, NYCU | 🇺🇸 PhD. Math UMN | ACs @NeurIPSConf, @icmlconf, @iclr_conf | AE @TmlrOrg | IEEE MLSP TC
Aran Komatsuzaki @arankomatsuzaki
181K Followers 374 Following Sharing AI research. Early work on AI (GPT-J, scaling, MoE). Ex ML PhD (GT) & Google.
Beff (e/acc) @beffjezos
244K Followers 3K Following founder @ e/acc // thermo king @extropic // Kardashev scaling is all you need
Zachary Horvitz @zachary_horvitz
605 Followers 903 Following CS PhD student at @Columbia. Prev @RadAI, @BrownUniversity
Georg Martius @GMartius
2K Followers 207 Following Researcher, interested in autonomous machine learning, reinforcement learning, robotics, 3d printing and more
amit @gravicle
13K Followers 5K Following you're not talking to someone who woke up a loser ceo @LumaLabsAI working on multimodal AGI | prev: built Vision Pro at | everything is figureoutable
Karsten Kreis @karsten_kreis
3K Followers 799 Following Principal Research Scientist at @NVIDIA | Former Physicist | Deep Generative Learning | Flows and Diffusion | Proteins and Molecules Opinions are my own.
Giannis Daras @giannis_daras
5K Followers 622 Following MIT CSAIL Postdoc 👨🎓 Ph.D. Computer Science @UTAustin 👨💻 Ex: @nvidia, @google, @explosion_ai, @ntua
Teortaxes▶️ (Deep... @teortaxesTex
68K Followers 3K Following We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization. @deepseek_ai stan #1, 2023–Deep Time «C’est la guerre.» ®1
LLM360 @llm360
3K Followers 76 Following LLM360 is an open research lab enabling community-owned AGI through open-source large model research and development.
eigenron @eigenron
15K Followers 2K Following founding @AntimLabs; prev: physics/math, superconducting qubits
Joshua Achiam @jachiam0
28K Followers 1K Following Freedom, flourishing, and abundance. Chief Futurist @openai. Main author of https://t.co/cKuSh21yaz
Alon Turing @chaumian
6K Followers 688 Following deceased gay man. invented compute then killed myself
Raghav Singhal @_rk_singhal
344 Followers 1K Following Thinking about diffusion models and ML for health. ML PhD at @NYU_Courant. Prev: @MSFTResearch, @RadAI. not on Forbes 30 under 30
John Thickstun @jwthickstun
2K Followers 650 Following Assistant Professor @Cornell_CS. Previously @StanfordCRFM @stanfordnlp @uwcse Controllable Generative Models. AI for Music.
Zhihan Yang @zhihanyang_
657 Followers 228 Following ML PhD Student @Cornell_CS in Generative Models. Intern @NVIDIA GenAIR. Prev Intern @IFM_MBZUAI. Co-host @diffusion_llms.
nie shen @nie_shen94817
14 Followers 35 Following
Siyan Zhao @siyan_zhao
3K Followers 800 Following CS PhD @UCLA | prev intern @AIatMeta, @Amazon | interested in RL, diffusion LLMs | bachelors @uoft
Caglar Gulcehre @caglarml
5K Followers 1K Following Scientist MAI, Prof @ EPFL, Ex-research scientist @ DeepMind, MSR, IBM Socials: https://t.co/qYmpvyUf0u, https://t.co/LZ5sWt7Asj
Arash Vahdat @ArashVahdat
11K Followers 912 Following Research Director, leading fundamental generative AI research (GenAIR) @nvidia research, volunteer at California Search & Rescue, views are my own.
Jaeyeon (Jay) Kim @Jaeyeon_Kim_0
945 Followers 372 Following Ph.D. student in CS @Harvard (currently interning at @tesla)
Luming Tang @lt453_
2K Followers 3K Following Senior Research Scientist @GoogleDeepMind, core contributor of Gemini Pretraining and Omni Post-training; Prev: PhD @CornellCIS, BS @Tsinghua_Uni
Prachi Badarayani @ShachiDeshpande
46 Followers 141 Following Applied Scientist@Microsoft. PhD from Cornell Tech, NYC. Interested in Causal Inference and Machine Learning
Jeremy Howard @jeremyphoward
320K Followers 7K Following 🇦🇺 Co-founder: @AnswerDotAI/@FastDotAI ; Prev: Professor@UQ; @kaggle founding president; founder @fastmail/@enlitic/… https://t.co/16UBFTX7mo
Linqi (Alex) Zhou @linqi_zhou
1K Followers 344 Following Research Scientist @openai. Prev: research scientist @LumaLabsAI, co-founder @apparatelabs (acq.), Ph.D. at Stanford University.
Guanghan Wang @Guanghan__Wang
319 Followers 307 Following Second-year CS PhD student @Cornell @Cornell_Tech; Prev undergrad @Tsinghua_Uni; Working on diffusion language models. email: [email protected]
Joey Bose @bose_joey
4K Followers 288 Following Assistant Professor @imperialcollege and @Mila_Quebec Affiliate member. Into Geometry ⋃ Generative Models ⋃ AI4Science. Ex-@UniofOxford, @Mila_Quebec, @UofT.
Yueying Li @lisali126
227 Followers 569 Following Ph.D. @Cornell Start new adventures at @MITCSAIL soon. Former SJTU @Umich @Apple Intel Labs @MSFTResearch
Simon Guo @simonguozirui
4K Followers 6K Following Beep Boop @thinkymachines | CS PhD student @Stanford | 🎓 @Berkeley_EECS
Jiaxin Shi @thjashin
5K Followers 361 Following Research @Meta MSL TBD | past @GoogleDeepMind @Stanford @MSRNE @VectorInst @RIKEN_AIP_EN @Tsinghua_Uni. Building probabilistic & algorithmic models for learning
Pranam Chatterjee @pranamanam
5K Followers 375 Following Generating biologics to program biology! 💻🧫 Assistant Professor at @Penn | Co-Founder @GametoGen and @UbiquiTxINC | Formerly @MIT '16 '18 '20 and @harvardmed
Wasu Top Piriyakulkij @topwasu
248 Followers 208 Following CS PhD student at @Cornell | Prev: Student Researcher @GoogleDeepMind

























