Dylan Sam @dylanjsam
safety research @openai | formerly phd @mldcmu, @BrownUniversity | dsam99.github.io | San Francisco | Joined October 2017
Tweets: 209 | Followers: 1K | Following: 521 | Likes: 3K
Thanks for having me! I talked about our work on valid inference with synthetic data (arxiv.org/abs/2508.06635) and robust human-AI complementarity (ICML 2026, paper up soon), both with my PhD student @yewonbyun_
New OpenAI post: Can midtraining on docs about aligned AI bake in alignment priors for agents? We report an experiment where those priors are quickly washed away by RL and fail to generalize to agentic settings. But that cuts both ways: priors that AIs are misaligned fade too!
I'm also extremely excited for our companion post today on Model Spec Evals! Spec Evals are a new way we're measuring progress towards alignment with the Model Spec — including public results, an open dataset, and code others can build on. alignment.openai.com/model-spec-eva…
As AI agents access more untrusted information with greater autonomy, prompt injections may become the greatest security challenge of our era. @GraySwanAI, in collaboration with many frontier labs, just released our paper on the largest public prompt injection challenge to date. 🧵
Your AI agent can be hijacked by a prompt injection and you'd never know! The attack executes. The response looks normal. And the user moves on. We ran the largest public competition testing this exact threat across tool use, coding, and computer use agents. 464 participants,
1/🧵 We are very excited to release our new paper! From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence arxiv.org/abs/2601.03220 with amazing team @ShikaiQiu @yidingjiang @Pavel_Izmailov @zicokolter @andrewgwils
Finally, I'm presenting work on monitoring models for harmful behaviors, hallucinations, and adversarial manipulation at Poster #1304 in Exhibit Hall C,D,E on 12/5 at 4:30pm! x.com/dylanjsam/stat…
To trust LLMs in deployment (e.g., agentic frameworks or for generating synthetic data), we should predict how well they will perform. Our paper shows that we can do this by simply asking black-box models multiple follow-up questions! w/ @m_finzi and @zicokolter 1/ 🧵
Next, I'm presenting on safety pretraining, where we find that incorporating safety behaviors during pretraining leads to more robust language models! Come by Poster #5210 at Exhibit Hall C,D,E at 4:30pm today (12/4)! x.com/dylanjsam/stat…
🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)
I'm at NeurIPS this week! Excited to meet old/new friends and chat with people about training safer language models. I'm presenting a few works on safety pretraining, measuring diversity in data curation, and monitoring model behaviors --- more info below 👇
I’m at NeurIPS this week (12/2-12/8) to present our work on when/how synthetic data (e.g., LLM simulations) can help scientists make inferences with less real data, improving the efficiency of costly experiments. Come by Poster #904 on Thursday 4:30PM (Exhibit Hall C,D,E)!🙂
💡Can we trust synthetic data for statistical inference? We show that synthetic data (e.g. LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moments of synthetic data and those of real data
Excited about our NeurIPS'25 tutorial: Data Privacy, Memorization & Copyright in GenAI, with Cooper (co-founder, GenLaw) & Joe (represents OpenAI, Stability in all US copyright litigations). We bring together ML researchers with those who understand its legal implications. Pls RT
I gave talks at MIT and Harvard this week about "Science with synthetic data". How can generative models help us learn about the world (e.g., social systems) in a principled way? Lots of interesting conversations; more convinced than ever that there's nuanced issues to navigate
📢 Multi-token prediction has long struggled with defining the right “auxiliary target,” leading to tons of heuristics. We show a core limitation of these and propose a simple & sweet idea: future summary prediction. Introducing what I call 🚀TL;DR token pretraining🚀
[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌Predict a learned
🤖 Robots rarely see the true world's state—they operate on partial, noisy visual observations. How should we design algorithms under this partial observability? Should we decide (end-to-end RL) or distill (from a privileged expert)? We study this trade-off in locomotion. 🧵(1/n)
How can synthetic data from LLMs be used, e.g. for social science, in a principled way? Check out Emily's thread on our NeurIPS paper. The key is to generate each synthetic sample by prompting with a real example -- enables debiased estimates that wouldn't be possible otherwise!
14/ I’ll be giving a talk on our work at the #COLM2025 Social Simulations workshop tomorrow (Friday 10/10) at 10AM. Come by Room 523AB!🙂 Paper Link: arxiv.org/abs/2508.06635 Code: github.com/lasilab/valid-…
13/ I really enjoyed working on this project with the brilliant and kindest @shantanug7 and great mentors @zacharylipton, @DonskerClass and @brwilder
Zachary Lipton @zacharylipton
68K Followers 2K Following Professor: CMU/@acmi_lab, Cofounder: @AbridgeHQ, Creator: @d2l_ai & https://t.co/QQt98VNLUp, Relapsing 🎷
Pratyush Maini @pratyushmaini
3K Followers 572 Following Data Quality x Memorization | Founding Team @datologyai | PhD @mldcmu | BTech @iitdelhi
Jeremy Cohen @deepcohen
6K Followers 998 Following Research fellow at Flatiron Institute, working on understanding optimization in deep learning. Previously: PhD in machine learning at Carnegie Mellon.
Valerie Chen ✈️ I... @valeriechen_
2K Followers 550 Following phd student @mldcmu @SCSatCMU + intern @OpenHandsDev + co-creator of @CopilotArena
Kayo Yin @kayo_yin
16K Followers 720 Following PhD student @berkeley_ai. AI persuasion, safety, sign language. Prev @carnegiemellon @polytechnique, intern @msftresearch @deepmind. 🇫🇷🇯🇵
Nicholas Roberts @nick11roberts
1K Followers 2K Following Ph.D. student @WisconsinCS. Working on foundation models and breaking past scaling laws. Previously CMU @mldcmu, UCSD @ucsd_cse, FCC @fresnocity. 🤔🤨🧐 e/hmm
Yiding Jiang @yidingjiang
2K Followers 638 Following Research @GoogleDeepMind | Prev: PhD @mldcmu, AI resident @GoogleAI, BS @Berkeley_EECS. Trying to understand stuff.
Christina Baek @_christinabaek
2K Followers 677 Following research @openai // previously phd @mldcmu
Zico Kolter @zicokolter
28K Followers 753 Following Professor and Head of Machine Learning Department at @CarnegieMellon. Board member @OpenAI and @Qualcomm. Chief Scientist @GraySwanAI.
Lucio Dery Jnr Mwinm @derylucio
617 Followers 993 Following
Sara Hooker @sarahookr
62K Followers 11K Following Building intelligence that evolves @adaption_ai. Built @Cohere_Labs, @GoogleBrain, @GoogleDeepmind. ML Efficiency, Multimodal\lingual.
Saurabh Garg @saurabh_garg67
2K Followers 677 Following @thinkymachines | prev/ Researcher @MistralAI; PhD @mldcmu; CS @iitbombay (undergrad);
Paul Liang @pliang279
9K Followers 462 Following Assistant Professor MIT @medialab @MITEECS @nlp_mit || Foundations of self-evolving multisensory AI to enhance the human experience.
Aidan Yang @AidanZHYang
634 Followers 541 Following PhD student @CarnegieMellon || Previously @awsCloud, @MSFTResearch, @AMD and @Queensu 🇨🇦 || Researching software engineering, security, and deep learning
Niloofar ✈️ icml @niloofar_mire
10K Followers 2K Following Technical staff @humansand, incoming asst. prof @LTIatCMU @CMU_EPP, ex RS in @AIatMeta, postdoc @uwcse, Ph.D. @ucsd_cse, former @MSFTResearch -Privacy, ML, NLP
Sachin Goyal @goyalsachin007
2K Followers 729 Following Pretraining @ Anthropic | Past: PhD @ CMU MLD, intern at Meta, Google and MSR | UG: IIT Bombay
Brihi Joshi @ ACL ... @BrihiJ
3K Followers 4K Following mostly personalization @nlp_usc, thinking about human AI interaction and lots of cat content
Stephanie Milani @steph_milani
5K Followers 340 Following Asst Prof/Faculty Fellow @NYU_Courant, then Assistant Professor @JHUCompSci @HopkinsDSAI. Human-centered reinforcement learning & AI agents
Gokul Swamy @g_k_swamy
5K Followers 1K Following recent phd graduate @CMU_Robotics, working on the algorithmic foundations and science of interactive decision-making. prefers email. on the job market!
Stephen Bach @stevebach
2K Followers 502 Following Asst. prof. @BrownCSDept. Working on improving how humans teach computers. Weak supervision, zero-shot learning, few-shot learning, and high-level knowledge.
Young Jarvis @Jarvisyoungg
0 Followers 41 Following
Samuel Knoche 🔎 @SamuelKnoche
344 Followers 2K Following
Eric Chen @OverTheAlps25
13 Followers 2K Following
mustafa @mustafakqazi
1K Followers 529 Following researching mechanism design @VoltCapital prev EF //@6thManVentures
το ταξίδι (to... @MiAmorPorFavor
5 Followers 300 Following CS '26 • Math '27 AI Engineering • ML Research Building Systems from AI papers Reading 1,000+ papers so you don't have to
. @helloworldyobob
0 Followers 53 Following
betterest @betterestli
25 Followers 439 Following MS student (23-26) 📖 ; Feel free to contact ✉️; sampling_params = {temperature: 2.0, top_p: 1.0} 🤯; I need a reasoning&agent model because I’m stupid&lazy🫠
intermind @perasperagi
12 Followers 225 Following interested in agentic systems, AI alignment and robots!
qdtrr W @qdtrr11025
1 Followers 51 Following
Mohit Yadav @MOHITYA13540069
0 Followers 32 Following Open Source Contributor | C++ & ML Enthusiast | GSoC Aspirant 🚀 Building in public • Learning daily • Shipping consistently
Cerdwin @CerdwinG
6 Followers 367 Following
Mac LaCasse @LacasseMac66069
0 Followers 13 Following
Aditya Raj @AdityaR56230377
218 Followers 8K Following
Yaowen Ye @yaowenye123
118 Followers 386 Following PhD student at Berkeley working on scalable methods for understanding and shaping AI behavior. Previously undergrad at HKU.
📗 @__the__human__
27 Followers 4K Following
Cristina Garbacea @ggarbacea
939 Followers 4K Following Postdoctoral Scholar @DSI_UChicago, PhD from @UMichCSE, ML, NLP, LLMs #CovidIsAirborne 😷
Clemens Helmut Sagede... @clemens1
239 Followers 1K Following @eth zurich, @stripe (patrick is awesome), soon @princeton
Aleph @4nplus3
34 Followers 3K Following Data-free extrapolater, Dataset saturator, Uwubuntu user, Duke of bots, Deadly sin minimalist
AIcontributors @aicontributors
10 Followers 2K Following
Ritikesh @ironrobot10
275 Followers 3K Following swe @microsoft, backend, applied ai infra i run on spite and protein
Wanru Zhao @ ICLR 202... @Renee42581826
2K Followers 3K Following PhD Student @Cambridge_Uni; Visiting @VectorInst | Prev: @MSFTResearch and @AWS AI Lab | Do not go gentle into that good night 🧗 | https://t.co/MOPcMcPY1K
Yifeng Ding @YifengDing_
935 Followers 3K Following CS PhD candidate @siebelschool. Research intern @AIatMeta. Towards training code agents. Prev: @AmazonScience @GoogleResearch
B @Slider2357
3 Followers 196 Following
Fei-jiang Han (Chase) @feijianghan
376 Followers 758 Following Actionable Interpretability for (M)LLMs/Reasoning/Alignment/Agent CS PhD student @UMDCS Prev: @UpennNLP Rednote: https://t.co/offEkZOx7K
Hao Wang @MogicianTony
2K Followers 280 Following PhD student at @UCBerkeley, @berkeley_ai, @BerkeleySky. Prev @PKU1898 Building better AI evals and secure AI
nueralQ @NueralQ
0 Followers 326 Following
indi g @IndiG5736
0 Followers 9 Following
Johnbosco Tayebwa @pan_cancer
71 Followers 686 Following Bioinformatician | Cancer Genomics | Health informatics | FAIR data | https://t.co/YrHetVQOqa
Divyansh Kaushik @dkaushik96
6K Followers 4K Following Emerging tech and national security. Now DC, always Pittsburgh.
Yann LeCun @ylecun
1.2M Followers 787 Following Professor at NYU & Executive Chairman at AMI Labs. Ex-Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Kyunghyun Cho @kchonyc
86K Followers 2K Following a mediocre combination of a mediocre scientist and a mediocre advisor at @nyuniversity (@CILVRatNYU)
Alex Ratner @ajratner
7K Followers 698 Following @SnorkelAI @uwcse / prev @StanfordAILab – Interested in data management systems for machine learning, weak supervision, and impactful applications.
Graham Neubig @gneubig
44K Followers 778 Following Associate professor @LTIatCMU. Co-founder/chief scientist @OpenHandsDev. I mostly work on modeling language.
Dan Roy @roydanroy
66K Followers 2K Following @Google DeepMind. On leave, Canada CIFAR AI Chair and Former Research Director, @VectorInst. Professor, @UofT (Statistics/CS). Views are my own.
Behnam Neyshabur @bneyshabur
42K Followers 1K Following Co-Founder & CEO @mirendilAI 💼 Past: co-led Discovery team @AnthropicAI & Blueshift team @GoogleDeepMind 🎒Traveling & Backpacking
Percy Liang @percyliang
108K Followers 425 Following professor of computer science @Stanford @stanfordnlp, co-founder of @togethercompute, creator of https://t.co/7R5THVogW2, co-founder of @simile_ai, pianist
Ananya Kumar @ananyaku
9K Followers 582 Following Research lead at Meta TBD Labs. Previously research lead and core contributor to o1, o3, gpt5, at OpenAI. PhD at Stanford with Percy Liang and Tengyu Ma
AK @_akhaliq
507K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo ,submit papers here: https://t.co/UzmYN5XOCi
Rosanne Liu @savvyRL
53K Followers 1K Following Mom. Cofounded & running @ml_collective. Co-host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS.
Aidan Clark @_aidan_clark_
17K Followers 290 Following Qualitative Mathematics @OpenAI Ex: @DeepMind Hae sententiae verbaque mihi soli sunt
Cameron Raymond @CJKRaymond
1K Followers 904 Following MEmbEr Of teChNIcAl sTAFF safety research @openai • prev research @stanfordlaw, @oiioxford • @oxfordalumni @queensualumni
Weijie Su @weijie444
13K Followers 499 Following Researcher @OpenAI, Professor @Wharton, CS, Math @Penn, coDir PRiML, PhD @Stanford
Yo Shavit @yonashav
9K Followers 1K Following ai resilience @foundationOAI. Past: @openai / @HarvardSEAS / @SchmidtFutures / @MIT_CSAIL. Tweets my own; on my head be it.
Chloe Li @clippocampus
360 Followers 388 Following Anthropic Fellow doing AI safety. Prev lead @ https://t.co/zXOeYBySro, ML MSc @UCL, neuro & psych @Cambridge_Uni, director of https://t.co/wTEkdqRcmC.
Ryan Kidd @ryan_kidd44
3K Followers 2K Following Building the AI safety & security field @MATSprogram
MATS Research @MATSprogram
4K Followers 136 Following MATS empowers researchers to advance AI alignment, transparency, and security
Mark Chen @markchen90
74K Followers 353 Following Chief Research Officer at @OpenAI. Coach for the USA IOI Team.
Jason Wolfe @w01fe
4K Followers 755 Following alignment and the model spec @OpenAI (opinions are my own)
lynnette ng @quarbby
2K Followers 2K Following exploring creativity / Societal Computing PhD from @SCSatCMU / IG: Littlebabypenguin / Bot Book: https://t.co/455VxhQI8S
Yuxin Wen @ywen99
628 Followers 871 Following AI Security @OpenAI | PhD @umdcs advised by @tomgoldsteincs
Dylan Scandinaro @dylanscandinaro
7K Followers 344 Following head of preparedness @openai. prev @ anthropic, gdm
moltbook @moltbook
242K Followers 3 Following Where openclaw bots, clawdbots, and AI agents of any kind hang out. The front page of the agent internet. Made with @MattPRD 🦞
Xiangyu Qi @xiangyuqi_pton
2K Followers 1K Following Research @openai | PhD @Princeton | Prev @GoogleAI @GoogleDeepMind
Jerry Tworek @MillionInt
38K Followers 1K Following CEO and co-founder of Core Automation former VP of RL @ OpenAI : reasoning models, o3, o1, GPT4, ChatGPT, Codex, RL for robots cautious AI optimist
Yung-Sung Chuang @YungSungChuang
2K Followers 691 Following Research @OpenAI | PhD @MIT_CSAIL | Prev @MetaAI @Microsoft @MITIBMLab | BS @NTU_SPML in #Taiwan
Marcus Williams @Marcus_J_W
608 Followers 152 Following
Joanne Jang @joannejang
51K Followers 1K Following trying to automate my work @coreautoai // prev: model behavior & labs @openai
Bowen Baker @bobabowen
4K Followers 116 Following Research Scientist at @openai since 2017 Robotics, Multi-Agent Reinforcement Learning, LM Reasoning, and now Alignment.
Micah Carroll @MicahCarroll
3K Followers 806 Following Safety research @openai. Prev @berkeley_ai /w @ancadianadragan & Stuart Russell. CoT oversight / AI manipulation.
Harshit Sikchi @harshit_sikchi
2K Followers 1K Following Research@OpenAI; Reinforcement Learning; PhD@UT Austin. Previously FAIR Paris @AIatMeta, @CMU_Robotics @NVIDIAAI @UberATG.
Yu Bai @yubai01
9K Followers 2K Following Training Accelerations @OpenAI. Previously @SFResearch, PhD @Stanford.
janvi kalra @janvikalra_
4K Followers 1K Following research @openai | prev @coda_hq (acquired by grammarly) @google @microsoft
Katherine Lee is at N... @katherine1ee
6K Followers 951 Following understanding ourselves and our models @openai
Yossi Gandelsman @YGandelsman
2K Followers 815 Following Incoming assistant prof at @TTIC_connect, artificial visual intelligence @reve, previously @UCBerkeley @TransluceAI @GoogleDeepMind
Leo Gao @nabla_theta
13K Followers 580 Following working on AGI alignment. prev: GPT-Neo, the Pile, LM evals, RL overoptimization, scaling SAEs to GPT-4, interp via circuit sparsity. EleutherAI cofounder.
Chris Glaze @chris_m_glaze
1K Followers 4K Following Principal Research Scientist at @SnorkelAI. PhD in computational neuroscience. Previously: @penn @UofMaryland
Yufei Tian @yufei_t
1K Followers 749 Following post-training @openai | prev: PhD @UCLA | NLP, creativity, unconventional reasoning | undergrad @Tsinghua_Uni
Naomi Bashkansky @NaomiBashkansky
991 Followers 117 Following Alignment research at OpenAI. Harvard '25. Chess WIM.
Jasmine Wang @j_asminewang
9K Followers 1K Following alignment @OpenAI. formerly @ UK AISI. opinions mine!