Mike Lewis @ml_perception
Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, attention sinks, kNN-LM, top-k sampling & Deal Or No Deal. Seattle. Joined September 2019.
Tweets: 277
Followers: 8K
Following: 252
Likes: 844
@ehsanik @Vercept_ai @AnthropicAI Amazing, congrats!!
Excited to see what the amazing @sarahookr and team build here!
Beginnings are very special. Today is an important day for @adaptionlabs. Today a handful of one-size-fits-all-models are optimized for the average use case. Averages erase the exceptional. Everything intelligent adapts. So should AI.
Our team in FAIR at Meta is hiring a (full-time) researcher! We work on the topics of Reasoning, Alignment and Memory/architectures (RAM) for self-improvement & co-improvement. Apply here: metacareers.com/profile/job_de… Location: NY, Seattle or Menlo Park. Some of our recent work to give flavor: Co-Improvement (position): arxiv.org/abs/2512.05356 SPICE (Self-Play in Corpus Environments): arxiv.org/abs/2510.24684 Self-Challenging Agents: arxiv.org/abs/2506.01716 RL from Human Interaction: arxiv.org/abs/2509.25137 AggLM (parallel aggregation): arxiv.org/abs/2509.06870 StepWiser (CoT-PRM RL): arxiv.org/abs/2508.19229 DARLING (diversity-trained RL): arxiv.org/abs/2509.02534 J1 (RL-trained LLM-as-Judge): arxiv.org/abs/2505.10320 CoT-Self-Instruct: arxiv.org/abs/2507.23751 Multi-Token Attention: arxiv.org/abs/2504.00927
@edward_milsom You see this because at the start of each epoch, many of your samples were seen recently at the end of the previous epoch. By the end of your epoch, each sample hasn't been seen for at least one epoch, so is less memorized.
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 arxiv.org/pdf/2510.06195
Love seeing these incredibly creative new evaluations! Optimizing benchmarks is easy, the real challenge is in generalizing to the tasks that don't exist yet
I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI's new OSS models. For those interested in the details: hanlab.mit.edu/blog/streaming…
@niloofar_mire @CMU_EPP @LTIatCMU @AIatMeta @kamalikac Amazing, congrats!!
Don’t miss this - I’ve worked with Mike (@ml_perception) very closely at Meta and his talks are super informative and fun.
Want to learn about Llama's pre-training? Mike Lewis will be giving a Keynote at NAACL 2025 in Albuquerque, NM on May 1. 2025.naacl.org @naaclmeeting
📉📉NEW SCALING LAW PHENOMENON 📉📉 We find that knowledge and reasoning exhibit different scaling behaviors! Super excited to finally tell you all about our paper on the compute optimal scaling of skills: arxiv.org/pdf/2503.10061 [1/n]
✨New Preprint✨We introduce 𝐁𝐫𝐚𝐧𝐜𝐡-𝐓𝐫𝐚𝐢𝐧-𝐒𝐭𝐢𝐭𝐜𝐡 (𝐁𝐓𝐒), an efficient & flexible method for stitching together independently pretrained LLM experts (i.e. code, math) into a single, capable generalist model. Key Takeaways: ✅BTS achieves the best average generalist performance across a variety of tasks 👊 ✅We stitch together 4 x 2.7B specialized expert LLMs, where only the lightweight stitching layers (<300M params in total‼) are trained while the experts’ params remain frozen. This makes BTS super modular, flexible, and easy to train! 👊 arxiv.org/abs/2502.00075 Work done at @AIatMeta w/ @prajjwal_1, Chloe Bi, Chris Cai, @j_foerst @imjeremyhi @punitkoura, Ruan Silva, @shengs1123 @em_dinan* @ssgrn* @ml_perception* * Joint last author 🧵👇(1/5)
🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using byte-patches instead of tokens 🤯 Paper 📄 dl.fbaipublicfiles.com/blt/BLT__Patch… Code 🛠️ github.com/facebookresear…
How can we reduce pretraining costs for multi-modal models without sacrificing quality? We study this Q in our new work: arxiv.org/abs/2411.04996 At @AIatMeta, We introduce Mixture-of-Transformers (MoT), a sparse architecture with modality-aware sparsity for every non-embedding transformer parameter (e.g., feed-forward networks, attention matrices, and layer normalization). MoT achieves dense-level performance with up to 66% fewer FLOPs! ✅ Chameleon setting (text + image generation): Our 7B MoT matches dense baseline quality using just 55.8% of the FLOPs. ✅ Extended to speech as a third modality, MoT achieves dense-level speech quality with only 37.2% of the FLOPs. ✅ Transfusion setting (text autoregressive + image diffusion): MoT matches dense model quality using one-third of the FLOPs. ✅ System profiling shows MoT achieves dense-level image quality in 47% and text quality in 75.6% of the wall-clock time** Takeaway: Modality-aware sparsity in MoT offers a scalable path to efficient, multi-modal AI with reduced pretraining costs. Work of a great team with @liliyu_lili, Liang Luo, @sriniiyer88, Ning Dong, @violet_zct, @gargighosh, @ml_perception, @scottyih, @LukeZettlemoyer, @VictoriaLinML.👏 **Measured on AWS p4de.24xlarge instances with NVIDIA A100 GPUs.
@microth @AnetteMFrank Congrats Michael!!
@apjacob03 @MIT_CSAIL @jacobandreas @KonstDaskalakis @gabrfarina @roger_p_levy @polynoamial @adamlerer @em_dinan Huge congrats Jacob!!
1/n Introducing MoMa 🖼, our new sparse early-fusion architecture for mixed-modal language modeling that significantly boosts pre-training efficiency 🚀 (arxiv.org/pdf/2407.21770). MoMa employs a mixture-of-expert (MoE) framework with modality-specific expert groups. Given any interleaved mixed-modal token sequences, each group exclusively processes tokens of the designated modality with conventional MoE routing. This is joint work with amazing co-first authors @AkshatS07, @ArmenAgha and collaborators @AIatMeta – Liang Luo, @sriniiyer88, @ml_perception, @gargighosh and @LukeZettlemoyer.
@tallinzen The base model recipe is relatively straightforward (though of course >15T tokens), so you could always use that! Post training makes a huge difference on some tasks, but not all, which is interesting in itself.
@NamanGoyal21 Can I have a nickel for every extra FLOP your work let us use?
tldr; you can go a long way in pre-training by (1) curating amazing data, (2) using a lot of FLOPs, and (3) otherwise not screwing up. All three are harder than they sound, so read the paper... That said, I'm amazed by our progress since Llama 3 - expect big things from Llama 4!
So excited for the open release of Llama 3.1 405B - with MMLU > 87, it's a really strong model and I can't wait to see what you all build with it! llama.meta.com Also check out the paper here, with lots of details on how this was made: tinyurl.com/2z2cpj8m
(((ل()(ل() 'yoav)))... @yoavgo
84K Followers 2K Following
AI at Meta @AIatMeta
817K Followers 324 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Percy Liang @percyliang
109K Followers 425 Following professor of computer science @Stanford @stanfordnlp, co-founder of @togethercompute, creator of https://t.co/7R5THVogW2, co-founder of @simile_ai, pianist
Aran Komatsuzaki @arankomatsuzaki
182K Followers 375 Following Sharing AI research. Early work on AI (GPT-J, scaling, MoE). Ex ML PhD (GT) & Google.
Akari Asai @AkariAsai
23K Followers 937 Following Incoming Assistant Professor @SCSatCMU (Hiring Ph.D. students for Fall 2026) & research scientist @allen_ai OLMo. akariasai @ 🦋
Delip Rao e/σ @deliprao
69K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Kyunghyun Cho @kchonyc
86K Followers 2K Following a mediocre combination of a mediocre scientist and a mediocre advisor at @nyuniversity (@CILVRatNYU)
Douwe Kiela @douwekiela
16K Followers 453 Following Contextualizing AI @GoogleDeepMind, ex-@ContextualAI CEO, @Stanford Adjunct Prof
Soumith Chintala @soumithchintala
310K Followers 1K Following Building new things @thinkymachines. Also dabble in robotics at NYU. Cofounded @PyTorch. AI is delicious when it is accessible and open-source.
Sam Bowman @sleepinyourhat
66K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. Into @givingwhatwecan.
Graham Neubig @gneubig
45K Followers 782 Following Associate professor @LTIatCMU. Co-founder/chief scientist @OpenHandsDev. I mostly work on modeling language.
Eric Jang @ericjang11
136K Followers 4K Following
Tim Dettmers @Tim_Dettmers
46K Followers 905 Following Creator of bitsandbytes. Professor @CarnegieMellon and Research Scientist @allen_ai . I blog about deep learning and PhD life at https://t.co/Y78KDJJFE7.
Jacob Andreas @jacobandreas
24K Followers 954 Following Teaching computers to read. Assoc. prof @MITEECS / @MIT_CSAIL / @NLP_MIT (he/him). https://t.co/5kCnXHjtlY https://t.co/2A3qF5vdJw
Noam Brown @polynoamial
147K Followers 924 Following Researching reasoning @OpenAI | Co-created Libratus/Pluribus superhuman poker AIs, CICERO Diplomacy AI, and OpenAI o-series 🍓 reasoning models
Ofir Press @OfirPress
19K Followers 9K Following I push the AI frontier by building tough benchmarks with amazing people. SWE-bench, SWE-agent, SciCode, AlgoTune. Postdoc @Princeton. PhD @nlpnoah @UW.
Sewon Min @sewon__min
16K Followers 897 Following Assistant professor @Berkeley_EECS @berkeley_ai || Research scientist at @allen_ai || PhD from @uwcse @uwnlp
Yoav Artzi @yoavartzi
19K Followers 191 Following Research/prof @cs_cornell + @cornell_tech🚡 / https://t.co/9YnWry86x0 / researcher @GoogleDeepMind / building @COLM_conf / ex @arxiv
Luca Soldaini ... @soldni
13K Followers 1K Following data mines are my passion ⛏️ mts @MicrosoftAI / ex Olmo co-lead @allen_ai / pfp @YanhongLi2062 / thoughts are mine, leave my employer alone / 🌈
Vlad @TheVladSavinov
1 Followers 38 Following Pretraining LLMs @reflection_ai // opinions are my own
Ivan M @med_1v
1 Followers 5K Following
David Razumovsky @davidraz
2K Followers 387 Following
yjl @agiyjl
5 Followers 435 Following
Bianca gonzalez @Biancagonz03u5
33 Followers 338 Following
Agirobott @Agirobott
4 Followers 339 Following
Susan Zhang @suchenzang
47K Followers 1K Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for intelligence. Only my opinions stored here.
Roman Bachmann @roman__bachmann
630 Followers 550 Following Multimodal @Apple, previously at @EPFL_en, @RIKEN_AIP
Ranta Rose @RantaRoseq5yx
0 Followers 65 Following
JY Z @JunYuZzzzz
81 Followers 4K Following
🎭 @deepfates
62K Followers 6K Following deepfates is a distributed collective intelligence running on heterogenous substrates and coordinating acausally. thank you for participating. we/us
✌𝓼𝔂𝓵𝓿... @sylvie8glah
187 Followers 935 Following i deserve financial compensation for being self aware
Sirou Zhu @srzhu97
7 Followers 264 Following
Ziran Yang @__zrrr__
490 Followers 627 Following PhD student @Princeton, BS @PKU1898 Looking for verifiable reasoning
may seeds @mayseedscjct
29 Followers 744 Following
Nehal Amin @NehalAmin03
10 Followers 1K Following Open-source AI & robotics enthusiast exploring the future of intelligence, space tech, internet systems, mobile computing, databases, and digital infrastructure
AIcontributors @aicontributors
12 Followers 2K Following
participate4chg @participate4chg
8 Followers 879 Following
Emil @eadi6y
1 Followers 115 Following
Guohao Li 🐫 @guohao_li
14K Followers 4K Following Founder @Eigent_AI / @CamelAIOrg. Scaling RL Environments for Agents. Prev Oxford, KAUST, ETHz, Intel, Kumo.
Hunan Rostomyan @hunan_rostomyan
65 Followers 773 Following Machine Learning Engineer @ Nomad Health | Chegg | WriteLab
sakura @prunusito
1 Followers 587 Following
Kevin Rose @kevinrose
1.5M Followers 2K Following building at @basic_in (@digg) | Podcasts: The Kevin Rose Show, Random Show w/ @tferriss. | Ex: @google, Board of Directors: @ouraring, @hodinkee
Julie Kallini ✈️ ... @JulieKallini
3K Followers 506 Following CS PhD @StanfordNLP 🌲 Previously: SWE @Meta, Class of '21 @PrincetonCS
dmytro @dmytro_kurch
0 Followers 6 Following
Victor Melara @MelaraVictor
39 Followers 506 Following
Tomasz Limisiewicz @TomLimi
827 Followers 511 Following Postdoctoral researcher at @meta Fair and @uwnlp , Interested in going into the inner workings of neural networks, multilingualism, and fairer NLP (he/him)
Jordan Segall @jordan_segall
2K Followers 1K Following Partner at Redpoint Ventures | Stanford University | Formerly Palantir, RelateIQ, McKinsey, C3, StartX Mentorship Director
Dương Minh Hùng @hung_minh34747
1 Followers 596 Following
Adel @xlcizor
384 Followers 550 Following Entering the matrix, founder @agentastic, @meta, @google, PhD at UIUC
Bassompierre @MBassompierre67
22 Followers 1K Following
Fatih⏩⤴️ @taskinfatih
619 Followers 7K Following Lover of all novel and hard concepts: especially machine learning and systems theory
@BrianLinuxing (every... @BrianLinuxing
5K Followers 7K Following • 45+ years of IT • Founder of #LinuxingInLondon Britain's largest Linux community • Wikipedian • Gives #Linux talks, desktop specialist🐧 • Tinkering with #AWS
Sajeevan Veeriah @VeeriahSajeevan
276 Followers 3K Following Automation & Robotics Engineer Mechatronics | AI/ML | Embedded
Christopher Manning @chrmanning
165K Followers 336 Following Founder @stanfordnlp & cs224n—Senior Fellow @StanfordHAI—Prof. CS & Linguistics @Stanford—GP @aixventureshq—MTS @moonlake—Australian🇦🇺—Do #NLProc & #AI 👋
Felix Hill @FelixHill84
12K Followers 739 Following Research Scientist, Deepmind I try to think hard about everything I tweet, esp on 90s football and 80s music None of my opinions are really someone else's
Tal Linzen @tallinzen
19K Followers 967 Following Professor @nyuling and @NYUDataScience, research scientist @GoogleAI, inventor of the word "bertology"
Kelly Marchisio @Neur... @cheeesio
2K Followers 670 Following Multilinguality Lead @cohere. Formerly: PhD @jhuclsp, Alexa Fellow @amazon, dev @Google, MPhil @cambridgenlp, EdM @hgse 🔑🔑¬🧀 (@kelvenmar20)
Alexis Ross @alexisjross
4K Followers 1K Following currently @humansand | phd-ing @MIT_CSAIL & working towards personalized AI tutors | formerly @allen_ai, @harvard '20
Orion Weller @orionweller
2K Followers 1K Following PhD student @jhuclsp Prev Intern @AIatMeta @GoogleDeepMind, @samaya_ai, @allen_ai Research: LLMs, Search, Agents
Pang Wei Koh @PangWeiKoh
5K Followers 949 Following Assistant professor at @uwcse | MTS @MicrosoftAI. Formerly @allen_ai @StanfordAILab @GoogleAI @Coursera. 🇸🇬
Rowan Zellers @rown
15K Followers 1K Following multimodal @thinkymachines. I also like to climb rocks and throw pottery. https://t.co/5Er4j39K71 (he/him)
Stella Li ✈️ ICML... @StellaLisy
4K Followers 540 Following PhD student @uwnlp | visiting researcher @AIatMeta | undergrad @jhuclsp #NLProc
Yifan @yang1fan2
324 Followers 373 Following Pre-Training @ https://t.co/0zCR2HlCwl · Formerly at Meta (Llama Pre-training), ByteDance (Seed), Microsoft Research
Hamish Ivison @ ICML @hamishivi
3K Followers 750 Following Antipodean Abroad. I (try to) do NLP research. PhD student @uwcse, prev @Sydney_Uni @allen_ai 🇦🇺🇨🇦🇬🇧
Mikel Artetxe @artetxem
7K Followers 228 Following Co-founder @RekaAILabs and Honorary Researcher @Hitz_zentroa (University of the Basque Country) | Past: Research Scientist @AIatMeta (FAIR)
Prateek Yadav @prateeky2806
5K Followers 2K Following exploring @GoogleDeepMind prev: pre-training @AlatMeta, part-time @GoogleDeepMind, PhD at @unccs
David Brandfonbrener @brandfonbrener
2K Followers 709 Following member of technical staff @AnthropicAI. previously: research scientist, FAIR @AIatMeta, fellow @KempnerInst @Harvard, phd @nyu_courant
Eva Spiliopoulou @EvaSpiliop
389 Followers 211 Following Research Scientist @FAIR @Meta #NLProc Previous @Amazon PhD @LTIatCMU
Hunter Lang @hunterjlang
356 Followers 413 Following researcher @ meta FAIR. prev: meta genai, phd at @MIT_CSAIL
Marc Marone @ ICML 20... @ruyimarone
880 Followers 694 Following PhD @jhu, prev research intern @meta @databricks MosaicML @microsoft, @mstranslator, @GeorgiaTech | Working on datasets!
Will Held @WilliamBarrHeld
3K Followers 1K Following Open LLM Training @ https://t.co/yb9OySgHFM Formerly ML PhD w/ @Diyi_Yang, 🦙 @AIatMeta, Assistant @GoogleAI, اللغة العربية @NYUAbuDhabi Burqueño
Jack Rae @jack_w_rae
25K Followers 540 Following Distinguished Scientist @ Meta LLMs (e.g. Gopher, Chinchilla, Gemini, Muse) Compression & RL ☯️ Past: Google, OpenAI, Quora
Inna Lin @iwylin
1K Followers 1K Following Final-year PhD Student in AI/NLP @uwcse @uwnlp | Visiting Researcher @Meta Superintelligence Lab | Prev: @jpmorgan @cornell_tech @columbia
Niloofar ✈️ icml @niloofar_mire
10K Followers 2K Following Technical staff @humansand, incoming asst. prof @LTIatCMU @CMU_EPP, ex RS in @AIatMeta, postdoc @uwcse, Ph.D. @ucsd_cse, former @MSFTResearch -Privacy, ML, NLP
Nikhil Raghuraman @nikraghuraman
1K Followers 1K Following Research @OpenAI | Prev @MistralAI, @JaneStreetGroup, @StanfordAILab
Santiago Hernández @santiaghini
2K Followers 1K Following rl enthusiast, research @openai. retired child actor
John Hewitt @johnhewtt
7K Followers 58 Following Assistant Prof @columbia CS. Visiting Researcher @ Google DeepMind. PhD from @stanfordnlp. Language x Neural Nets.
Niklas Muennighoff @Muennighoff
10K Followers 547 Following Researching AI/LLMs @Stanford @cursor_ai
Kiana Ehsani @ehsanik
10K Followers 620 Following Making models smarter @ Anthropic, formerly CEO and Co-Founder @ Vercept (acquired by Anthropic), Climber on the weekends. Opinions are my own.
Vinay Rao @vinaysrao
811 Followers 150 Following Building real things at Prometheus. Previously at Meta, Character AI, Google Brain, Baidu.
Mihir Kale @maninblack815
180 Followers 743 Following LLMs at Microsoft SuperIntelligence. Previously Meta, Google.
AerIn @aerinykim
6K Followers 566 Following building https://t.co/rjIQoeDYwX. enjoy doing non trivial work. https://t.co/mo8D7pzBtk
Aakanksha Chowdhery @achowdhery
14K Followers 6K Following @Stanford @reflection_ai // Previously @GoogleDeepMind :: PaLM, Gemini // @MSFTResearch, @Princeton // views my own and subject to change
Amanda Bertsch @abertsch72
2K Followers 925 Following PhD student @LTIatCMU / @SCSatCMU / student researcher @allen_ai, researching long context + decoding | she/her | @ abertsch on bsky or by email (https://t.co/bsHqwIM80d)
Anirudh Goyal @anirudhg9119
7K Followers 582 Following Thinking about thinking. Improving models. Spent time at @Berkeley_EECS, @MPI_IS, @GoogleDeepMind Gemini ♊.
Todor Mihaylov @tbmihaylov
761 Followers 1K Following Research Scientist, Working on Llama at @MetaAI
Michal Valko ✈️ I... @misovalko
9K Followers 10K Following Founding Researcher @ Isara Labs & Inria & MVA. Ex: Llama @AIatMeta; Gemini & BYOL @GoogleDeepMind. LLMs, RL, alignment.
Sheng Shen @shengs1123
3K Followers 555 Following
Roberta Raileanu @robertarail
11K Followers 2K Following Open-Ended Team Lead and Senior Staff Research Scientist @GoogleDeepMind. Honorary Lecturer @UCL. ex @Meta | @NYU | @Princeton.
Sharan Narang @sharan0909
3K Followers 255 Following Foundation and World models @ Waymo | ex Llama pretraining lead | ex @Google (PaLM lead, T5), ex @Baidu (Deep Speech 2, Sparse Neural Networks), ex @Nvidia
Dieuwke Hupkes @_dieuwke_
2K Followers 276 Following
Aaditya Singh @Aaditya6284
863 Followers 360 Following Doing a PhD @GatsbyUCL with @SaxeLab, @FelixHill84 on learning dynamics, ICL, LLMs. Prev. at: @GoogleDeepMind, @AIatMeta (LLaMa 3), @MIT. https://t.co/ZOmBWCvbIK
Moya Chen @moyapchen
426 Followers 154 Following
Jim Fan @DrJimFan
480K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
Zexuan Zhong @ZexuanZhong
3K Followers 747 Following Post-training & RL MSL TBD lab @Meta | prev @xAI @PrincetonCS
Sweta Agrawal ✈️ ... @swetaagrawal20
1K Followers 2K Following Research Scientist @GoogleDeepmind | Past: Postdoc Researcher @itnewspt | Ph.D. @ClipUmd, @umdcs #nlproc