@SimonHoiberg if anyone changes pricing, it's one /goal codex session away from me switching. try it yourself, you can do verified migrations between basically any service with an API
things are not the same anymore. engineering and reliability are much much easier
@ziv_ravid @finbarrtimbers @DarioAmodei imo, the better scoped the task is and the better my start prompt is before I /goal with codex, the more likely it is I zero shot accept the output.
I assume Mythos+1 makes it true internally within anth.
I’ll say he was right
PM: gosh I hope that this Jira ticket doesn’t take the software engineer a disproportionate amount of time to complete at the disproportionate amount of time factory
@finbarrtimbers @ziv_ravid @DarioAmodei Keep in mind Mythos has been internal since late Jan, and given how strong the fable distill from Mythos was, it’s not surprising that extrapolation was made, especially with RSI working well
@iamgingertrash does not hold when diverse strains in the poors are stomped out by a function overselecting for exploitation over exploration.
think of the middle class like a channel between two worlds which is disappearing
largely agree, however, given good mobility between social strata
My personal goalpost for AGI: the day a critical mass of people confidently hand over all their cash to the machine god so that it can gamble in the capital markets
@packyM Yes, but what about “a vague problem someone else picked” gives it away? Having stared at codex / Claude code output I can tell too but I don’t really understand why
one of the most pressing issues plaguing today’s evals is the extreme $ / time cost per eval sample. Needing an entire agent rollout seems extremely slow, especially when you’re trying to use said eval as a signal for posttraining or dev splits for GEPA style setups (cc: @lateinteraction)
I don’t know if there’s a solution to this, but it seems desperately needed
I like to think of it from the perspective of human interviews. Interviewing for L9 doesn’t take exponentially more time than L3. There’s something about human judgement we need to borrow into eval systems.
Have we ever explored adversarially adaptive evals well?
@kurissuuu @interaction Same, I tried asking it for info to set up better flows, and it got right into insulting me for the subscription price I pay lol
@giffmana did ant have a really big high quality semi-manual re-write in 2023 that they never bothered to redo
is that why claude permanently has boomer-brain?
@beffjezos yea.. then again, you need a house to live in, a pet to be happy, 2 vacations a year, a bunch of food to sustain you and your performance is highly based on your current mood and biological condition + you can get sick and tired
silicon is upon us and it’s upon us to merge
927K Followers 6K Following
President & CEO @ycombinator —Founder @garryslist—Creator of GStack & GBrain—designer/engineer who helps founders—SF Dem accelerating the boom loop
188 Followers 331 Following
UCL neuro PhD now in industry (ML lead). the brain was too hard to crack, so I switched to artificial nets; interested in NLP and neuroai - all views my own
82 Followers 169 Following
University of Maryland '25
Former founding engineer at https://t.co/SbPUeusu4I (YC S25)
Built - https://t.co/XxrA70yGms , https://t.co/JoZKDF2YLj
10K Followers 620 Following
Making models smarter @ Anthropic, formerly CEO and Co-Founder @ Vercept (acquired by Anthropic), Climber on the weekends.
Opinions are my own.
5K Followers 17 Following
Powering concierge customer experiences for the most impactful companies in the world.
Backed by @a16z, @accel, @baincapVC, @coatuemgmt, @indexventures
5K Followers 150 Following
@OpenAI Applied AI International Lead, Startups.
Prev CTO/founder, @BCG, @UniofOxford.
Small sparks ✨ & just working things out
22K Followers 4K Following
Member of Technical Staff (Agents), Research @sullyai
Building AI medical employees
Previously worked on AI persona development & HCI @ 99Ravens
22K Followers 611 Following
Computer systems person, interaction designer. @thinkymachines
prev founding eng @modal
→ dreams of: a simpler, more honest, more human sort of software
69K Followers 3K Following
We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization.
@deepseek_ai stan #1, 2023–Deep Time
«C’est la guerre.» ®1