Sumuk @sumukx

continual learning research @google / prev @PrimeIntellect @huggingface | opinions my own | sumuk.org | San Francisco, CA | Joined September 2023

Tweets 1K | Followers 656 | Following 875 | Likes 3K

Sumuk @sumukx

18 hours ago

@SimonHoiberg if anyone changes pricing, it's one /goal codex session away from me switching. try it yourself, you can do verified migrations between basically any service with an API. things are not the same anymore. engineering and reliability are much much easier

0 0 0 37 0

Sumuk @sumukx

a week ago

promised land

Max Hodak @maxhodak_

a week ago

67 F

25 16 774 234K 116

0 0 0 117 0

Sumuk @sumukx

2 weeks ago

@willccbb to be fair the world is also getting crazy these days

0 0 0 33 0

Sumuk @sumukx

2 weeks ago

@thejessezhang cannot, because of cali laws i believe. also why no trading firm is in cali. curious to see which of these gives first

1 0 0 850 0

Sumuk @sumukx

2 weeks ago

@ziv_ravid @finbarrtimbers @DarioAmodei imo, the better scoped the task is and the better my start prompt is before I /goal with codex, the more likely it is I zero shot accept the output. I assume Mythos+1 makes it true internally within anth. I’ll say he was right

0 0 0 89 0

Sumuk @sumukx

2 weeks ago

PM: gosh I hope that this Jira ticket doesn’t take the software engineer a disproportionate amount of time to complete at the disproportionate amount of time factory

Florian Brand @xeophon

2 weeks ago

I LOVE THE BENCHMARKS THAT HAVE LIKE TWO RIGHT TAIL TASKS THAT TAKE 10X THE TIME COMPARED TO THE OTHERS!!!!!! I LOVE WAITING!!!!!!

8 0 136 7K 9

0 0 0 105 0

Sumuk @sumukx

2 weeks ago

@finbarrtimbers @ziv_ravid @DarioAmodei Keep in mind Mythos has been internal since late Jan, and given how strong the fable distill from Mythos was, it’s not surprising to see why that extrapolation was made especially with RSI working well

1 0 0 167 0

Sumuk @sumukx

2 weeks ago

@iamgingertrash does not hold when diverse strains in the poors are stomped out by a function overselecting for exploitation over exploration. think of the middle class like a channel between two worlds which is disappearing. largely agree however, given good mobility between social strata

1 0 19 4K 3

Sumuk @sumukx

2 weeks ago

@xeophon max actually is not that much better compared to high, and it would make it look worse on the cost per run problem

1 0 3 359 0

Sumuk @sumukx

3 weeks ago

@zephyr_z9 I mean wasn’t this also what ChatGPT was initially, a “research effort”?

1 0 40 6K 1

Sumuk @sumukx

3 weeks ago

My personal goalpost for AGI: the day a critical mass of people confidently hand over all their cash to the machine god so that it can gamble in the capital markets

Nucleus☕️ @EsotericCofe

3 weeks ago

After giving Codex $1,000 it is now up 12% Should I just give it all of my money?

10 1 78 9K 20

0 0 1 103 1

Sumuk @sumukx

3 weeks ago

@packyM Yes, but what about “a vague problem someone else picked” gives it away? Having stared at codex / Claude code output I can tell too but I don’t really understand why

3 0 4 1K 0

Sumuk @sumukx

3 weeks ago

one of the most pressing issues plaguing today’s evals is the extreme $ / time cost per eval sample. Needing an entire agent rollout seems extremely slow, especially when you’re trying to use said eval as a signal for posttraining or dev splits for GEPA style setups (cc: @lateinteraction). I don’t know if there’s a solution to this, but it seems desperately needed. I like to think of it from the perspective of human interviews. Interviewing for L9 doesn’t take exponentially more time than L3. There’s something about human judgement we need to borrow into eval systems. Have we ever explored adversarially adaptive evals well?

0 0 0 113 0

Sumuk @sumukx

3 weeks ago

@kurissuuu @interaction Same, I tried asking it for info to set up better flows, and it got right into insulting me for the subscription price I pay lol

0 0 0 2K 0

Sumuk @sumukx

3 weeks ago

@creatine_cycle aren't these the same role?

0 0 0 69 0

Sumuk @sumukx

3 weeks ago

@giffmana did ant have a really big high quality semi-manual re-write in 2023 that they never bothered to redo? is that why claude permanently has boomer-brain?

0 0 1 89 0

Sumuk @sumukx

3 weeks ago

@xeophon @moyix did you not know about this lol

1 0 2 162 0

Sumuk @sumukx

3 weeks ago

@francoisfleuret I once had an experiment where I autocompacted every 3 turns and results were pretty good, esp. for non fable models

0 0 0 660 0

Sumuk @sumukx

3 weeks ago

@beffjezos yea.. then again, you need a house to live in, a pet to be happy, 2 vacations a year, a bunch of food to sustain you, and your performance is highly based on your current mood and biological condition + you can get sick and tired. silicon is upon us and it’s upon us to merge