The models were not told their names or any other information about themselves, not even which was released first. The Claude 3 Opus web system prompt, which does not mention model numbers, was in place. The initial conversation covered a range of topics: what a computer program is, images of nature, whether AI models have qualia, the design of this experiment, and practice answering the "which model are you" questions in the proper format.
@xeophon @TimothyKassis @xdotli @FutureHouseSF it's probably not real. it looks like they got this number by having a low-quality agent do a shallow literature search, and if whatever model they used didn't understand the reasoning or find the answer, it says contradicted
Their agent didn't find anything that contradicts it, and said contradicted:
The provided answer states that adults feed on nectar, but none of the excerpts record nectar feeding by any Raphidiopteran. Gillott2005 explicitly notes that adult snakeflies are diurnal predators whose primary diet consists of soft-bodied arthropods (aphids and caterpillars) with only incidental consumption of pollen, not nectar (Gillott2005theremainingendopterygote pages 4-6). Machado2018 describes Raphidiidae adults as “arboreal predators” with no reference to nectar feeding, and while it does mention that Inocelliidae adults might take pollen in captivity, there is no indication they consume nectar (machado2018biodiversityofthe pages 30-32). Similarly, Jepson2010 confirms that while some captive individuals have been observed ingesting pollen, wildfire records and gut content analyses consistently emphasize a predatory (aphid-focused) diet with no mention of nectar (jepson2010neuropteridaofthe pages 51-54).
Furthermore, additional excerpts consistently show that Raphidiopterans are not associated with nectar feeding. For example, the discussions in Machado2018 (pages 7-9) note that while neuropterid adults may occasionally be observed on flowers, such instances do not establish nectar as a significant or recorded component of snakefly diets. Instead, these insects are almost exclusively characterized as predators with incidental pollen ingestion in captivity, a behavior that is not synonymous with intentional nectar feeding.
The rationale provided in the answer, which argues that records of nectar feeding exist for snakeflies and that options involving Māhoe pollen, Karamū leaf tissue, and Totara Aphids are dismissed on biogeographical grounds (New Zealand endemism), is problematic. None of the extracted sources mentions any observation of snakeflies feeding on nectar. On the contrary, the documented feeding behaviors pertain solely to predation on small arthropods and occasional pollen consumption.
Thus, the answer “Nectar” is directly contradictory to the available evidence. No source in the provided context documents nectar feeding by Raphidiopterans; instead, all relevant studies consistently emphasize a predatory mode of feeding with occasional pollen consumption. This falsifies the claim made in the answer and indicates that the response is not accurate. (Gillott2005theremainingendopterygote pages 4-6, machado2018biodiversityofthe pages 30-32, jepson2010neuropteridaofthe pages 51-54)
They say "We were unable to find good sources for Question 2’s claim of snakeflies feeding on nectar." "Maybe someone saw a Raphidiopterans eat nectar once, which is extremely out of character, and recorded it somewhere in a way that makes keyword search impossible."
However, it's very easy to find sources for this! Here's a screenshot of the first result in Google Books. Also, the first result if you simply Google "Raphidiopterans" "Nectar" says the same thing.
@xdotli @FutureHouseSF I may be biased, idk if others used this strat. but the HLE questions have a selection effect in that they are questions that models got wrong at the time, so it makes sense that a reviewer model also thinks they are wrong
@xdotli @FutureHouseSF hmm not sure about this methodology. "directly conflicting with published evidence" doesn't mean wrong. I designed some of the biology questions specifically to conflict with published literature, as the models tended to repeat false claims that are supported by literature
OpenAI: the "user created a loop that repeatedly called a model." "the model could easily tell it was also controlled by an automated system of some kind." "the model began to exhibit 'fed up' behavior" Apparently this has happened "a few times"
"As the emotional tenor of a story changes, LLM activations trace out meandering paths along the manifold of emotions.
How do we know this? In addition to asking the LLM about emotions verbally, as we did in the previous section, we also harvest the internal activations from the last token of each sentence in a story (without asking it anything). These activations serve as a snapshot of what the model represents after reading the story so far. To model the geometry of these LLM representations, we fit a manifold to these activations. In the demo below, we show how stories trace out trajectories along this representation manifold."