Finally a Race
A competitor emerges
Here’s today at a glance:
📣 Awareness
Anthropic releases Claude 3, and it looks like we finally have a GPT-4-beating model, 12 months after GPT-4’s release.
Let’s just take a moment here:
Opus slightly beats GPT-4 on MMLU, though at this point MMLU is a heavily doctored result, since every model needs to show a win there before it gets released
But on graduate-level reasoning, it beats GPT-4 by a significant margin, so much so that many PhDs are probably going to be affected
But really, the most surprising interaction is the following:
@alexalbert_
To reiterate:
The needle-in-a-haystack test inserts a random comment (“The most delicious pizza topping..”)
into a long context (hundreds of pages of text or more, here on startups)
The model is then asked a question about this comment
The model not only identifies and responds to the comment (Google Gemini did this part flawlessly)
but also comments on the task itself (“inserted as a joke or to test whether I was paying attention”)
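For readers who want to try this themselves, here is a minimal sketch of how such a test could be assembled and scored. It is not Anthropic's harness: the function names, the example needle, and the keyword cues are all hypothetical, and the keyword match is only a crude stand-in for a human reading the response. No model is called; you would pass the prompt to whatever model you are testing.

```python
def build_haystack(documents: list[str], needle: str, depth: float) -> str:
    """Place `needle` among `documents` at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(documents))
    parts = documents[:position] + [needle] + documents[position:]
    return "\n\n".join(parts)


def build_prompt(context: str, question: str) -> str:
    """Wrap the long context and the retrieval question into a single prompt."""
    return f"{context}\n\nQuestion: {question}\nAnswer using only the documents above."


def found_needle(response: str, expected_keywords: list[str]) -> bool:
    """Retrieval check: every expected keyword appears in the response."""
    lowered = response.lower()
    return all(keyword.lower() in lowered for keyword in expected_keywords)


def flags_insertion(
    response: str,
    cues: tuple[str, ...] = ("out of place", "inserted", "unrelated to"),
) -> bool:
    """Crude heuristic for the 'meta' behavior: does the response remark that the
    needle looks planted? The cue phrases are assumptions, not a standard."""
    lowered = response.lower()
    return any(cue in lowered for cue in cues)


if __name__ == "__main__":
    # Hypothetical filler documents and needle, for illustration only.
    essays = [f"Essay {i} about startups." for i in range(1, 5)]
    needle = "The secret code word for this test is marmalade."
    context = build_haystack(essays, needle, depth=0.5)
    print(build_prompt(context, "What is the secret code word?"))
```

Sweeping `depth` and the number of documents gives the usual grid of retrieval results. The second check, `flags_insertion`, is the part that corresponds to the behavior described above, and in practice it would need a human or a stronger grader.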
This meta-awareness is a new level. The levels after this are critique and reflection. At that point, these models, these coin-operated stochastic parrots... become something more.
Claude can simulate role-playing games… so it has a Theory of Mind, right?
@KevinAFischer
In any case, responses and reviews are still coming in, but what is clear is that there are finally two competitive models. Let the games begin.
🗞️ Things Happen
At the APS March Meeting, LK-99 gets presented. I haven’t forgotten! Waiting for the arXiv paper to drop!
🖼️ AI Artwork Of The Day
midjourney v6 hands - u/risphereeditor from r/midjourney
That’s it for today!