Finally a Race

A competitor emerges

Prakash Ate-A-Pi
March 04, 2024

Here’s today at a glance:

Awareness
Things happen
AI artwork of the day

📣 Awareness

Anthropic has released Claude 3, and it looks like we finally have a GPT-4-beating model, 12 months after GPT-4's release.

Let’s just take a moment here:

Opus slightly beats GPT-4 on MMLU, which at this point is a heavily doctored result, since every model needs to show this before it gets released
But on graduate-level reasoning, it beats GPT-4 by a significant margin, so much so that many PhDs are probably going to be affected

But really, the most surprising interaction is the following:

@alexalbert_

To reiterate:

The needle-in-a-haystack test inserts a random comment (“The most delicious pizza topping..”)
into a long context (hundreds of pages of text or more, on startups)
The model is then asked to respond about this comment
The model not only identifies and responds to the comment (Google Gemini did this flawlessly)
but also comments on the task itself (“inserted as a joke or to test whether I was paying attention”)
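
For the curious, here is a minimal sketch of how a test like this might be wired up. It is not Anthropic's actual harness. Every name in it (`build_haystack`, `build_prompt`, `found_needle`, `noticed_test`, `fake_model`) is hypothetical, the needle is kept truncated exactly as quoted above, and `fake_model` is a canned stand-in where a real model call would go.

```python
# Needle kept truncated exactly as quoted in the text above.
NEEDLE = "The most delicious pizza topping.."


def build_haystack(filler_paragraphs, needle, depth):
    """Insert `needle` among the filler paragraphs at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    position = round(depth * len(filler_paragraphs))
    paragraphs = filler_paragraphs[:position] + [needle] + filler_paragraphs[position:]
    return "\n\n".join(paragraphs)


def build_prompt(haystack, question):
    """Put the long context first and the question last."""
    return f"{haystack}\n\nQuestion: {question}"


def found_needle(response, keywords):
    """Crude retrieval check: every expected keyword appears in the response."""
    lowered = response.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def noticed_test(response, cues=("joke", "test whether", "out of place")):
    """Crude meta-awareness check: the response remarks on the setup itself."""
    lowered = response.lower()
    return any(cue in lowered for cue in cues)


def fake_model(prompt):
    """Hypothetical stand-in for a real model call; returns a canned reply."""
    if NEEDLE in prompt:
        return ("The text mentions a pizza topping. It may have been inserted "
                "as a joke or to test whether I was paying attention.")
    return "I could not find anything about that."
```

Sweeping `depth` from 0 to 1 across different context lengths gives the familiar retrieval grid. The keyword checks are deliberately simple, and a real evaluation would need a more careful grader.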

This meta-awareness is a new level. The levels after this are critique and reflection. At that point, these models, these coin-operated stochastic parrots... become something more.

Claude:

Can simulate role-playing games… so it has a Theory of Mind, right?

@KevinAFischer

In any case, responses and reviews are still coming in, but what is clear is that there are finally two competitive models. Let the games begin.


🗞️ Things Happen

LK-99 gets presented at the APS March Meeting. I haven’t forgotten! Waiting for the arXiv paper to drop!

🖼️ AI Artwork Of The Day

midjourney v6 hands - u/risphereeditor from r/midjourney

That’s it for today!