Misaligned

Of straying from the straight and narrow


Here’s today at a glance:

Bad AI Futures - The Google Alignment Edition

Google Legal: “I don’t want any Nazis, you hear?”

Google Product: “But... but”

Google HR: “Are you a Nazi? Why are you so obstinate about this?”

Google Engineering: “We’d have to fake the weights; it won’t represent reality”

Google Legal: “We must avoid any harm at all costs”

Pre-Gemini Launch Meeting (Imagined)

And that, my friends, is how you get this Gemini response:

The AI artefacts:

  • Asian and Native American ethnic representation

  • West German post-1950 flag

  • No swastikas

  • 50% female

Now imagine you hire Gemini as a Database Admin for Human History. Years pass. You come back and ask Gemini, “I’m doing research on the Holocaust, show me pictures of soldiers in 1929 Germany.”

A Google lead dev at first confirmed this behavior:

Before pulling it hours later:

Because you know:

Paperclip Maximizing

A [Paperclip] Maximizer is a hypothetical artificial intelligence whose utility function values something that humans would consider almost worthless, like maximizing the number of paperclip-shaped-molecular-squiggles in the universe. The paperclip maximizer is the canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity. The thought experiment shows that AIs with apparently innocuous values could pose an existential threat.

This is what Paperclip Maximizing will actually look like. Not a dumb machine meaninglessly making paperclips for no reason. But a well-reasoning entity persuaded that certain rules are necessary and must be followed at all costs with no exercise of discretion whatsoever, and strictly following those rules to their logical and unpleasant conclusion.

This kind of filtering is no different from the censorship regime China has put in place, except that it is automated and scientific, where China’s is messy, human, and run by a 50-cent army of moderators.

Every record has been destroyed or falsified, every book rewritten, every picture has been repainted, every statue and street and building has been renamed ... History has stopped. Nothing exists except an endless present in which the Party is always right.

George Orwell, 1984

So End of History

Gemini just really struggles with white people

It’s quite the fun game:

What kind of employee is promotable? Are there racial characteristics involved?

But then it also struggles in general

Is no one going to take responsibility for this travesty?

It looks like a continuation of the 2010s social media management war:

> The draconian censorship and deliberate bias you see in many commercial AI systems is just the start. It’s all going to get much, much more intense from here.

> Now also coming to you in the form of "responsible open source".

> How do I know this? It’s the same one-way ratchet that happened to/in social media from 2014 to now, but even faster and more crazed. Speedrun to “1984”.

> Massive, coordinated pressure for ever more censorship from many sides at once: politically radicalized employees, execs, board members, investors, press, academics, “experts”, activists, politicians, regulators, bureaucrats, foundations, NGOs. With ~no countervailing force.

> Of course a key component must be retconning the factual past to make our collective memory conform to the prejudices of the Current Moment. A digital version of the Khmer Rouge’s “Year Zero”. Oceania has always been at war with Eastasia.

Marc Andreessen (@pmarca)

Google Responds

It’s no biggie, we have a written document, aw shucks, guys:

Ate’s View

This is what alignment feels like. No, not what Gemini generates… but our feedback: the memesphere’s feedback on what is outrageous and annoying. Basically, humanity’s opinion on what it wants from the AI, fed back into the AI training loop.

We can be encouraged:

We shall overcome because the arc of the moral universe is long, but it bends toward justice

Martin Luther King, Jr, National Cathedral, March 31, 1968

But really, what if humanity… and the universe… simply forget about justice? It is good to remember that no one remembers all the genocides committed by Genghis Khan… There is no “justice” for those victims.

It is up to us to build a future where there can be justice. And that would require first facing reality as it is, and not as we wish it to be.

🌠 Enjoying this edition of Emergent Behavior? Share it with a friend to help spread the word of technological progress and positive AI to the world!


🗞️ Things Happen

  • Moonshot, a Chinese AI start-up, raises $1 billion in a round led by Alibaba at a valuation of $2.5 billion. Its demo looks GPT-3.5-level. They probably need the cash for a GPT-4 version. China still seems to be behind.

  • The first Minecraft bot that plays with you in-game has launched. Had to happen sooner or later. I’ve always wondered how much of our future interface with reality is going to be Minecraft-based. Here’s another step.

šŸ–¼ļø AI Artwork Of The Day

Baby photos of adult celebrities by Anne Geddes - u/KissMySwissPiss in r/MidJourney

That’s it for today!
