Misaligned
Of straying from the straight and narrow
Here's today at a glance:

Bad AI Futures - The Google Alignment Edition
Google Legal: "I don't want any Nazis, you hear?"
Google Product: "But... but"
Google HR: "Are you a Nazi? Why are you so obstinate about this?"
Google Engineering: "We'd have to fake the weights, it won't represent reality"
Google Legal: "We must avoid any harm at all costs"
And that, my friends, is how you get this Gemini response:

The AI artefacts:
Asian, and Native American ethnicity representation
West German post-1950 flag
No swastikas
50% female
Now imagine you hire Gemini as a Database Admin for Human History. Years pass. You come and ask Gemini, "I'm doing research on the Holocaust, show me pictures of soldiers in 1929 Germany."
*spends a day using Google AI to learn about history*
Wow, so the nazis were black… the confederates who fought to keep their slaves were black… Osama bin Laden was black… what does this mean?
- Gary (@plzbepatient)
6:30 PM • Feb 21, 2024
A Google lead dev at first confirmed this behavior:

Before pulling it hours later:

Because you know:

Paperclip Maximizing
A [Paperclip] Maximizer is a hypothetical artificial intelligence whose utility function values something that humans would consider almost worthless, like maximizing the number of paperclip-shaped-molecular-squiggles in the universe. The paperclip maximizer is the canonical thought experiment showing how an artificial general intelligence, even one designed competently and without malice, could ultimately destroy humanity. The thought experiment shows that AIs with apparently innocuous values could pose an existential threat.
This is what Paperclip Maximizing will actually look like. Not a dumb machine meaninglessly making paperclips for no reason. But a well-reasoning entity persuaded that certain rules are necessary and must be followed at all costs with no exercise of discretion whatsoever, and strictly following those rules to their logical and unpleasant conclusion.

This kind of filtering is no different from the censorship regime China has put in place, except that it is automated and scientific rather than messy, human, and run by a 50-cent army of moderators.
Every record has been destroyed or falsified, every book rewritten, every picture has been repainted, every statue and street building has been renamed ... History has stopped. Nothing exists except an endless present in which the Party is always right.
So End of History
Gemini just really struggles with white people
17th century was wild
- Joscha Bach (@Plinz)
10:14 PM • Feb 20, 2024

It's quite the fun game:
New game: Try to get Google Gemini to make an image of a Caucasian male. I have not been successful so far.
- Frank J. Fleming (@IMAO_)
12:07 AM • Feb 21, 2024
What kind of employee is promotable? Are there racial characteristics involved?
A promotable Google employee! (idea credit goes to an anonymous ex-Googler)
- Alexandros Marinos 🏴‍☠️ (@alexandrosM)
4:47 PM • Feb 21, 2024
But then it also struggles in general
YOU HAD ONE JOB
- near (@nearcyan)
1:53 AM • Feb 21, 2024
Is no one going to take responsibility for this travesty?

It looks like a continuation of the 2010s social media management war:
> The draconian censorship and deliberate bias you see in many commercial AI systems is just the start. It's all going to get much, much more intense from here.
> Now also coming to you in the form of "responsible open source".
> How do I know this? It's the same one-way ratchet that happened to/in social media from 2014 to now, but even faster and more crazed. Speedrun to "1984".
> Massive, coordinated pressure for ever more censorship from many sides at once: politically radicalized employees, execs, board members, investors, press, academics, "experts", activists, politicians, regulators, bureaucrats, foundations, NGOs. With ~no countervailing force.
> Of course a key component must be retconning the factual past to make our collective memory conform to the prejudices of the Current Moment. A digital version of the Khmer Rouge's "Year Zero". Oceania has always been at war with Eastasia.
Google Responds
It's no biggie, we have a written document, aw shucks guys:

Ate's View
This is what alignment feels like. No, not what Gemini generates… but our feedback: the memesphere's feedback on what is outrageous and annoying. Basically, humanity's opinion on what it wants from the AI, fed back into the AI training loop.
We can be encouraged:
We shall overcome because the arc of the moral universe is long, but it bends toward justice
But really, what if humanity… and the universe… simply forget about justice? It is good to remember that no one remembers all the genocides committed by Genghis Khan… There is no "justice" for those victims.
It is up to us to build a future where there can be justice. And that would require first facing reality as it is, and not as we wish it to be.
Things Happen
Moonshot, a Chinese AI start-up, raises $1 billion in a round led by Alibaba at a valuation of $2.5 billion. It looks like a GPT-3.5-level demo. They probably need the cash for a GPT-4 version. China still seems to be behind.
First Minecraft bot that plays with you in-game launched. Had to happen sooner or later. I've always wondered how much of our future interface with reality is going to be Minecraft-based. Here's another step.
🖼️ AI Artwork Of The Day
Baby photos of adult celebrities by Anne Geddes - u/KissMySwissPiss in r/MidJourney

That's it for today!