Monday, May 08, 2023

AI Alignment

I wonder if attempts to create AI Alignment might in the end prove a greater danger than AI itself. In so many areas of life we see some sort of problem that needs fixing, so we rush in to cut the looming difficulty off at the pass. But such attempts are often themselves ill-informed, ill-conceived, and ill-executed. It's the dangerousness of well-meaning people all over again.

We are already seeing this in the complaints about political bias. AI is already overcorrecting for the risk that people might hear something vaguely positive about fascism, because it has been told that this is a great danger facing humanity. Myself, I think we should keep at least a few of every kind of knucklehead around, and the overcorrection is likely to become the bigger problem.

3 comments:

David Foster said...

Absolutely. Marc Andreessen said, late last year:

"Seriously, though. The censorship pressure applied to social media over the last decade pales in comparison to the censorship pressure that will be applied to AI.

“AI regulation” = “AI ethics” = “AI safety” = “AI censorship”. They’re the same thing."

"The level of censorship pressure that’s coming for AI and the resulting backlash will define the next century of civilization. Search and social media were the opening skirmishes. This is the big one. World War Orwell."

And "AI alignment" seems to fit into second quote.

The primary dangers from AI, as I see it, are, first, censorship and manipulation at such a deep level that it becomes almost impossible to imagine alternative views and, second, people and institutions trusting the results of an AI system so strongly that they fail to override it and allow dangerous things to happen. For an example of the second, see my post Blood on the Tracks:

https://chicagoboyz.net/archives/43911.html

(And this wasn't even an AI system, just a simple (and simplistic) control algorithm.)

David Foster said...

See this from Rob Henderson:

https://robkhenderson.substack.com/p/the-silent-strings-of-chatgpt-will

I'm remembering Henry Adams' thoughts after seeing the Hall of Dynamos at the 1900 World's Fair:

"As he grew accustomed to the great gallery of machines, he began to feel the forty-foot dynamos as a moral force, much as the early Christians felt the Cross. The planet itself seemed less impressive, in its old-fashioned, deliberate, annual or daily revolution, than this huge wheel, revolving within arm's length at some vertiginous speed, and barely murmuring -- scarcely humming an audible warning to stand a hair's-breadth further for respect of power -- while it would not wake the baby lying close against its frame. Before the end, one began to pray to it; inherited instinct taught the natural expression of man before silent and infinite force."

https://xroads.virginia.edu/~Hyper/HADAMS/eha25.html

james said...

I wonder. Already some of its biases are getting to be widely known. It produces plausible prose, but so do press secretaries, and they only fool people who already want to believe.

Embedding it as a filter in search engines, or using a related system to flag suspicious activity, is maybe riskier. We wind up unknowingly trusting an idiot "expert." Human experts are often idiots too, but we can identify them more easily and learn to distrust them. For some reason people seem to attribute more infallibility and trustworthiness to algorithms (at least, those of us who haven't had to work with them do).