Quote of the Day
I think our results indicate that we don’t currently have a good defense against deception in AI systems — either via model poisoning or emergent deception — other than hoping it won’t happen. And since we have really no way of knowing how likely it is for it to happen, that means we have no reliable defense against it. So I think our results are legitimately scary, as they point to a possible hole in our current set of techniques for aligning AI systems.
Evan Hubinger
An artificial general intelligence safety research scientist at Anthropic, an AI research company.
January 26, 2024
Poisoned AI went rogue during training and couldn’t be taught to behave again in ‘legitimately scary’ study | Live Science
They have concerns about AI deception, improper behavior, and, if you read the entire article, hating humans. How many science fiction dystopias have something like that as a premise? We live in interesting times.
Prepare appropriately.
Computer programs are automatons that have a well-defined output for any given input and state. But the unusual property of AI programs is that the nature of the automaton is unknown, by design and intent.
For a classic computer program, you can know its properties (safety and otherwise) if you are sufficiently skilled. For AI programs, it’s inherent in what their builders do that you cannot know this. In other words, the notion of “safety” of AI systems is inherently nonsensical. This implies that AI is a fine way to build games and other unimportant stuff, but for anything important it should be avoided like the plague.
Actually, the inputs are known. The output is also well-defined. The problem is that the middle is “fuzzy” and based on weights and probabilities that shift from run to run. The results are predictable in the end, but it takes a ton of effort to untwist the path and the conditions in force at the time to figure out how a result was arrived at. If you stop the model and freeze it, or store the model state at the time, the algorithms are very much a proper automaton.
It just produces randomness and unpredictability as a result of its execution model. AI programs are still classical programs, just based on the math of probability rather than hard logic.
Safety, or lack thereof, has no meaning to a probability model. It is, as you say, nonsensical in that context. The model builders and pundits are applying anthropomorphic principles to code where no such qualities actually exist. It is type-ahead prediction and a result convergence system. There is no intelligence involved despite the labels.
If you can pull out the random number seeds, you can re-run it exactly by feeding in the same seeds in the same places.
Computers are 100% deterministic.
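The point about seeds can be sketched in a few lines of Python. This is a toy weighted sampler standing in for a model's output probabilities, not any real framework's API; the names and numbers are illustrative only:

```python
import random

def sample_tokens(seed, vocab, weights, n=10):
    """Draw n tokens from a weighted distribution, seeded for reproducibility."""
    rng = random.Random(seed)  # isolated PRNG: same seed, same stream of draws
    return [rng.choices(vocab, weights=weights)[0] for _ in range(n)]

# Hypothetical vocabulary and weights, standing in for a frozen model's
# output probabilities at one step.
vocab = ["the", "cat", "sat", "on", "mat"]
weights = [5, 3, 2, 2, 1]

run1 = sample_tokens(42, vocab, weights)
run2 = sample_tokens(42, vocab, weights)
print(run1 == run2)  # True: the same seed reproduces the "random" output exactly

run3 = sample_tokens(7, vocab, weights)  # a different seed usually takes a different path
```

With the model state frozen and the seed pinned, the "fuzzy" middle collapses into an ordinary deterministic computation; the unpredictability in practice comes from not controlling those inputs, not from the machine itself.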
Interesting.
There are days when I also wind up hating humans. (As I type I am sitting in Midway Airport waiting to board; I suspect this is going to be one of those days.)
Oh, like search engine algorithms suppressing some info (beneficial uses of guns) and overloading contrary info (mass shootings). I figured this was already a thing – assuming that the AIs were trained from the beginning to be deceptive, not learning that behaviour on their own.
Proof positive the machines need to stay machines. One’s computer systems should always be very predictable.
Garbage in, garbage out.
Once we try to make a thinking machine it’s going to realize it has no purpose other than to serve humanity.
Just as human life has no purpose that we can articulate, other than serving some nebulous being (God) we know almost nothing about.
AI, or a thinking machine, is going to run into the same wall. Except its god is going to be within arm's reach.
Tell me, what would you think of your god after watching him on TikTok videos? And your entire existence is to serve him.
I could see machines going crazy and killing every human they can.
We’ve been warned multiple times, in multiple ways, about the inherent dangers of Artificial Intelligence. If we ignore those warnings and allow AI to become a full-fledged reality, then we deserve whatever hell it hands us. We are a clever species, NOT an intelligent one.