How AI Knows Things That No One Told It
https://www.scientificamerican.com/article/how-ai-knows-things-no-one-told-it/
Lightly edited for brevity:
That GPT and other AI systems perform tasks they were not trained to do, giving them emergent abilities, has surprised even researchers who have been generally skeptical about the hype over Large Language Models. "I don't know how they're doing it or if they could do it more generally the way humans do, but they've challenged my views," says Melanie Mitchell, an AI researcher at the Santa Fe Institute.
"It is certainly much more than a stochastic parrot, and it certainly builds some representation of the world, although I do not think that it is quite like how humans build an internal world model," says Yoshua Bengio, an AI researcher at the University of Montreal.
-snip-
Researchers marvel at how much LLMs are able to learn from text. For example, Pavlick and her then Ph.D. student Roma Patel found that these networks absorb color descriptions from Internet text and construct internal representations of color. When they see the word "red," they process it not just as an abstract symbol but as a concept that has a certain relationship to maroon, crimson, fuchsia, rust, and so on. Demonstrating this was somewhat tricky. ...the researchers studied its response to a series of text prompts. To check whether it was merely echoing color relationships from online references, they tried misdirecting the system by telling it that red is in fact green. Rather than parroting back an incorrect answer, the system's color evaluations changed appropriately in order to maintain the correct relations.
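(To make the idea of "internal representations of color" concrete, here is a minimal, hypothetical sketch — not the method from the Patel and Pavlick study. It probes whether related color terms sit closer together than unrelated words in a model's embedding space, using cosine similarity. It assumes the sentence-transformers package and the 'all-MiniLM-L6-v2' model, which are illustrative choices, not anything named in the article.)

```python
# Illustrative sketch only -- NOT the published probing method.
# Idea: if a model encodes color as a concept rather than an abstract symbol,
# embeddings of related color words should be more similar to each other
# than to unrelated words.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed, illustrative model
words = ["red", "crimson", "maroon", "fuchsia", "green", "banana"]
emb = model.encode(words)  # one embedding vector per word

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for w, v in zip(words[1:], emb[1:]):
    print(f"similarity(red, {w}) = {cosine(emb[0], v):.3f}")
# Related color terms typically score higher than unrelated words,
# hinting that the model captures color relationships, not just tokens.
```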
Picking up on the idea that in order to perform its autocorrection function, the system seeks the underlying logic of its training data, machine learning researcher Sébastien Bubeck of Microsoft Research suggests that the wider the range of the data, the more general the rules the system will discover. "Maybe we're seeing such a huge jump because we have reached a diversity of data, which is large enough that the only underlying principle to all of it is that intelligent beings produced them," he says. "And so the only way to explain all of this data is [for the model] to become intelligent."
The article's most unsettling quote comes from a cognitive scientist and AI researcher who says that the emergent abilities of Large Language Models are indirect evidence that we are probably not that far off from Artificial General Intelligence. (If you've read Nick Bostrom's book, this will scare you, because he posits that the transition from AGI to Superintelligence will occur as an uncontrollable "explosion.")