Since it’s been a controversy on here a couple times, here’s a great example of how you can demonstrate an LLM can produce something it can’t have seen before.
It doesn’t prove anything beyond doubt, but I think these kinds of experiments show to something like a civil law standard that they’re not merely parrots.
They may indeed develop linguistic skills at deeper levels, but LLMs are still only playing with words. Imagine a kid who grew up confined in a library with unlimited books but no experience of the real world outside: no experiments with moving about, bouncing balls, eating, smelling, seeing, hearing, or interacting with others, only reading. That kid might write eloquently but have no ‘common sense’ of reality. To train a real AI with physical sense and capabilities would be like bringing up a kid - messy, not easy to automate, and it takes a long time.
We’re still early in development; we just got surprised by how good statistical language modeling alone is.
I mean, I’ve never seen a toucan, or even been close to one to my knowledge, but I feel like I understand them pretty well, at least at a basic level. I’ve also never known any bird except through smell, sound, touch, sight and so on.
They do still suck at physical tasks, and probably will until we find a totally different approach to the problem than brute gradient descent, but I’m not so sure that makes everything else they (appear to) know useless. I’m not convinced kinetic knowledge is the only kind.