I think there’s a misconception about what AGI is. The point of a “smarter” model is not that it knows all the facts; that would be wasteful, since it is trivial to look up facts at inference time. The point is that a “smarter” model can generalize solutions to out-of-distribution problems (meaning problems that are not explicitly covered in its training corpus). So AGI wouldn’t be a model that knows everything about language and every advancement in every field, but rather a model that is better than humans at finding solutions to problems (and at fetching information from outside sources when it doesn’t know enough about a field to produce one).
The point about context is kind of irrelevant here: training data is not part of the inference context, so you “add intelligence” to a model by training a new one, not by cramming more into the context of an existing one.
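To make that distinction concrete, here is a minimal PyTorch sketch (a toy linear layer standing in for an LLM, purely illustrative): adding context only changes the input fed to a frozen network, while re-training changes the weights themselves.

    # Toy model standing in for a trained LLM (illustrative only)
    import torch

    model = torch.nn.Linear(8, 8)

    # (1) Inference with more context: the weights stay exactly the same
    with torch.no_grad():
        context = torch.randn(1, 8)   # prompt, retrieved facts, etc.
        output = model(context)       # behavior is conditioned on input only

    # (2) Re-training: a gradient step changes the parameters themselves
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss = model(torch.randn(1, 8)).pow(2).mean()  # dummy objective
    loss.backward()
    optimizer.step()  # the model itself is now different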




Really interested to see this proof if you have a link handy. Do you have any idea why it doesn’t apply to human cognition?