So, the hidden mental model that the OP is expressing but didn't spell out is that LLMs can be thought of as compressing related concepts into approximately orthogonal subspaces of a vector space whose capacity is bounded by the superposition of all of their weights. Since training has the effect of compressing knowledge into subspaces, a necessary corollary is that there are regions of that vector space which contain very little. Those are the valleys that need to be tunneled through, i.e. the model needs to activate disparate regions of its knowledge manifold simultaneously, which seems like it might be difficult to do. I'm not sure this is a good way of looking at things though, because inference isn't topology, and I'm not sure that abstract reasoning can be reduced to finding ways to connect concepts that were learned in isolation.
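For what it's worth, the "approximately orthogonal subspaces" part is easy to poke at numerically: in high dimensions you can fit many more nearly orthogonal directions than there are dimensions, which is roughly the superposition argument. A minimal sketch in plain NumPy, with made-up sizes and nothing LLM-specific:

    import numpy as np

    rng = np.random.default_rng(0)

    d = 1024   # hidden dimension (made up)
    n = 4096   # number of "concept" directions, 4x the dimension

    # Random unit vectors stand in for learned concept directions.
    v = rng.standard_normal((n, d))
    v /= np.linalg.norm(v, axis=1, keepdims=True)

    # Pairwise cosine similarities; zero out the diagonal (self-similarity).
    cos = v @ v.T
    np.fill_diagonal(cos, 0.0)

    # Mean |cos| comes out around 1/sqrt(d) ~ 0.03, i.e. the directions are
    # close to orthogonal even though there are 4x more of them than dimensions.
    print("mean |cos|:", np.abs(cos).mean())
    print("max  |cos|:", np.abs(cos).max())

This says nothing about where the "valleys" are, but it does make the "more concepts than dimensions" part less mysterious.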
Sometimes things that look very different are actually represented by similar vectors in latent space.
When that happens to us, it "feels like" intuition: something you can't quite put your finger on, and that might take work to put into a form that can be transferred to another human with a different mental model.
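A quick way to poke at that idea directly (the model name and sentence pairs below are just placeholders I picked, not anything from the thread) is to compare embedding similarities, e.g. with the sentence-transformers package:

    # pip install sentence-transformers
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    pairs = [
        # Different surface form, similar underlying idea.
        ("The stock market crashed overnight.",
         "Equity prices collapsed while everyone was asleep."),
        # Similar surface form, different idea, for contrast.
        ("The stock market crashed overnight.",
         "The delivery truck crashed overnight."),
    ]

    for a, b in pairs:
        ea, eb = model.encode([a, b], convert_to_tensor=True)
        print(round(util.cos_sim(ea, eb).item(), 2), "|", a, "vs", b)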
Yes, that also happens, for example when someone first said natural disasters are not triggered by offending the gods. It is all about making explanations as simple as possible, but no simpler.
Not the OP, but my interpretation here is that if you model the replies as points in a vector space, and assume points from a given domain cluster close to each other, then replies that span two domains need to "tunnel" between those two clusters.
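A toy picture of that (pure NumPy, made-up clusters, not real model activations): two tight clusters with a big empty gap between them, where the midpoint of any cross-cluster path is far from every actual point.

    import numpy as np

    rng = np.random.default_rng(1)
    d = 64

    # Two "domains": tight clusters around distant centres.
    center_a = rng.standard_normal(d) * 20
    center_b = rng.standard_normal(d) * 20
    domain_a = center_a + rng.standard_normal((500, d))
    domain_b = center_b + rng.standard_normal((500, d))
    points = np.vstack([domain_a, domain_b])

    # Typical spacing inside a cluster: nearest neighbour of one point.
    x = domain_a[0]
    within = np.linalg.norm(domain_a[1:] - x, axis=1).min()

    # The midpoint of a straight path between the domains sits in empty space:
    # its nearest actual point is many times farther away than that spacing.
    midpoint = (center_a + center_b) / 2
    valley = np.linalg.norm(points - midpoint, axis=1).min()

    print("nearest neighbour within a cluster:", round(within, 1))
    print("nearest point to the midpoint:", round(valley, 1))

Whether a transformer's activations actually behave like this is another question, but that's the geometric picture the "tunneling" metaphor is pointing at.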
I hadn't planned on spending my evening googling the pay grades of government officials, calculating the time taken to change a font in Microsoft Word, and extrapolating that over a year.
That's because a lot of commenters here are not hackers in any real sense; rather, they're software engineers. Perhaps this hasn't always been the case.
Your argument fails right here because you're supposing something that isn't true. LLMs are better than search engines for some things, but you're speaking as if they're a replacement for what came before. They're absolutely not. Reading books — going to the original source rather than relying on a stochastic facsimile — is never going to go away, even if some of us are too lazy to ever do so. Their loss.
Put another way: leaving aside non-practical aspects of the experience, the car does a better job of getting you from A to B than a horse does. An LLM does not 'do a better job' than a book. Maybe in some cases it's more useful, but it's simply not a replacement. Perhaps a combination is best: use the LLM to interpolate and find your way around the literature, and then go and hunt down the real source material. The same cannot be said of the car/horse comparison.
...good question. This (standard) excuse is designed to make you feel bad for potentially insulting someone trying their hardest, but it doesn't make any sense.
But flying machines are well defined, or at least it's easy to come up with a good definition. 'A machine capable of transporting a person from A to B without touching the ground at any point in between', or whatever.
Well, there are paper darts and weather balloons, but most people were interested in a powered machine to transport people. Likewise with AGI, but I'm guessing most people are thinking of something that can do what people do?
People did genuinely struggle to define "useful flying machine", which is why you see the description of the Wright Brothers' flight come with so much detail: "first controlled, sustained flight of a powered airplane".