Even if that intuition is correct and they can fix the vending machine with a bit more data and RLHF, I fail to see where the "super" comes in here. How the fk are they going to get superintelligent training data? A time machine?
You're getting at a deep point of disagreement: should we expect a modern or near-future LLM to be limited by the intelligence of the people who generated its training data? I don't think anyone claims to have a provably correct answer. There's one intuition that says yes (how can a statistical average of N people's most likely responses be smarter than any of those N?) and another that says no (why should it be impossible to derive new insights from data collected by people who didn't have those insights?).