Eh, I mean, that proof is all over its training set. It's a fundamental, basic theorem in probability. You can put the same thing into a search engine and get a better solution, for [example](https://jeremy9959.net/Math-5800-Spring-2020/notebooks/convo...).
Nobody's saying that these aren't fascinating, just that the models don't look like they're getting significantly better and better, the way all the hype wants you to believe.
Transformers + a huge data set is incredible. But we've literally scraped all the data on the web and already made huge sacrifices as an entire society.