Eh, I mean, that proof is all over its training set. It's a fundamental, basic theorem in probability. You can put the same thing into a search engine and get a better solution, for [example](https://jeremy9959.net/Math-5800-Spring-2020/notebooks/convo...).

Nobody's saying that these aren't fascinating, just that their models don't look like they're getting significantly better and better the way all the hype wants you to believe.

Transformers + a huge data set is incredible. But we've literally scraped all the data on the web and made huge sacrifices to our entire society already.