I was really hoping someone would reproduce the tests and validate the claims from Microsoft's "Sparks of AGI" paper [1]. This video does just that (and more).

Let me provide one important quote from the video (from 18:50):

"I personally reproduced every single example from Microsoft, and while all the capabilities of GPT-4 were not necessarily over exaggerated, but the difference between GPT-3.5 and GPT-4 does feel a bit over-inflated [...] one of my quibbles with the Microsoft paper: they give the impression that GPT-4 is an even bigger step for AI than I think is realistically true."

Kudos!

[1] https://arxiv.org/abs/2303.12712