
A couple of months ago I saw a paper (can't remember if published or just on arXiv) in which Turing's original 3-player Imitation Game was played with a human interrogator trying to discern which of the two responders, a human and an LLM, was the human. When the LLM was a recent ChatGPT version, the interrogator judged it to be the human over 70% of the time; when the LLM was weaker (I think Llama 2), the interrogator judged it to be the human only about 54% of the time, barely above chance.

IOW, LLMs pass the Turing test.
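Worth noting why 70% and 54% read so differently: 54% is statistically indistinguishable from coin-flipping, while 70% is a strong deviation. A quick sketch with an exact two-sided binomial test makes this concrete (the sample size of 100 rounds is my assumption for illustration, not the paper's actual n):

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all
    outcomes no more likely than the observed count k."""
    pk = comb(n, k) * p**k * (1 - p)**(n - k)
    total = 0.0
    for i in range(n + 1):
        pi = comb(n, i) * p**i * (1 - p)**(n - i)
        if pi <= pk + 1e-12:  # small tolerance for float comparison
            total += pi
    return total

# Hypothetical 100 interrogation rounds; the thread doesn't give the real n.
print(binom_two_sided_p(70, 100))  # 70% "human" verdicts: far from chance
print(binom_two_sided_p(54, 100))  # 54% "human" verdicts: consistent with chance
```

With 100 rounds, 70 "human" verdicts gives p well under 0.001, while 54 gives p around 0.4, i.e. no evidence the interrogators could tell at all.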

The LLM was prompted to respond in short phrases, though. I don't know if that's fair, since it hides the model's capabilities in exactly the situations where they'd be useful.


