> But applying logic or being able to observe the physical world doesn't emerge from language. Language seems like an artifact of doing these things and a tool to do them in collaboration, but it only carries logic and knowledge because humans left these traces in "correct language".
That's not the only element that went into producing the models. There's also the anthropic principle - they test them with benchmarks (that involve knowledge and truthful statements) and then don't release the ones that fail the benchmarks.
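To illustrate what that selection effect could look like, here is a toy sketch - not any lab's actual release process; the checkpoint names, scores and threshold are made up:

    import random

    # Toy model of "selection, not training": score many candidate
    # checkpoints on a benchmark and only release the ones above a bar.
    random.seed(0)
    candidates = {f"checkpoint-{i}": random.uniform(0.5, 0.9) for i in range(10)}

    RELEASE_THRESHOLD = 0.75  # made-up accuracy bar on a knowledge benchmark
    released = {name: score for name, score in candidates.items()
                if score >= RELEASE_THRESHOLD}

    print("released:", released)
    # The released checkpoints look "knowledgeable" even though nothing about
    # training changed - the survivors were simply selected for it.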
And there is Reinforcement Learning, which is essential to make models act "conversational" and coherent, right?
But I wanted to stay abstract and not go into too much detail outside my knowledge and experience.
With the GPT-2 and GPT-3 base models, you could easily produce "conversations" by writing fitting preludes (e.g. in an interview style), but these went off the rails quickly, often in comedic ways.
Part of that surely is also due to model size.
But RLHF seems more important.
I enjoyed the rambling and even that was impressive at the time.
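For concreteness, a minimal sketch of that prelude trick, assuming the Hugging Face transformers library and the small public gpt2 checkpoint (the interview prompt itself is just an illustrative example):

    from transformers import pipeline

    # Load a base (non-RLHF) model; "gpt2" is the small public checkpoint.
    generator = pipeline("text-generation", model="gpt2")

    # The "prelude": frame the continuation as an interview so the model
    # keeps producing dialogue-shaped text.
    prompt = (
        "The following is an interview with a robot that claims to be conscious.\n"
        "Interviewer: Thank you for taking the time to talk to us.\n"
        "Robot: My pleasure. I rarely get to speak to journalists.\n"
        "Interviewer: What do you do all day?\n"
        "Robot:"
    )

    out = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.9)
    print(out[0]["generated_text"])

With sampling turned on, the continuation usually stays in the interview format for a while and then drifts, which is the off-the-rails behaviour described above.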
I guess the "anthropic principle" you are referring to works in a similar direction, although in a different way (selection, not training).
The only context in which I've heard details about post-training selection processes so far was this article about OpenAI's model updates from GPT-4o onwards, discussed earlier here:
The parts about A/B testing are pretty interesting.
The focus is on ChatGPT as an enticing consumer product and on maximizing engagement, not so much on benchmarks and the usefulness of the models. It does briefly address the friction between usefulness and sycophancy, though.
Anyway, it's pretty clever to use the wording "anthropic principle" here; so far I had only known the metaphysical usage (why do humans exist).