qustrolabe 4 months ago \| on: OpenAI may not use lyrics without license, German ...

Post-trained models are strongly inclined to produce responses similar to the ones that earned them a high RL score. It's slightly wrong to keep thinking of LLMs as just next-token predictors sampling from the dataset's probability distribution, as if they were some Markov chain.