valine on Jan 30, 2025 \| on: OpenAI says it has evidence DeepSeek used its mode...

The RL is done on problems with verifiable answers. I’m not sure how o1 slop would be at all useful in that respect.