Modern LLMs are also "essentially only trained to predict the next word, based on what they've seen on the internet."


Well, sort of! The new ones all "sound like AI." They're not great for writing, especially in terms of style and tone.

I'm not sure exactly how that works, but apparently instruct tuning produces mode collapse, whereas the older models are basically "raw unfiltered internet" (for better and worse).
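For a concrete (if crude) sense of what "mode collapse" looks like in practice, here's a toy sketch: sample the same prompt many times and check how much the outputs repeat each other. The distinct n-gram ratio is just an illustrative diversity proxy I'm using here, and the sample outputs are made up, not from a real model.

    # Toy sketch of one way to quantify "mode collapse": sample the same
    # prompt repeatedly and measure how much the outputs repeat each other.
    # The distinct n-gram ratio is a crude diversity proxy, for illustration only.
    def distinct_ngram_ratio(samples, n=2):
        """~1.0 = varied phrasing across samples; ~0.0 = same stock phrases."""
        ngrams = []
        for text in samples:
            tokens = text.split()
            ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
        return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

    # Hypothetical outputs: a collapsed model keeps reaching for the same opener,
    # while a base model's continuations vary a lot more.
    collapsed = ["Sure, here is a story about a dragon."] * 5
    varied = [
        "The dragon slept beneath the granary for a hundred years.",
        "Nobody in the village remembered who had fed the dragon first.",
        "A dragon's hoard is mostly paperwork, as it turns out.",
        "She found the dragon's egg wedged under the floorboards.",
        "The last dragon worked nights at the lighthouse.",
    ]
    print(distinct_ngram_ratio(collapsed), distinct_ngram_ratio(varied))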

I've been playing around with Davinci and it's remarkable how much better the writing is.

You have to prompt it right, because all it does is continue the text. (E.g. if you ask it a question, it might respond with more questions.) But I think we really lost something valuable when we started neglecting the base/text models.
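To make that concrete, here's a rough sketch of continuation-style prompting using the legacy Completions endpoint of the OpenAI Python SDK. The model name davinci-002 and the prompt text are just assumptions for illustration; swap in whichever base/text model you actually have access to.

    # Rough sketch: prompting a base/text model so it *continues* the text,
    # rather than answering a question. Assumes the OpenAI Python SDK (>= 1.0)
    # and a base completion model such as davinci-002.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Write the opening of the document you want, not a question --
    # a question may just get continued with more questions.
    prompt = (
        "The difference between prose that feels alive and prose that feels "
        "machine-generated comes down to three things: "
    )

    resp = client.completions.create(
        model="davinci-002",
        prompt=prompt,
        max_tokens=200,
        temperature=0.8,
    )
    print(prompt + resp.choices[0].text)

You can push this further with a few example passages in the style you want, since the base model will imitate whatever pattern the prompt sets up.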


That's only true until they've undergone preference tuning and RL post-training (which is expected to start consuming more training compute than next-token-prediction pre-training).



