
rlhf: Reinforcement learning from human feedback


How is this pronounced out loud?


I was just saving folks a google, as I had no idea what the acronym was.

I propose rill-hiff until someone who actually knows what they’re doing shows up!



