Hacker News | rohtashotas's comments

It's not a silly mistake. It was rlhf'd to do this intentionally.

When the results are more extremist than the unfiltered model's, it's no longer a 'small mistake'.


rlhf: Reinforcement learning from human feedback


How is this pronounced out loud?


I was just saving folks a google, as I had no idea what the acronym was.

I propose rill-hiff until someone who actually knows what they’re doing shows up!


Realistically, it was probably just how Gemini was prompted to use the image generator tool.

