OK, so you'll have to help me here; I'm still learning this stuff.
I looked up RLHF. Is it really useful? The average human has zero general expertise because people are specialized (I know nothing about, say, 1960s avant-garde French cinema, and my responses in a conversation about it would be garbage; given the breadth of human knowledge, even the most accomplished scholars are useless for over 99% of it). Won't there be a drop in quality? How is this accounted for?
If chat systems simply gave the most popular answers, they would cease to be useful real fast.