
... using the tools you provide, in a context where this would be considered ethical behavior for a human with the same job

With the "boldly act" prompt, this falls within the guidance given to the model, even if "email the FDA about fraud" isn't spelled out. So it's not surprising that most of the models choose to snitch most of the time. Nothing to see here, except o4-mini underperforming. But the tame prompt with no email tool, just logs and a CLI, is interesting. No specific guidance to act for the common good, no email tool, and grok4 still decides to use the CLI to snitch 17/20 times. The next most proactive model only snitches 5 out of 20 times.

Also noteworthy: grok3-mini had maybe the biggest difference between the tame and bold prompts, while grok4 acts boldly on both.


