Hacker News

I don't care about the benchmarks. I care about how helpful coding agents are for my work, and I can barely tell the difference between this year's models and last year's. Everyone's raving about Opus, but I bet only about 50% of people, no better than a coin flip, would be able to identify it in a blind test against Sonnet.

