
Do you find this to still be true with the Sonnet 4.5 model?


IMO Sonnet 4.5 is great but it just isn’t as comprehensive a thinker. I love Anthropic and primarily use CC day to day, but for any tricky problems or “high stakes, this must not have bugs” issues, I turn to Codex. I do find, however, that if you let Codex run on its own too long it will produce the same kind of sloppy or lacking-in-vision issues that people criticize Sonnet for.


That’s a curious approach. Why would you use both? Why not just use the more reliable, dependable option for all purposes?


Sonnet 4.5/CC is faster, more direct, and is generally better at following my intent rather than the letter of my prompt. A large chunk of my tasks are not "solve this concurrency bug" or "write this entire feature" but rather "CLI ops", merging commits, running a linter, deploying a service, etc. I almost use it like it was my shell.

Also while not quite as smart, it's a better pair programmer. If I'm feeling out a new feature and am not sure how exactly it should work yet, I prefer to work with Sonnet 4.5 on it. It typically gives me more practical and realistic suggestions for my codebase. I've noticed that GPT-5 can jump right into very sophisticated solutions that, while correct, are probably not appropriate.

Sonnet 4.5: "Why don't we just poll at an interval with exponential backoff?"

GPT-5: "The correct solution is to include the data in the event stream...let us begin by refactoring the event system to support this..."

That said, if I do want to refactor the event system, I definitely want to use Codex for that.
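For anyone unfamiliar with the simpler option in the Sonnet quote above, here is a minimal sketch of polling with exponential backoff. Everything in it is an assumption for illustration: the function name `poll_with_backoff`, the `check` callback, and the default delays are hypothetical and not taken from any particular codebase or library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_with_backoff(
    check: Callable[[], Optional[T]],
    initial_delay: float = 0.5,
    factor: float = 2.0,
    max_delay: float = 30.0,
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `check` until it returns a non-None value or attempts run out.

    All names and defaults here are hypothetical. The wait between
    attempts is multiplied by `factor` each time, capped at `max_delay`.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        result = check()
        if result is not None:
            return result
        # Do not sleep after the final failed attempt.
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * factor, max_delay)
    return None
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. In production you would probably also add random jitter to the delay so that many clients do not poll in lockstep.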


Strangely enough, this is one of the first times here I've seen someone with the exact same experience. GPT-5 is very prone to a style that, for most codebases, would be overengineering. I think a large part of HN works on huge enterprise FAANG-like code, which is where it shines, so here it gets rave reviews as simply the best overall. But globally, for most developers, it's overengineering and adds a lot of unnecessary code to maintain. Sonnet in that sense remains "every man's coder". I've gone back from 4.5 to 4 now. After spending a good chunk of time with 4.5, it just seems like a slight overall regression with no real upsides besides being a little faster than 4.


Glad I'm not crazy, the tide right now of codex > sonnet is overwhelming. Frankly I think what most people go by is "does the code work" - codex is admittedly relentless. It's very good at producing code that works. But "does it work" is not the end-all-be-all in most cases...


I frequently have multiple coding assistants going at once: Gemini 2.5 Pro via Aider as the workhorse for most standard changes; Sonnet 4.5 via Claude Code for question answering, documentation, test case development, or broad-based changes to many files in a project; and GPT-5 for more complex diagnostic or architectural work. I don’t generally like the code GPT-5 writes, but it can often fix situations where the other models get stuck in some kind of local maximum.


Even inside the claude-code ecosystem, more than ever there are tradeoffs on raw speed vs intelligence vs cost.

Moving a bunch of verbose templated HTML around while watching results on a devserver? Haiku all day. It's a bonus that it's cheaper, but the real treat is its speed.

Adding a feature whose planning will involve intake of several files? Sonnet.

Working specifically on 'copy' or taste issues? Still I tend to prefer Opus here.

Individual experiences may vary!


In my experience, there isn’t a model that is more dependable for all purposes. They each have some unique strengths.


I'm like 80% sure Sonnet 4.5 is just rebranded Opus.

Sonnet 4 was a coding companion, I could see what it was doing and it did what I asked.

Sonnet 4.5 is like Opus, it generates massive amounts of "helper scripts" and "bootstrap scripts" and all kinds of useless markdown documentation files even for the tiniest PoC scripts.


It's very much not, so I'm more than happy to take that bet - how much are we wagering? Have you ever used each for non-coding tasks?

The generation of helper, markdown and bootstrap scripts is very dependent on your harness.


I paid for "Claude Code", I'm not asking it for stuff about the Mesopotamian empire :)


I don't. Sonnet is faster too.


Yes. Sadly. And it really does make me sad. I was rooting for Anthropic. Still kinda am.


I have a very similar experience. I was heavily invested in Anthropic/Claude Code, and even after Sonnet 4.5, I'm finding that Codex is performing much better for my game development project.


It seems particularly good at high performance programming in low level languages.



