For coding I don’t use any of the previous gen models anymore.
Ideally I would have both fast and SOTA; if I had to pick one, I’d go with SOTA.
There is a report by OpenRouter on what folks tend to pay for; in the coding domain it is generally SOTA. Folks are still paying a premium for those models today.
There is a question of whether there is a bar where coding models are “good enough”; for myself, I always want smarter / SOTA.
FWIW, coding is one of the largest uses for LLMs where SOTA quality matters.
I think the bar for when coding models are "good enough" will be a tradeoff between performance and price. I could be using Cerebras Code and saving $50 a month, but Opus 4.5 is fast enough, and the peace of mind of knowing its quality is higher than Cerebras' open-source models is worth the extra money to me. It might take a while for this gap to close, and what is considered "good enough" will be different for every developer, but this gap certainly cannot exist forever.
I just use a mix: Cerebras Code for lots of fast, simpler edits and refactoring, and Codex or Claude Code for more complex debugging or for planning and implementing new features. It works pretty well. Then again, I move around so many tokens that doing everything with just one provider would need either their top-of-the-line subscription or paying a lot per token some months. There's also the fact that a single model (even SOTA) can never solve all problems; sometimes I also need to pull out Gemini (3 is especially good) or others.
I’m basically only using the Codex CLI now. I switched around the GPT-5 timeframe because it was reliably solving some gnarly OpenTelemetry problems that Claude Code kept getting stuck on.
They feel like different coworker archetypes. Codex often does better end-to-end (plan + code in one pass). Claude Code can be less consistent on the planning step, but once you give it a solid plan it’s stellar at implementation.
I probably do better with Codex mostly due to familiarity; I’ve learned how it “thinks” and how to prompt it effectively. Opus 4.5 felt awkward for me for the same reason: I’m used to the GPT-5.x / Codex interaction style. My co-workers are the inverse: they adore Opus 4.5 and feel Codex is weird.
The roller-mop vacuums are getting incredibly good too, and that is just in the last year.
Just got a Mova z60; it's shocking how much progress has been made even in the last 5 years compared to my old lidar Roborock. The z60 can even hurdle over small barriers.