
This would explain the LLM implementing the feature in a way you didn't prefer. But this does not explain why Sonnet would deliver a broken implementation that does not work in even the most basic sense.

Also, there is a point at which the time it takes to develop a prompt, let the agent run, review its output, and go through iterative loops to correct errors or implementation problems exceeds the time it would take me (a lazy human) to achieve the same end result.
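
A rough back-of-the-envelope version of that threshold in Python, with entirely made-up numbers just to show the shape of the comparison:

    # All numbers are hypothetical; the point is the comparison, not the values.
    prompt_min, run_min, review_min = 10, 5, 15   # per-iteration cost of prompting, waiting, reviewing
    iterations = 3                                 # correction loops before the output is acceptable
    manual_min = 75                                # time to just write the code yourself

    agent_total = iterations * (prompt_min + run_min + review_min)
    print(f"agent: {agent_total} min, manual: {manual_min} min")
    # agent: 90 min, manual: 75 min -> past this point the agent is a net loss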

Pair this with the bypassing of the generation effect, reduced prefrontal dopamine, and increased working memory load (in part due to minimal motor-cognitive integration), and AI-generated code in contexts with legal and financial repercussions can be a much worse deal than using your own fingers.



> But this does not explain why Sonnet would deliver a broken implementation that does not work in even the most basic sense.

Depends not just on the prompt but also on the tooling and environment you use. Somebody using the Claude Code CLI may get a totally different experience than somebody using Copilot via VS Code.

What do I mean by that? Look at how Copilot tries to save money by reading content only in small parts: reading file X lines 1-50, then X lines 51-100, ... and it starts working with that. Only if it finds a hint that something relevant lives somewhere else will it read in more context.

What I often see is that it misses context because it reads in so little information, and if there is no hint in your code or docs, it stops there. It runs a local test on the code, the test passes, done... while it has technically broken your application.
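
For illustration, a minimal Python sketch of that windowed reading pattern; the 50-line window matches the behaviour described above, while the file path is made up:

    # Mirrors the "reading file X, lines 1-50" behaviour; the path is hypothetical.
    def read_window(path, start_line, window=50):
        """Return lines [start_line, start_line + window) of a file, 1-indexed."""
        with open(path, encoding="utf-8") as fh:
            return fh.readlines()[start_line - 1 : start_line - 1 + window]

    chunk = read_window("api/orders.py", 1)
    # The agent only asks for the next window if something in this chunk hints at it;
    # callers living in other files never show up here.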

Example: if I tell it to refactor an API, it never checks whether that API is used anywhere else, because it only reads that API's code. So I need to manually remind it in the prompt that "the API is used elsewhere in the system". And then it does its searching... Found 5 files, Read X line 1...
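
The check it skips is essentially a repo-wide search for callers. A minimal sketch of that, assuming a Python codebase; "src" and "create_order" are made-up placeholders:

    # Walk the repo and list every line that references the symbol being refactored.
    import os

    def find_usages(root, symbol):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                if not name.endswith(".py"):
                    continue
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if symbol in line:
                            yield path, lineno, line.strip()

    for path, lineno, line in find_usages("src", "create_order"):
        print(f"{path}:{lineno}: {line}")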

And plop, good working code... So if you know this limitation, you can get very far with a basic $10 Copilot plan using the Claude agent.

Whereas a $200 Claude Code plan will give you a better experience out of the box, as it reads in a ton more. The same applies to GPT-5/Codex, which seems more willing to read in larger parts of your project, resulting in less incomplete code.

This is just anecdotal from my point of view, but as with any LLM, hinting matters a lot. It's less about writing a full prompt with a ton of text and more about including the right "do not forget about function X, module Y, and test Z". And Claude on Copilot loves those hints precisely because of that limited reading.



