Hacker News | chillfox's comments

From my observation, people who use the API either learn to be much more token efficient or switch to a cheaper model.

I have been using the API for the last two years, OpenRouter for personal projects and the Claude API for work, most of it in Zed, always on high thinking. For work I usually spend about $25 if I use Opus/Sonnet all day, and for personal stuff I usually spend $2-$5 for a full evening with Sonnet.

But I don't think someone who is used to not thinking about token cost and efficient usage would get anywhere close to that low a spend if they switched from a plan to the API.
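To make the point concrete, here is a rough back-of-the-envelope sketch of how daily spend scales with token usage. The per-million-token prices and the token counts are placeholder assumptions for illustration, not actual Anthropic pricing; check the provider's current price list.

```python
# Rough sketch of daily API spend from token counts.
# Prices below are hypothetical placeholders, not real Anthropic rates.

def daily_cost(input_tokens, output_tokens,
               in_price_per_m=3.00, out_price_per_m=15.00):
    """Estimated dollars for one day's usage at assumed per-million rates."""
    return (input_tokens / 1_000_000 * in_price_per_m
            + output_tokens / 1_000_000 * out_price_per_m)

# A token-efficient day vs. repeatedly resending bloated context:
print(daily_cost(5_000_000, 600_000))     # prints 24.0
print(daily_cost(30_000_000, 2_000_000))  # prints 120.0
```

The multiplier between the two runs is the whole story: same model, same day of work, very different bill depending on how much context gets resent.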


I got a Claude API key from work that I use with Zed. It works really well and usually ends up costing about $25/day if I use it all day.

I did recently experiment with Copilot for personal stuff, but I will cancel it now that Opus is no longer available on the $10/month plan.


Yes!

And yet, the observable evidence from changes in software that collects metrics directly contradicts this.

Be very careful with that.

Analytics-driven development easily leads to bad outcomes: 1. An important but less frequently used feature gets moved to a hidden spot, leading to even less usage and eventual removal. 2. Poorly functioning features never get the improvements they need, because few people use them precisely because they function poorly.

I have seen these patterns a lot in software where decisions are based on analytics, and I usually stop using that software once I find a replacement.


I don't know what they have done to Claude, but when using it through Copilot it's truly awful compared to using it straight from the API.

I have always just used the API, but I decided to give Copilot a go on the weekend because of the cheap price. And I am seeing weird behavior like I have never seen before... It will somehow fail to use the file-editing tool and then spend an absolutely huge amount of time/tokens building a Python script to apply the edit in a subprocess... And it will spin its wheels on stuff the API routinely gets right in one shot.


This might have been bad timing. The Copilot API broke things last weekend, which caused a lot of tool calls in various agent harnesses to start failing, like the edit tool.

Example Zed issue: https://github.com/zed-industries/zed/issues/54219?issue=zed...


Managing context size and efficient token usage is a skill.

I have an Anthropic API key for work, and if I use Sonnet/Opus all day for agent coding, it ends up costing about $25.

I am going to need more CPU/RAM to run multiple agents in parallel before I can spend much more than that.


GLM is the first open-source model that actually worked for me, where I found the output acceptable.

And yes, Sonnet/Opus are better and what I use daily. But I wouldn't be that upset if I had to drop down to GLM.


I have never seen a model be "lazy" before (I have seen them go for minimal changes). I have been using the models through the API with various agents and no custom system prompt.

So I am curious, how do people get these lazy outputs?

Is it by having one of those custom system prompts that basically tells the model to be disrespectful?

Or is it free tier?

Cheap plans?


I have seen some people complain about a new tendency where it suggests wrapping up the current task even though it isn't done yet. I haven't seen it myself, though.

Usually this gets worse if you have a phrase like "wrap it up" earlier in the output, or if you're a few hundred thousand tokens in without compacting.

In both cases the fix is really simple: just compact.
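For anyone unfamiliar with the term, "compacting" just means replacing old conversation history with a short summary so the context stays under a token budget. Here is a minimal sketch of the idea; the 4-chars-per-token estimate, the budget, and the summary stub are all assumptions for illustration, not any agent's actual implementation.

```python
# Sketch of context compaction: keep the system prompt and the most
# recent messages that fit a token budget; replace dropped history
# with a single summary placeholder. All numbers are assumptions.

def estimate_tokens(text):
    return len(text) // 4  # crude heuristic, not a real tokenizer

def compact(messages, budget=100_000):
    system, history = messages[0], messages[1:]
    kept, used = [], estimate_tokens(system)
    for msg in reversed(history):  # walk newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    dropped = len(history) - len(kept)
    if dropped == 0:
        return messages
    # In a real agent the summary would be model-generated; this is a stub.
    summary = f"[summary of {dropped} earlier messages]"
    return [system, summary] + kept[::-1]
```

Real harnesses summarize the dropped turns with the model itself instead of a placeholder string, but the shape is the same: trim from the oldest end, keep the system prompt, stay under budget.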


Pretty sure it's a harness or system-prompt issue.

I have never seen those "minimal change" issues when using Zed, but I have seen them in Claude Code and Aider. I have been using Sonnet/Opus with high thinking via the API in all the agents I have tested/used.

