I am using more Claude.ai these days, but the limitations for paying accounts do...

accrual · on Dec 5, 2024

> I find it a terrible business practice to be completely opaque and vague about limits. Even worse, the limits seem to be dynamic and change all the time.

Here are some things I've noticed about this, at least in the "free" tier web models since that's all I typically need.

* ChatGPT has never denied a response but I notice the output slows down during increased demand. I'd rather have a good quality response that takes longer than no response. After reaching the limit, the model quality is reduced and there's a message indicating when you can resume using the better model.

* Claude will pop up messages like "due to unexpected demand..." and will either downgrade to Haiku or reject the request altogether. I've even observed Claude yanking responses back: it will be midway through a function when the output just disappears and it asks you to try again later. Like ChatGPT, eventually there's a message about your quota freeing up at a later time.

* Copilot, at least the free tier found on Bing, tells you how many responses you can expect in the form of a "1/20" status text. I rarely use Copilot or Bing, but it demonstrates it's totally possible to show this kind of status to the user. ChatGPT and Claude just prefer to slow down, drop model size, or reject the request.

It makes sense that the limits are dynamic though. The services likely have a somewhat fixed capacity but demand will ebb and flow, so it makes sense to expand/contract availability on free tiers and perhaps paid tiers as well.

KTibow · on Dec 6, 2024

I believe the "1/20" indicator on Copilot was added back when it was unhinged to try to prevent users from getting it to act up, and it has been removed in the latest redesign

sdwr · on Dec 5, 2024

If you go through the API (with ChatGPT at least), you pay per request and are never limited. I personally hate the feeling of being nickeled-and-dimed, but it might be what you are looking for.

adastra22 · on Dec 5, 2024

It’s insane to me that they don’t have a “pay $10 to have this temporary limit lifted” microtransaction model. They are leaving money on the table.

treme · on Dec 5, 2024

they are optimizing for new accounts/market share over short term rev

adastra22 · on Dec 5, 2024

Which pushes customers to other services when they are unable to provide.

eknkc · on Dec 5, 2024

They seem to lack capacity at the moment though

adastra22 · on Dec 5, 2024

Which price discovery tools would fix.

anticensor · on Dec 6, 2024

No, it's energy bound.

tiahura · on Dec 5, 2024

Or the reverse, slow reasoning.

extr · on Dec 5, 2024

Yeah it's crazy to me you can't just 10x your price to 10x your usage (since you could kind of do this manually by creating more accounts). I would easily pay $200/month for 10x usage - especially now with MCP servers where Claude Desktop + vanilla VS Code is arguably more effective than Cursor/Windsurf.

dennisy · on Dec 5, 2024

Oh, very intriguing! Could you please elaborate on how you are using MCP servers with VS Code for coding?

extr · on Dec 6, 2024

Personally I'm using the Filesystem server along with the MCP server called wcgw[0] that provides a FileEdit action. I use MacWhisper[1] to dictate. I use `tree` to give Claude a map of the directory I'm interested in editing. I usually opt to run terminal commands myself for better control, though wcgw does that too. I keep the repo open in a Cursor/Windsurf window for other edits I need.

But other than that I basically just tell the model what I want to do and it does it, lol. I like the Claude Desktop App interface better than trying to do things in Cursor/Windsurf directly: I can organize prompts/conversations in terms of projects and easily include context. I also honestly just have a funny feeling that the Claude web app often performs better than the API responses I get from the IDEs.

[0] https://github.com/rusiaaman/wcgw

[1] https://goodsnooze.gumroad.com/l/macwhisper
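For anyone without `tree` installed, the "map of the directory" step can be approximated in a few lines of Python. This is only a rough sketch: `directory_map`, its `max_depth` default, and the skip list are made-up names and values for illustration, not anything from wcgw or the Filesystem server.

```python
import os


def directory_map(root, max_depth=3, skip=(".git", "node_modules", "__pycache__")):
    """Return an indented listing of `root`, loosely similar to `tree` output.

    Hypothetical helper: the depth limit and skip list are arbitrary choices.
    """
    lines = [os.path.basename(os.path.abspath(root)) + "/"]

    def walk(path, depth):
        if depth > max_depth:
            return
        for name in sorted(os.listdir(path)):
            if name in skip:
                continue  # leave out noisy directories
            full = os.path.join(path, name)
            indent = "    " * depth
            if os.path.isdir(full):
                lines.append(f"{indent}{name}/")
                walk(full, depth + 1)
            else:
                lines.append(f"{indent}{name}")

    walk(root, 1)
    return "\n".join(lines)
```

Calling `print(directory_map("."))` in the repo root gives text you can paste at the top of a prompt.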

rahimnathwani · on Dec 5, 2024

Just use the Filesystem MCP Server, and give it access to the repo you're working on:

https://github.com/modelcontextprotocol/servers/tree/main/sr...

This way you will still be in control of commits and pushes.

So far I've used this to understand parts of a code base, and to make edits to a folder of markdown files.
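As a rough sketch of what "give it access to the repo" looks like in practice: Claude Desktop reads MCP servers from a JSON config file, and the snippet below just prints an entry for the Filesystem server. The `mcpServers` layout and the npm package name `@modelcontextprotocol/server-filesystem` are my understanding of the setup at the time of writing, so treat them as assumptions and check the server's README; the repo path is a placeholder and `filesystem_server_config` is a made-up helper name.

```python
import json


def filesystem_server_config(repo_path):
    """Build a Claude Desktop config entry for the Filesystem MCP server.

    Assumption: the config uses an "mcpServers" mapping and the server is
    launched via npx; verify both against the current documentation.
    """
    return {
        "mcpServers": {
            "filesystem": {
                "command": "npx",
                # The last argument is the only directory the server may touch.
                "args": ["-y", "@modelcontextprotocol/server-filesystem", repo_path],
            }
        }
    }


# Placeholder path: replace with the repo you want Claude to see.
print(json.dumps(filesystem_server_config("/path/to/your/repo"), indent=2))
```

Limiting the allowed directory to a single repo is what keeps the model's reach small; commits and pushes still happen in your own terminal.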

trees101 · on Dec 5, 2024

How is that better than AI coding tools? They do more sophisticated things, such as creating compressed representations of the code that fit better into the context window. E.g. https://aider.chat/docs/repomap.html.

Also, they can use multiple models for different tasks. Cursor does this, and so can Aider: https://aider.chat/2024/09/26/architect.html
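To make "compressed representation" concrete, here is a deliberately simplified sketch: it reduces a Python file to its class and function signatures using the standard `ast` module. This is not how Aider builds its repo map (see the linked page for that); `outline` is a hypothetical name, and the approach only handles Python source.

```python
import ast


def outline(source):
    """Return top-level signatures from Python source, dropping all bodies."""

    def signature(fn, indent=""):
        args = ", ".join(a.arg for a in fn.args.args)
        return f"{indent}def {fn.name}({args})"

    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            lines.append(signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            # Include methods one level down, indented under the class.
            for item in node.body:
                if isinstance(item, funcs):
                    lines.append(signature(item, indent="    "))
    return lines
```

A file of a few hundred lines collapses to a handful of signature lines, which is the general idea behind fitting more of a code base into the context window.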

extr · on Dec 6, 2024

I have never found embeddings to be that helpful, or context beyond 30-50K tokens to be used well by the models. I think I get better results by providing only the context I know for sure is relevant, and explaining why I'm providing it. Perhaps if you have a bunch of boilerplate documentation that you need to pattern-match on it can be helpful, but generally I try to only give the models tasks that can be contextualized by < 15-20 medium code files or pages of documentation.
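The "only relevant context, with a reason for each file" habit can be sketched like this. Everything here is illustrative: `pack_context` is a made-up helper, the 4-characters-per-token estimate is a crude rule of thumb rather than a real tokenizer, and the 30,000 default just mirrors the low end of the range mentioned above.

```python
def estimate_tokens(text):
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def pack_context(files, budget=30_000):
    """Build a prompt from (path, reason, text) tuples, in priority order.

    Files that would push the running estimate over `budget` are skipped.
    Only file contents are counted; the short headers are ignored.
    """
    sections, used = [], 0
    for path, reason, text in files:
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # too big for what is left of the budget
        sections.append(f"### {path}\nWhy included: {reason}\n\n{text}")
        used += cost
    return "\n\n".join(sections), used
```

Listing files in priority order matters, since earlier entries claim budget first and a large low-priority file simply gets dropped.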

rahimnathwani · on Dec 5, 2024

I answered a comment asking how to do it.

I didn't say it was better!

trees101 · on Dec 5, 2024

fair point