Hacker News | swalsh's comments

I'd assume it's not up to par with Qwen-3.5 then, which has been distilling Claude, and the quality of the model is probably a direct result of that.

It's amazing how quickly I've become accustomed to being a Max subscriber. I don't think I could go back to Pro.

Then max+, then ultra, then ultra pro

As long as they provide the same utility / $ I don’t see why not. It’s not like the open weight models are that far behind, and Claude Code itself shouldn’t be very hard for the community to replicate if Anthropic starts acting up too much.

Open source models, especially Qwen, are pretty dang good. But they're not Opus 4.6, and the evals don't tell the full story. I question the assumption that open source models are 3-6 months behind.


It's not just about the quality of output. You can also fine-tune them to proprietary needs, if the skill sets are there internally, to make them better without governance risks. So being SOTA doesn't matter as much: generalized tasks are not what matter most to companies; it's the specialization relative to business needs or internal datasets.


To make an extreme comparison, desktop Linux was originally supposed to happen in 1999.


Maybe I misspoke by saying open source.

The larger point I'm making is I think models are rapidly becoming commoditized. There is probably a small market long term that's willing to pay 10x for 10% marginal gains, but the majority of the buyers in the market will be economic and we're likely to have a lot of folks willing to spend 1/10 the cost for 90% of the performance, and plenty of companies that haven't raised hundreds of billions-trillions who can provide that.
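To put rough numbers on that trade-off, here is a minimal sketch. The prices and quality scores are hypothetical units chosen only to match the "10x the price for 10% more" framing; they are not measurements of any real model.

```python
# Hypothetical units: a frontier model at price 10 with quality 1.0,
# versus a commodity model at 1/10 the price with 90% of the quality.
def quality_per_dollar(quality: float, price: float) -> float:
    """Quality delivered per unit of spend."""
    return quality / price

frontier = quality_per_dollar(quality=1.0, price=10.0)
commodity = quality_per_dollar(quality=0.9, price=1.0)

# How much more quality per dollar the cheaper option delivers.
ratio = commodity / frontier
print(f"commodity delivers about {ratio:.0f}x the quality per dollar")
```

Under these assumed numbers the cheaper model wins on quality per dollar by roughly 9x, which is the gap an economic buyer would be weighing against the last 10% of performance.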

A lot of the frontier labs' valuations have been based on an assumption that 1-2 companies would get break-away intelligence that basically made them economic chokepoints indefinitely into the future. The reality that's becoming increasingly clear is that model quality is a pretty linear function of (cash burned - ability to copy others' homework), and the economics are starting to look a lot more like airlines than online advertising.


Let's go one step further.

The economics of airlines are such that they generally earn a return on capital less than cost of capital.

I think this is exactly where we are heading, and OAI and Anthropic are the Concordes.


Not OP, but it is a known fact that the cumulative profits of the airline industry (in the US) over its history have been basically zero. We can say that airlines are essentially in business to support other businesses. I believe this is what OP might've been referring to.

Fuck, I loved Grok 4.1, it was a really capable model for the money.

I'd run agents consuming hundreds of millions of tokens for less than a hundred dollars.
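For a sense of scale, that works out to well under a dollar per million tokens. A quick sketch, assuming 200 million tokens as the low end of "hundreds of millions" and $100 as the ceiling on spend (both figures are assumptions, since the comment gives no exact numbers):

```python
def usd_per_million_tokens(total_usd: float, total_tokens: int) -> float:
    """Blended price per million tokens for a given total spend."""
    return total_usd / (total_tokens / 1_000_000)

# Assumed figures: at least 200M tokens, at most $100 spent.
upper_bound = usd_per_million_tokens(total_usd=100.0, total_tokens=200_000_000)
print(f"at most ${upper_bound:.2f} per million tokens")
```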


Billions in revenue just before your IPO isn't a bad deal either.


The icing on the cake for Elon is that it strengthens the competition to OpenAI.

Or is that actually his main motivation? Hard to know. Either way it's a win-win-win for him.


That's certainly one way one could spin this.

I guess loosing a ton of money then trying to get some of it back makes you a genius...


Yeah real geniuses go down with the ship and never change what they set out to do


Elon has many many faults but "loosing" money doesn't appear to be one of them. He's literally the richest person alive!


There's always money in the gigawatt datacenter


Models are a commodity. Let's say Elon actually figures out building datacenters in space, or maybe he continues to be the leader of building Earth-based datacenters. It's probably better business not to have yourself as your only customer. Dogfood it, and open it to all.


> Elon actually figures out

Elon doesn't figure out anything. He pays people to do it and then tries to take the credit.


The first is impossible and the second isn't happening and won't happen.


I wouldn’t say impossible, just not effective.


the leader of building earth-based datacenters lol

what are we even talking about


I have never come close to my weekly limit, but have hit my hourly limit frequently.


Same. I hit limits after 45 minutes. I'm on a measly Pro plan. I'm usually building small, open source projects, often from scratch. I only work on these projects in a 2-hour window in the morning. This is my "free time" development. I hope this change helps, because I was days away from switching back to Codex, though I like Claude Code a bit better these days.

I also hope that the fact I had OpenClaw in my sandbox once is not why I hit these limits so damn fast. I don't use it anymore and I've tried to rid my sandbox of anything "openclaw" but it is in my git history in various places on various projects. Claude doesn't seem to be transparent about this limitation.


You should definitely try:

- Codex

- OpenCode Go

- Ollama Cloud

All are very useful, still a subscription, but with higher usage limits.

Specific model providers also offer subscriptions, such as Z.ai for GLM.

Using DeepSeek, Kimi etc. through OpenRouter or from them directly is also great, here you pay per token but it's still more usage overall.


Are you using Haiku for most tasks? I'm in the Google ecosystem so I'm curious how it is on the other side.


Nope, I use Opus 4.7, mostly. Sometimes Sonnet 4.6 if I’m trying to use fewer tokens.


For me it's the opposite. I almost never hit the hourly limit, but I hit the weekly limit in about 5 days.


Would be more meaningful if everyone said what plan they are on, as there are 3 different ones that users could be discussing.


Along with how many 5-hour windows they use in a day.

If you're using it 24/7 then yes, I'm sure the weekly limit is more of a concern.

If you're just using it during working hours, i.e. you only use two 5-hour windows per day, then you probably, like me, struggle to hit the weekly limit even if you do max out some 5-hour windows.
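A back-of-the-envelope version of that argument, as a sketch. It assumes windows could in principle be started back to back around the clock and that "working hours" means 5 days a week; how the weekly limit is actually sized relative to the 5-hour windows is not something anyone in this thread has stated.

```python
HOURS_PER_WEEK = 7 * 24
WINDOW_HOURS = 5  # the 5-hour usage window discussed above

def windows_used_per_week(windows_per_day: int, days_per_week: int) -> int:
    """Number of 5-hour windows a user actually opens in a week."""
    return windows_per_day * days_per_week

# Theoretical ceiling: windows chained back to back, 24/7 (assumption).
max_windows = HOURS_PER_WEEK / WINDOW_HOURS

# Working-hours user: two windows a day, five days a week (assumption).
working_hours_windows = windows_used_per_week(windows_per_day=2, days_per_week=5)

share = working_hours_windows / max_windows
print(f"{working_hours_windows} of {max_windows:.1f} possible windows ({share:.0%})")
```

Under those assumptions a working-hours user touches only about 30% of the windows a round-the-clock user could, which would be consistent with maxing out individual windows while staying far from the weekly cap.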


Last week with Claude I saturated a team premium seat on day 6 of its cycle and a Max 20x seat on day 4, plus ~$150 of extra usage spend, in a 60-hour work week where I am not even primarily an IC. I also saturated a Codex 20x plan on day 3 with a personal project.


Hit weekly limits all the time with Pro. Too cheap to go for Max.


I'm on the $200 Max plan.


What does your usage look like day to day? Are you using it at a low level all day long? I'm with the others here: I've never hit the weekly limit, only the hourly, and I consider myself a heavy user.


I dedicate a significant amount of time to defining the precise actions that agents should perform (PRD/ADR). I break down the feature sets into Milestones and slices (tasks). These tasks are small, well-defined, and scoped. I have a prompt template that the “architect” agent prepares whenever I want to initiate a new feature. This ensures that the prompt structure remains consistent and standardized over time. The generated prompt is then pasted to the “orchestrator,” which performs context discovery (using Repoprompt) and finalizes the plan then proceeds to launch subagents to do the work.

Based on the size and complexity of the task, as well as any inter-task dependencies, the orchestrator deploys one or more subagents (sometimes 5 or 6 subagents) to work on these mini tasks. Once all tasks are completed, the orchestrator initiates verification and launches a review workflow. This workflow uses the original prompt, acceptance criteria, repository internal guidelines, and relevant skills to conduct a thorough review of the agents’ work.

Typically, there are one or two review iterations, during which the review agent identifies any issues. Sometimes, I may also notice issues and have to "steer" the orchestrator. The time required for a slice to complete ranges from 30 minutes to 4 or 5 hours, depending on its size, complexity, and the number of subtasks it contains.

Only if I run about three such orchestrations in parallel do I reach the hourly limit.
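For anyone trying to picture the flow, here is a minimal sketch of the orchestrate, fan out, review loop described above. It is not the poster's actual setup: `run_subagent` and `review` are hypothetical stand-ins that do no model calls, the task IDs are illustrative labels, and the dependency handling uses Python's standard `graphlib` as one plausible way to respect inter-task dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

def run_subagent(task: str) -> str:
    """Hypothetical stand-in for launching a subagent on one small task."""
    return f"{task}: done"

def review(results: dict[str, str]) -> list[str]:
    """Hypothetical reviewer: returns the tasks that need rework."""
    return [task for task, out in results.items() if not out.endswith("done")]

def orchestrate(deps: dict[str, set[str]],
                max_parallel: int = 6,
                max_reviews: int = 2) -> dict[str, str]:
    """Run tasks in dependency order, in parallel where possible, then review."""
    sorter = TopologicalSorter(deps)  # maps each task to its prerequisites
    sorter.prepare()
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())  # tasks whose prerequisites are finished
            for task, out in zip(ready, pool.map(run_subagent, ready)):
                results[task] = out
                sorter.done(task)
        # One or two review iterations, re-running anything the reviewer flags.
        for _ in range(max_reviews):
            flagged = review(results)
            if not flagged:
                break
            for task, out in zip(flagged, pool.map(run_subagent, flagged)):
                results[task] = out
    return results

# Illustrative slice: P2 and P3 depend on P1, and P4 depends on both.
slice_e1 = {
    "E1-P1": set(),
    "E1-P2": {"E1-P1"},
    "E1-P3": {"E1-P1"},
    "E1-P4": {"E1-P2", "E1-P3"},
}
results = orchestrate(slice_e1)
print(list(results))
```

In this sketch P2 and P3 run concurrently once P1 finishes, which is roughly where the token burn would multiply if several such orchestrations ran side by side.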


I have found that it uses a lot more tokens if I give it a very detailed todo and loop over every task one by one. I now keep it to phases with detailed tasks underneath and use /loop over the phases, and it uses a lot less. I also manage the context window and tend to clear it often to keep it under around 200k (or less, depending on project size).


Yeah, I do that too. Essentially, the system I described begins working on a task that is small enough and clearly defined. Each “slice” in a milestone usually has 5-10 subtasks (for instance, Slice E1 has P1...P6 subtasks). The orchestrator then receives the prompt to implement E1-P1.


It sounds like you are describing oh my open agent


I use Repoprompt's workflows for this. They are pretty good.


That’s because the week ends before you can use them because you’re waiting for your hourly resets. Now the week essentially got longer with the same limit


I hit my weekly limit in 3 days this week. Occasionally I hit it in 5. With the top Max sub.


Wow, then you are most likely doing something very wrong.


No, I'm just using it a lot. It's productive enough that I've found it worthwhile tacking on subs for GLM 5.1 and Kimi as well (GLM is fantastic, Kimi is good when it works but temperamental)


Same. I struggle to use more than half of my weekly limit, even if I max out my 5-hour windows regularly during the day.


It amazes me how often people try to build AI systems relying on nothing more than the model's knowledge. I suspect a great deal of the "failed" AI experiments we keep reading about are people just not having any idea how to use AI for what it's good at.


Try running with Open Code. It works quite well.


I had an equally painful experience with Open Code. I don't think the harness is the issue. It's the need for a large context window and slow inference.

