swalsh · 2026-06-02T21:45:25 1780436725

I'd assume it's not up to par with Qwen-3.5 then, which has been distilling Claude, and the quality of the model is probably a direct result of that.

swalsh · 2026-05-28T22:17:20 1780006640

It's amazing how quickly I've become accustomed to being a Max subscriber. I don't think I could go back to Pro.

galkk · 2026-05-29T04:46:29 1780029989

Then max+, then ultra, then ultra pro

stefanfisk · 2026-05-29T06:13:34 1780035214

As long as they provide the same utility / $ I don’t see why not. It’s not like the open weight models are that far behind, and Claude Code itself shouldn’t be very hard for the community to replicate if Anthropic starts acting up too much.

swalsh · 2026-05-27T20:59:55 1779915595

Open source models, especially Qwen, are pretty dang good. But it's not Opus 4.6, and the evals don't tell the full story. I question the assumption that open source models are 3-6 months out.

Ucalegon · 2026-05-27T21:27:37 1779917257

It's not just about the quality of output; you can also fine-tune them for proprietary needs, if the skill sets are there internally, to make them better without governance risks. So being SOTA doesn't matter as much, since generalized tasks are not what matters most to companies; it's the specialization relative to business needs or internal datasets.

oblio · 2026-05-27T21:33:29 1779917609

To make an extreme comparison, desktop Linux was originally supposed to happen in 1999.

simplyluke · 2026-05-27T21:53:06 1779918786

Maybe I misspoke by saying open source.

The larger point I'm making is that I think models are rapidly becoming commoditized. There is probably a small long-term market that's willing to pay 10x for 10% marginal gains, but the majority of buyers will be economic. We're likely to have a lot of folks willing to spend 1/10 the cost for 90% of the performance, and plenty of companies that haven't raised hundreds of billions to trillions who can provide that.

A lot of the frontier labs' valuations have been based on an assumption that 1-2 companies would get break-away intelligence that basically made them economic chokepoints indefinitely into the future. The reality that's becoming increasingly clear is that model quality is a pretty linear function of (cash burned - ability to copy others' homework), and the economics are starting to look a lot more like airlines than online advertising.
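To put rough numbers on the 10x-for-10% comparison above, here is a minimal sketch. The prices and quality scores are made-up illustrative values, not figures for any real model, and `ModelOffer` and `quality_per_dollar` are hypothetical names.

```python
# Illustrative only: hypothetical prices and quality scores, not real model data.
from dataclasses import dataclass


@dataclass
class ModelOffer:
    name: str
    cost_per_mtok: float  # hypothetical dollars per million tokens
    quality: float        # hypothetical benchmark-style score, 0..100


def quality_per_dollar(offer: ModelOffer) -> float:
    """Quality points bought per dollar spent on a million tokens."""
    return offer.quality / offer.cost_per_mtok


# "1/10 the cost for 90% of the performance"
frontier = ModelOffer("frontier", cost_per_mtok=10.0, quality=100.0)
budget = ModelOffer("budget", cost_per_mtok=1.0, quality=90.0)

ratio = quality_per_dollar(budget) / quality_per_dollar(frontier)
print(f"budget model delivers {ratio:.1f}x the quality per dollar")
```

Under these assumed numbers the cheaper model buys roughly nine times as much quality per dollar, which is the gap an economic buyer would be weighing against the last 10% of performance.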

grttq · 2026-05-27T23:43:35 1779925415

Let's go one step further.

The economics of airlines are such that they generally earn a return on capital less than cost of capital.

I think this is exactly where we are heading, and OAI-Anthropic are the Concordes.

extraextra · 2026-05-29T16:25:31 1780071931

Not OP, but it is a known fact that the cumulative profits of the airline industry (in the US) over its history have been basically 0. We can say that airlines are essentially in business to support other businesses. I believe this is what OP might've been referring to.

swalsh · 2026-05-06T17:25:51 1778088351

Fuck, I loved Grok 4.1, it was a really capable model for the money.

I'd run agents consuming hundreds of millions of tokens for less than a hundred dollars.

swalsh · 2026-05-06T17:24:25 1778088265

Billions in revenue just before your IPO isn't a bad deal either.

fancyfredbot · 2026-05-06T18:09:11 1778090951

The icing on the cake for Elon is that it strengthens the competition to OpenAI.

Or is that actually his main motivation? Hard to know. Either way it's a win-win-win for him.

throwa356262 · 2026-05-06T20:09:10 1778098150

That's certainly one way one could spin this.

I guess loosing a ton of money then trying to get some of it back makes you a genius...

scottyah · 2026-05-07T00:25:20 1778113520

Yeah real geniuses go down with the ship and never change what they set out to do

fancyfredbot · 2026-05-07T06:58:56 1778137136

Elon has many many faults but "loosing" money doesn't appear to be one of them. He's literally the richest person alive!

swalsh · 2026-05-06T17:22:55 1778088175

There's always money in the gigawatt datacenter

swalsh · 2026-05-06T17:21:37 1778088097

Models are a commodity. Let's say Elon actually figures out building datacenters in space, or maybe he continues to be the leader of building earth-based datacenters. It's probably better business not to have yourself as your only customer. Dogfood it, and open it to all.

driverdan · 2026-05-07T01:05:06 1778115906

> Elon actually figures out

Elon doesn't figure out anything. He pays people to do it and then tries to take the credit.

mplewis · 2026-05-06T17:22:24 1778088144

The first is impossible and the second isn't happening and won't happen.

croes · 2026-05-06T17:25:49 1778088349

I wouldn’t say impossible, just not effective

nextstep · 2026-05-06T17:29:48 1778088588

the leader of building earth-based datacenters lol

what are we even talking about

swalsh · 2026-05-06T17:15:59 1778087759

I have never come close to my weekly limit, but have hit my hourly limit frequently.

codazoda · 2026-05-06T17:58:56 1778090336

Same. I hit limits after 45 minutes. I'm on a measly Pro plan. I'm usually building small, open source projects, often from scratch. I only work on these projects in a 2-hour window in the morning. This is my "free time" development. I hope this change helps, because I was days away from switching back to Codex, though I like Claude Code a bit better these days.

I also hope that the fact I had OpenClaw in my sandbox once is not why I hit these limits so damn fast. I don't use it anymore and I've tried to rid my sandbox of anything "openclaw" but it is in my git history in various places on various projects. Claude doesn't seem to be transparent about this limitation.

bryanhogan · 2026-05-07T12:43:32 1778157812

You should definitely try:

- Codex

- OpenCode Go

- Ollama Cloud

All are very useful, still a subscription, but with higher usage limits.

Some model providers also offer their own subscriptions, such as Z.ai for GLM.

Using DeepSeek, Kimi, etc. through OpenRouter or directly from the providers is also great. There you pay per token, but you still get more usage overall.

piyh · 2026-05-06T18:04:34 1778090674

Are you using haiku for most tasks? I'm in the Google ecosystem so I'm curious how it is on the other side.

codazoda · 2026-05-06T18:53:45 1778093625

Nope, I use Opus 4.7, mostly. Sometimes Sonnet 4.6 if I’m trying to use fewer tokens.

mirzap · 2026-05-06T17:16:49 1778087809

For me it's the opposite. I almost never hit the hourly limit, but I hit the weekly limit in about 5 days.

nickthegreek · 2026-05-06T17:37:18 1778089038

Would be more meaningful if everyone said what plan they are on, as there are 3 different ones that users could be discussing.

jizzywizzy · 2026-05-06T21:20:01 1778102401

Along with how many 5-hour windows they use in a day.

If you're using it 24/7 then yes, I'm sure the weekly limit is more of a concern.

If you're just using it during working hours - i.e. you only use two 5-hour windows per day - then you probably, like me, struggle to hit the weekly limit even if you do max out some 5-hour windows.
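A quick back-of-the-envelope version of that comparison, as a sketch. It assumes fixed back-to-back 5-hour windows and a five-day work week; the actual reset rules may differ, and the function name is made up for illustration.

```python
# Assumptions: fixed 5-hour windows that can run back to back, and a
# five-day work week. The provider's real reset rules may differ.
WINDOW_HOURS = 5
HOURS_PER_WEEK = 24 * 7


def windows_per_week(windows_per_day: float, days_per_week: int) -> float:
    """Number of 5-hour windows started in one week."""
    return windows_per_day * days_per_week


max_possible = HOURS_PER_WEEK / WINDOW_HOURS   # using it around the clock
working_hours = windows_per_week(2, 5)         # two windows a day, five days

print(f"around the clock: {max_possible:.1f} windows/week")
print(f"working hours:    {working_hours:.0f} windows/week")
print(f"share of maximum: {working_hours / max_possible:.0%}")
```

On these assumptions a working-hours user opens about 10 windows a week against a theoretical 33.6, so maxing out a few of them still leaves most of the weekly allowance untouched.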

replygirl · 2026-05-06T17:45:28 1778089528

last week with claude i saturated a team premium seat at day 6 of its cycle, and a max 20x seat at day 4, plus ~$150 extra usage spend, with a 60hr work week where i am not even primarily an IC, as well as a codex 20x plan at day 3 with a personal project

noisy_boy · 2026-05-07T01:15:46 1778116546

Hit weekly limits all the time with Pro. Too cheap to go for Max.

mirzap · 2026-05-06T17:41:06 1778089266

I'm on the $200 Max plan.

extr · 2026-05-06T17:31:23 1778088683

What does your usage look like day to day? Are you using a low-level amount all day long? I'm with the others here: I've never hit the weekly limit, only the hourly, and I consider myself a heavy user.

mirzap · 2026-05-06T18:04:42 1778090682

I dedicate a significant amount of time to defining the precise actions that agents should perform (PRD/ADR). I break down the feature sets into milestones and slices (tasks). These tasks are small, well-defined, and scoped. I have a prompt template that the “architect” agent prepares whenever I want to initiate a new feature. This ensures that the prompt structure remains consistent and standardized over time. The generated prompt is then pasted to the “orchestrator,” which performs context discovery (using Repoprompt), finalizes the plan, and then launches subagents to do the work.

Based on the size and complexity of the task, as well as any inter-task dependencies, the orchestrator deploys one or more subagents (sometimes 5 or 6 subagents) to work on these mini tasks. Once all tasks are completed, the orchestrator initiates verification and launches a review workflow. This workflow uses the original prompt, acceptance criteria, repository internal guidelines, and relevant skills to conduct a thorough review of the agents’ work.

Typically, there are one or two review iterations, during which the review agent identifies any issues. Sometimes, I may also notice issues and have to "steer" the orchestrator. The time required for a slice to complete ranges from 30 minutes to 4 or 5 hours, depending on its size, complexity, and the number of subtasks it contains.

Only if I run about 3 such orchestrations in parallel can I reach the hourly limit.
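For anyone trying to picture the flow described above, here is a rough sketch of the orchestrate-then-review loop. Everything in it is hypothetical: `Task`, `run_subagent`, `review`, and `orchestrate` are illustrative names with stubbed bodies, not Repoprompt's or Claude Code's API, and the task names are invented.

```python
# Hypothetical sketch: all names are illustrative, not a real tool's API.
# run_subagent and review are stubs standing in for actual agent calls.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    depends_on: list[str] = field(default_factory=list)


def run_subagent(task: Task) -> str:
    """Stub: a real version would launch a subagent on this task."""
    return f"{task.name}: done"


def review(results: dict[str, str]) -> list[str]:
    """Stub: a real version would check results against acceptance criteria."""
    return []  # names of tasks that need rework


def orchestrate(tasks: list[Task], max_subagents: int = 6,
                max_reviews: int = 2) -> dict[str, str]:
    results: dict[str, str] = {}
    by_name = {t.name: t for t in tasks}
    pending = dict(by_name)
    with ThreadPoolExecutor(max_workers=max_subagents) as pool:
        while pending:
            # Tasks whose dependencies have all finished can run in parallel.
            ready = [t for t in pending.values()
                     if all(d in results for d in t.depends_on)]
            if not ready:
                raise ValueError("dependency cycle or unknown dependency")
            for task, out in zip(ready, pool.map(run_subagent, ready)):
                results[task.name] = out
                del pending[task.name]
        # Review loop: rerun whatever the reviewer flags, up to max_reviews times.
        for _ in range(max_reviews):
            flagged = review(results)
            if not flagged:
                break
            redo = [by_name[n] for n in flagged]
            for task, out in zip(redo, pool.map(run_subagent, redo)):
                results[task.name] = out
    return results


example_slice = [
    Task("schema"),
    Task("api", ["schema"]),
    Task("ui", ["schema"]),
    Task("tests", ["api", "ui"]),
]
results = orchestrate(example_slice)
print(results)
```

The sketch runs independent tasks in waves and caps the number of concurrent subagents, which is one simple way to respect inter-task dependencies; a real setup would replace the stubs with agent invocations and real review criteria.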

calgoo · 2026-05-06T18:15:21 1778091321

I have found that it uses a lot more tokens if I give it a very detailed todo and loop over every task one by one. I now keep it to phases with detailed tasks underneath and use /loop over the phases, and it uses a lot less. I also manage the context window and tend to clear it often to keep it under around 200k (or less, depending on project size).

mirzap · 2026-05-06T18:37:57 1778092677

Yeah, I do that too. Essentially, the system I described begins working on a task that is small enough and clearly defined. Each “slice” in a milestone usually has 5-10 subtasks (for instance, Slice E1 has P1...P6 subtasks). The orchestrator then receives the prompt to implement E1-P1.

jLaForest · 2026-05-06T23:29:02 1778110142

It sounds like you are describing oh my open agent

mirzap · 2026-05-07T06:20:06 1778134806

I use Repoprompt's workflows for this. They are pretty good.

culopatin · 2026-05-06T20:30:42 1778099442

That’s because the week ends before you can use them because you’re waiting for your hourly resets. Now the week essentially got longer with the same limit

vidarh · 2026-05-06T18:14:55 1778091295

I hit my weekly limit in 3 days this week. Irregularly do in 5. With the top MAX sub.

scottyah · 2026-05-07T00:19:36 1778113176

Wow, then you are most likely doing something very wrong.

vidarh · 2026-05-07T07:35:51 1778139351

No, I'm just using it a lot. It's productive enough that I've found it worthwhile tacking on subs for GLM 5.1 and Kimi as well (GLM is fantastic, Kimi is good when it works but temperamental)

headcanon · 2026-05-06T17:26:04 1778088364

same, I struggle to use more than half of my weekly, even if I max out my 5-hour windows regularly during the day.

swalsh · 2026-04-29T13:08:01 1777468081

It amazes me how often people try to build AI systems relying on nothing more than the model's knowledge. I suspect a great deal of the "failed" AI experiments we keep reading about are people just not having any idea how to use AI for what it's good at.

swalsh · 2026-04-22T22:48:32 1776898112

Try running with Open Code. It works quite well.

docheinestages · 2026-04-23T09:25:44 1776936344

I had an equally painful experience with Open Code. I don't think the harness is the issue. It's the need for a large context window and slow inference.