I think we've moved away from secure-perimeter thinking and towards defense in depth - if that list of passwords helps you get somewhere other than the vault, removing the post-it improves security. Vaults get infiltrated all the time - and often in partial ways, like being able to see into the vault but not reach in.
Defence in depth matters, but an analysis here shows that the same mechanism used to breach the outer layers (getting administrative access) can be used to breach the next layer (more thoroughly prodding Edge or Chrome to give up passwords).
Why are two concurrent sessions updating the same memory key with different values? IMO it probably points to a fundamental flaw in how memory is being thought about and built.
Author here. Because of parallelism and non-determinism.
This problem is quite common and not limited to memories. For instance, Claude Code will block write attempts and steer the agent to perform a read first (because the file might have been modified in the meantime by the user or another agent).
Same principle here: rather than trying to deterministically “merge” concurrent writes, you fail the last write and let the agent read again and retry the write.
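A minimal sketch of that fail-the-last-write pattern, assuming a hypothetical versioned key-value store (the MemoryStore class, ConflictError, and the version scheme are illustrative, not the actual implementation):

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised when the key changed since it was last read."""


@dataclass
class Entry:
    value: str
    version: int


class MemoryStore:
    def __init__(self):
        self._entries: dict[str, Entry] = {}

    def read(self, key: str) -> Entry:
        return self._entries.get(key, Entry(value="", version=0))

    def write(self, key: str, value: str, expected_version: int) -> Entry:
        current = self.read(key)
        if current.version != expected_version:
            # Another session wrote in the meantime: reject instead of merging.
            raise ConflictError(
                f"{key} is at v{current.version}, expected v{expected_version}"
            )
        updated = Entry(value=value, version=expected_version + 1)
        self._entries[key] = updated
        return updated


store = MemoryStore()
entry = store.read("user_prefs")                          # agent A reads v0
store.write("user_prefs", "dark mode", entry.version)     # agent B writes first, key is now v1
try:
    store.write("user_prefs", "light mode", entry.version)  # agent A's stale write fails
except ConflictError:
    fresh = store.read("user_prefs")                      # agent A re-reads...
    store.write("user_prefs", "light mode", fresh.version)  # ...and retries against the new version
```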
If you look at the actual cost of your Claude Code conversations, you'll see that it is overwhelmingly dominated by (cached) input tokens. Because of how we construct persistent conversations, every cached input token incurs cost on each API request, meaning that component of cost scales with O(request count). If you graph the cost curve of a Claude Code session, it's very obvious that this scaling factor overwhelms the cache discount.
Here is a blog post that shows some data - https://blog.exe.dev/expensively-quadratic. And I can confirm this is true for Claude Code - I set up a MITM capture for all Claude Code requests and graphed it.
So increasing the request count over the same reused prefix (which is what higher compaction thresholds do) really does lead to substantially higher API costs.
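To make the scaling concrete, here's a back-of-the-envelope model; every price and per-turn token count below is a made-up placeholder, not Anthropic's actual rates. The point is just the shape: the prefix grows each turn and is re-billed in full on every request, so cumulative input cost grows roughly quadratically in request count while output cost grows linearly.

```python
# Illustrative constants only - swap in real per-token prices and turn sizes.
CACHED_INPUT_PRICE = 1.50 / 1_000_000   # $/token for cache-hit input (placeholder)
OUTPUT_PRICE = 15.00 / 1_000_000        # $/token for output (placeholder)
TOKENS_ADDED_PER_TURN = 3_000           # new context appended each request
OUTPUT_TOKENS_PER_TURN = 800


def session_cost(num_requests: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) for a conversation of num_requests turns."""
    input_cost = output_cost = 0.0
    context = 0
    for _ in range(num_requests):
        context += TOKENS_ADDED_PER_TURN              # the prefix keeps growing...
        input_cost += context * CACHED_INPUT_PRICE    # ...and is re-billed in full each request
        output_cost += OUTPUT_TOKENS_PER_TURN * OUTPUT_PRICE
    return input_cost, output_cost


for n in (10, 50, 100):
    inp, out = session_cost(n)
    print(f"{n:>3} requests: input ${inp:.2f} (grows ~n^2), output ${out:.2f} (grows ~n)")
```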
I don't agree. I avoided Grok because of Musk for a long time, but having used it more, I think it is one of the best models around and grok.com is an extremely good chat app. My evaluation was based on trying it before gpt-5.5 and obviously before grok 4.3, but it was, for me, the 2nd best model/chat app after Claude. It's much less edgelordy than you might think based on the news.
All my usage of Grok for technical topics shows it regularly deeply misunderstanding things and just parroting back my question in fancy language. It’s the only frontier model I get this impression of. That makes it super annoying when it tries to market itself as good at engineering tasks when it seems (to me) to be much worse at them.
Interesting. I have not had this experience. I would like to learn more. Can you point me to any examples or domains where I might be able to replicate this?
I was asking questions about compiler techniques. Then when I got annoyed I started asking about experimental design. Both were very frustrating experiences once I started realizing how limited its responses were.
Though yeah the edgelord-y style faded after I criticized it a couple times.
What do you mean? Costs spiked with the introduction of the 1M context window I believe due to larger average cached input tokens, which dominate cost.
Nah, there are apparently a few caching bugs, one involving --resume, plus some noisy tool use. I have a little app that monitors and resets the context window at 70% usage (based on 200k tokens), and I'm about to run out of weekly allowance after just a couple of days. Never happened before.
I used them for repeated problems or workflows I encounter when running with the default. If I find myself needing to repeat myself about a certain thing a lot, I put it into claude.md. When that gets too big or I want to have detailed token-heavy instructions that are only occasionally needed, I create a skill.
I also import skills or groups of skills like Superpowers (https://github.com/obra/superpowers) when I want to try out someone else's approach to claude code for a while.
I don't really care if other people want to be on or off the AI train (no hate to the gp poster), but if you are on the train and you read the above comment, it's hard not to think that this person might be holding it wrong.
Using sonnet 4 or even just not knowing which model they are using is a sign of someone not really taking this tech all that seriously. More or less anyone who is seriously trying to adopt this technology knows they are using Opus 4.6 and probably even knows when they stopped using Opus 4. Also, the idea that you wouldn't review the code it generated is, perhaps not uncommon, but I think a minority opinion among people who are using the tools effectively. Also a rename falls squarely in the realm of operations that will reliably work in my experience.
This is why these conversations are so fruitless online - someone describes their experience with an anecdote that is (IMO) a fairly inaccurate representation of what the technology can do today. If this is their experience, I think it's very possible they are holding it wrong.
Again, I don't mean any hate towards the original poster, everyone can have their own approach to AI.
Yeah, I'm definitely guilty of not being motivated to use these tools. I find them annoying and boring. But my company's screaming that we should be using them, so I have been trying to find ways to integrate it into my work. As I mentioned, it's mostly not been going very well. I'm just using the tool the company put in front of me and told me to use, I don't know or really care what it is.
How is that the point of AI? The point is that it can chug through things that would take humans hours in a matter of seconds. You still have to work with it. But it reduces huge tasks into very small ones.
> He discontinued the blood exchange after data showed “no benefits.” A suspicious person might note that a vampire would say exactly this after the media got too interested.
I don't think it's the media (clearly the younger generations are media friendly), it's probably pressure from the older vamps.
I felt the same way and came to the comments to see if anyone else smelled it. It's either AI-assisted writing or people are genuinely starting to write the way ChatGPT sounds.
First, the structure of this satirical post is headings and bullet points. Fine, whatever, a lot of people write this way.
Then there's the exhausting litany of super short sentence fragments.
> He published this. Openly. In a book. As a priest.
This is how airport novels and LinkedIn "thought leadership" clickbait are written, so ok, fine, I'll let it pass.
Then I started to notice a lot of: "It's not X. It's Y" or "this isn't just A. It's B."
> Feeding isn’t nutrition. It’s dialysis.
Before LLMs, people weren't writing this way. At the risk of sounding like a curmudgeon: it's insulting to read, like the reader is a 5-year-old.
When several of these smells pile up, I close the tab immediately and try to forget about it. This one was so egregious that I had to read the whole thing and then come to the comments to rant a bit.
The cost of replacement-level software drops a lot with agentic coding. And maintenance tasks are similarly much smaller time sinks. When you combine that with the long-standing benefits of in-house software (customizable to your exact problem, tweakable, often cleaner code because the feature set can be a lot smaller), I think a lot of previously obvious dependencies become viable to write in-house.
It's going to vary a lot by the dependency and scope - obviously, owning your own React is a lot different than owning your own leftpad, but to me it feels like there's no way that agentic coding doesn't shift the calculus somewhat. Particularly when agentic coding makes a lot of nice-to-have mini-features trivial to add, so the developer experience gap between a maintained library and a homegrown solution is smaller than it used to be.
The API price is 6x that of normal Opus, so look forward to a new $1200/mo subscription that gives you the same amount of usage if you need the extra speed.
The writing has been on the wall since day 1. They wouldn't be marketing a subscription being sold at a loss as hard as they are if the intention wasn't to lock you in and then increase the price later.
What I expect to happen is that they'll slowly decrease the usage limits on the existing subscriptions over time, and introduce new, more expensive subscription tiers with more usage. There's a reason AI subscriptions generally don't tell you exactly what the limits are: they're intended to be "flexible" to allow for this.
It's explicitly called out as excluded in the blue info bubble they have there.
> Fast mode usage is billed directly to extra usage, even if you have remaining usage on your plan. This means fast mode tokens do not count against your plan’s included usage and are charged at the fast mode rate from the first token.