I think this might be plausible in the future, but it needs a lot more tooling. ...

josephg · 2025-12-04T03:07:17 1764817637

Even the exact same model isn't enough. There are several sources of nondeterminism in LLMs. These would all need to be squashed or seeded, which as far as I know isn't a feature that OpenAI, Anthropic, etc. provide.
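To make that concrete, here's a toy sketch of two of the usual suspects: random sampling of the next token, and floating-point sums whose result depends on the order of addition (which can vary between runs on parallel hardware). The token list, the probabilities and the `sample_tokens` helper are all made up for illustration; this isn't how any particular provider implements things.

```python
import random

# Source 1: sampling. A seeded generator makes the draw repeatable.
TOKENS = ["yes", "no", "maybe"]   # hypothetical vocabulary
PROBS = [0.5, 0.3, 0.2]           # made-up next-token distribution

def sample_tokens(seed, n=10):
    """Draw n tokens using a private generator seeded with `seed`."""
    rng = random.Random(seed)
    return rng.choices(TOKENS, weights=PROBS, k=n)

# Source 2: floating-point addition is not associative, so the same
# numbers summed in a different order can give a different result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

same_seed_matches = sample_tokens(42) == sample_tokens(42)
order_changes_sum = left_to_right != right_to_left

print(same_seed_matches, order_changes_sum)
```

Seeding only fixes the first one. The second needs a fixed order of operations, which is a different kind of fix entirely.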

BurningFrog · 2025-12-04T15:29:55 1764862195

OK, then the current models aren't as good as I thought/hoped.

I guess one thing it means is that we still need extensive test suites. I suppose an LLM can write those, too.