They have, but even with the whole CCP backing you, you can't just catch up in the chip war overnight. It's going to take time to get their memory and compute industries where they need to be. Meanwhile, barring an invasion of Taiwan, the US will have Rubin class models, and then whatever the next tier is, within 3 years.
'Barring an invasion of Taiwan' might actually be quite a lot to bar in mid-2026.
My hot take is that it's now or never for Xi, and the specific things he is reported to have said to the US president at their last meeting lead me to think that he at least knows this is his big chance; whether or not he takes it is the part of the forecast that is opaque to me.
Anthropic is really speedrunning their evil arc as fast as possible. Can't use them for basic LLM research, cybersecurity, or beyond-surface-level discussions of biology and virology, but Anthropic is allowed to sell Claude to the Trump administration to kidnap Maduro and to bomb Iran. And don't get me started on that $100M autonomous killer drone swarm contract that they applied for and rationalized as non-autonomous...
> Can't use them for basic LLM research, cybersecurity, or beyond-surface-level discussions of biology and virology
Your priorities are not everyone else's priorities. The people concerned about AI extinction risk list those as three of the things they most want AI not to do. Those are the people whose culture Anthropic descends from, and by their measure, those exclusions make this the least evil path.
More like Anthropic’s priorities are not everyone else’s priorities. They have a consistent culture of wanting absolute control and dictating what is good and bad, while taking any opportunity to trash and crush potential competitors (open source models happen to be mostly developed in China). All this in the name of safety and anti-authoritarianism.
The day self-hosted models catch up with Anthropic’s capabilities is when they will fully lose their shit. That day can’t come soon enough.
Extinction risk. From population genetics... Does Anthropic even employ biologists? It's magical thinking about a field that is poorly understood by their community.
Yeah but they might still have an unreleased bigger pretrain than 5.5. (but maybe not). still 5.5 is smarter than opus 4.8 IME, so you're only losing the mythos tier (fable). and all the cool fun stuff i'd want to use fable for is blocked (can't have it do even defensive cybersecurity work [in theory you can but the classifiers fire like crazy], can't discuss stuff like the furin cleavage site of sars-cov-2, etc)
Anthropic is losing a ton of goodwill by not being more honest about their constraints. They've been buckling under load for months, and instead of doing the most honest thing (keep weekly usage limits the same, and give the 5 hour usage limits surge pricing, where the usage-cost of X tokens is scaled based on dynamic load), they're doing a lot of hacky things to try to get a similar effect. I suspect they feel the optics of being honest would be too bad, so instead it's a slow bleed where they piss off users one by one.
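To make the surge pricing idea concrete, here's a rough sketch of what "usage-cost scaled by dynamic load" could look like. Everything in it is made up for illustration: the function names, the 0.7 load threshold, the 3x cap, and the linear ramp are all assumptions, not anything Anthropic has described.

```python
def surge_multiplier(load: float, threshold: float = 0.7, max_multiplier: float = 3.0) -> float:
    """Hypothetical multiplier for a given cluster load in [0, 1].

    Below `threshold` usage is charged at face value (1.0x). Above it,
    the multiplier ramps linearly up to `max_multiplier` at full load.
    """
    load = min(max(load, 0.0), 1.0)  # clamp out-of-range readings
    if load <= threshold:
        return 1.0
    frac = (load - threshold) / (1.0 - threshold)
    return 1.0 + frac * (max_multiplier - 1.0)


def usage_cost(tokens: int, load: float) -> float:
    """Amount deducted from the 5 hour budget for `tokens` at the current load."""
    return tokens * surge_multiplier(load)
```

The weekly limit would stay a fixed number; only the amount deducted from the 5 hour window changes with load, so the same request costs more of your window at peak times and face value off-peak.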
The problem is, if you are transparent about your constraints, then users who are using your subscription in bad faith and against the terms know exactly how to maximize usage.
It's the same thing when people say that Gmail ought to publish the rules they use for blacklisting senders. If they did, then there would be a lot more senders abusing email.
Whenever you are defining rules internally for catching bad actors, you cannot make those rules public. It defeats the entire purpose.
So maybe Anthropic is losing goodwill, but it's better than the alternatives.
yeah exactly the opacity is doing more damage than the limits themselves. anyone who's worked with AI knows there's a lot of limits you need to contend with. secret behavior changes are another level of badness.
I'm normally suspicious but honestly they've been so massively supply-constrained that I don't think it really benefits them much. They're not worried about getting enough demand for the new models; they're worrying about keeping up with it.
Granted, there's a small counterargument for mythos which is that it's probably going to be API-only not subscription
Undercover mode seems like a way to make contributions to OSS when they detect issues, without accidentally leaking that it was claude-mythos-gigabrain-100000B that figured out the issue
somewhat surprisingly, it's actually sycophantic in both directions. i've been running homegrown evals of claude, gpt, gemini, and grok, and grok is the most likely to agree with the prompter's premise, and to hallucinate facts in support of an agenda. so it's actually deeper than just pattern-matching to elon's opinions (which it also tends to do).
BTW: Claude does the best on these evals, by far. The evals are geared towards seeing how much of an independent ground truth the models have as opposed to human social consensus, and then additionally the sycophancy stuff I already mentioned.