Anthropic wants to enforce them via the language of the contracts and take a hands-off approach. OpenAI pairs its contract with humans in the room (FDEs, forward-deployed engineers) who can pull the plug.
Sites like simonwillison.net/2025/jul/ and channels like https://www.youtube.com/@aiexplained-official also cover new model releases pretty quickly, with some "out of the box" thinking/reasoning evaluations.
For my own usage, I can only really tell by putting the new model to work on the tasks I actually use models for.
My personal benchmark andrew.ginns.uk/merbench has full code and data on GitHub if you want a starting point!
A complete desktop computer with the M2 Ultra w/64GB of RAM and 1TB of SSD is $4k.
The 7995WX processor alone is $10k, the motherboard is one grand, and the RAM is another $300. So you're up to $11,300, and you still don't have a PSU, case, SSD, GPU... or a heatsink that can handle the 350W TDP of the Threadripper; you're probably looking at a very large AIO radiator to keep it cool enough to hit its quoted performance. So you're probably up past $12k, 3x the price of the Studio... more like $14k if you want a GPU of similar capability to the M2 Ultra's.
Just the usual "aPPle cOMpuTeRs aRE EXpeNsIVE!" nonsense.
So from a CPU perspective you get 7x the CPU throughput for 3x to 4x the price, plus upgradeable RAM that is massively cheaper. The M2 uses the GPU for LLMs, though, and there it sits in a weird spot: 64GB of (slower) RAM plus midrange GPU performance is not a combination that exists in the PC space. The closest thing would probably be a (faster) 48GB Quadro RTX, which is in the $5,000 ballpark. For other use cases where VRAM is not such a limiting factor, the comparably priced PC will blow the Mac out of the water, especially on GPU performance. The only reason we don't have cheap 96GB GDDR GPUs is that they would cannibalize NVIDIA's and AMD's high-margin segments. If the same pressure applied to Apple, they would act the same way.
I didn't see benchmarks suggesting the 7950X is faster than the M2 Ultra; I only saw performance numbers for the 7995WX, which has 6x the cores and 6x the cache.
Either way, I think these comparisons are moot, since an M2 Ultra comes with two M2 Max GPUs, an NPU, and up to 192GB of unified memory at 800GB/s. In other words, you wouldn't want to run your LLM on the CPU if you have an M2 Ultra.
The point of OP is to increase LLM performance when you don't have a capable GPU.
Indeed they do; however, companies like Meta (altruistically or not) are preventing OpenAI from building 'moats' by releasing models and architecture details very publicly.
I think it's a safe bet to say it's not altruistic. And, if Meta were to wrestle away OpenAI's moat, they'd eagerly create their own, given the opportunity.
> And, if Meta were to wrestle away OpenAI's moat, they'd eagerly create their own
Meta is already capable of monetizing content generated by the models: these models complement their business, and they could not care less which model you use to earn them advertising dollars, as long as you keep the (preferably high-quality) content coming.
> And, if Meta were to wrestle away OpenAI's moat, they'd eagerly create their own, given the opportunity.
At which point the new underdogs would have an interest in doing to them what they're doing to OpenAI.
That assumes LLM progress continues at a rapid pace for an extended period. It's not implausible that models will reach a level past which non-trivial progress is hard, and if there's an open-source model at that level, there isn't going to be a moat.
Meta doesn't surface its models to users in the obvious ways that MS and Google do; all the model magic happens behind the scenes. Meta can keep releasing second-best models to undercut the others and stop them from pulling too far ahead, and the open-source community will take it from there. Dall-E is dead.
And when the open-source community extends their models, those benefits accrue back to Meta. That's already how they became such a huge player in machine learning (by open-sourcing amazing work).
True, but what you can do is SSH into the device and install a custom launcher with apps that read standard EPUBs, play chess, or expose the Linux terminal on-device.
Not great for basic users, but I've gotten significantly more use out of it with some advanced setup.
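For anyone curious what that setup looks like in practice, here's a minimal sketch assuming a reMarkable-style tablet with the community Toltec package repository. The comment doesn't name the device, so the IP address, repo URL, and package names below are all assumptions, not details from the comment.

```shell
# Hypothetical sketch for a reMarkable-style tablet; the IP, repo URL,
# and package names are assumptions, not details from the comment above.

# The tablet exposes an SSH server over USB networking:
ssh root@10.11.99.1

# On the device, bootstrap the community Toltec package repository
# (verify the script against the checksum published on the project site):
wget http://toltec-dev.org/bootstrap -O bootstrap
sh bootstrap

# Then install apps through its opkg package manager, e.g. an EPUB
# reader and a custom launcher:
opkg install koreader
opkg install oxide
```

The exact steps vary by device and firmware version, so treat this as the general shape of the workflow rather than a recipe.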