Some time ago I interviewed with an AI governance think tank, and one of the topics that came up for ensuring Western-allied AI supremacy was trade security of compute resources. I didn’t want to shoot down their approach to their faces, because they had a lot of blog posts on their site, with OpenAI co-authors, about the efficacy of it. But I was never sold on the idea that withholding compute resources would stop someone from researching better AI; I thought it would just push them to make more efficient AI. Lo and behold.
The article starts off on a pretty interesting premise: "I predicted there'd be no major advancements, but also, here are 4 major advancements in the last 4 weeks, many of which are predicated on another massive advancement that came out last year" (a SOTA model leveraging test-time compute)
We're also not seeing any price wars yet: every drop in price has been predicated on models getting faster, and while we don't know the size of the models behind the scenes, we can pretty reliably infer that they haven't been dropping their margins over time (at least, not at scale)
Anthropic even raised prices on their smallest model vs 2023 right as strong competition emerged.
Also seems strange to make a big deal about there being no moats... then imply that Meta deserves to be called up by Congress for releasing Llama's weights.
The only losers in this AI race are probably humanity and overvalued stock markets. It's kind of hard to price in years of future value and then see some unknown Chinese AIs catch the big boys with their pants down.
I really can't overemphasize the "humanity losing" point, though.
I don't feel like humanity is losing when my 7-year-old GPU becomes an effective research assistant, secretary, and busywork coder for the cost of downloading 10 GB of tensors.
The brutal thing about humanity is that most of the time we don't spend much time at all thinking about how things today might impact things tomorrow. We also tend to think more about how things today can help us, without giving much thought to how they might negatively impact others.
Basically, we're selfish and shortsighted, so it's easy at times to have rose-coloured glasses.
Mmmhmm, I think about long-term effects a lot. Any particular problems with these AI models you're thinking of? Or is it just the garden-variety lesswrongian AGI hard-takeoff type beat?
I assume you’re some sort of seriously high-level research scientist, given the high-level writing and use of the word lesswrongian. Otherwise, you have no way of understanding what this “research assistant“ is generating. Furthermore, if high-level scientists such as yourself are no longer hiring research assistants, what humans will replace you? There will be few low-level scientists to move up to your level of expertise and provide validation of what’s generated by the AIs. Is that not a potential problem?
I think that, just like with robots working in factories, AIs doing info work is not an issue with the AI. It's the economic system, which is bad at capturing externalities and which we're hell-bent on keeping. On a base level I think you're right that it will cause problems, but it's not the AI itself that's the issue.
AI is reliant upon training data that, until recent years, was generated by humans. If, going forward, humans are no longer part of the training data and humans are no longer capable of verifying the results and output of AI, then the quality and value of AI output will degrade and won't be sustainable, especially where recent research is concerned.
The end result, given that scenario, is that humans become less capable and the AI becomes less capable. So where does the capability come from then?
> Two years ago, they were on top of the world, having just introduced ChatGPT, and struck a big deal with Microsoft. Nobody else had a model close to GPT-4 level; media coverage of OpenAI was endless; customer adoption was swift. They could charge more or less what they wanted, with the whole world curious and no other provider. Sam Altman was almost universally adored. People imagined nearly infinite revenue and enormous profits.
Part of me likes to think of this as cosmic karma. I know it's been harped on a lot lately, but the irony is too rich, especially with how hostile they've been to "Open AI."
Isn’t DeepSeek’s advantage that they didn’t actually start from scratch, and that the embeddings & training were already supplied? Saying DeepSeek is comparable in performance to OpenAI is like saying Kirkland brand is comparable to {insert non-white-labeled good here}: they’re created off the same inputs. To say that DeepSeek is a threat to AI supremacy is hyperbolic. As long as OpenAI keeps innovating at the rate it’s been innovating, their value is undeniable. Sure, DeepSeek may tick after OpenAI’s tock, but the premium is in that tock.
This take discounts the implications of the fact that the efficiency and cost of DeepSeek's product are orders of magnitude better, plus the project is open source. Those facts fundamentally shift the business conditions for LLMs.
Since we’re (as in the human race) going to go ahead and push for AGI with reckless abandon, the best thing that can happen is for it to be available to everyone, no moats. I can’t help but feel a bit of schadenfreude given that this was supposed to be the very principle that OpenAI was founded on and that they abandoned at the first whiff of money. Now, it looks like it’s going to happen whether they want it to or not.
If it only cost $5.5 million for DeepSeek to create R1, what’s stopping OpenAI from building on their open research to create something even better for $500 million?
Everyone keeps talking about diminishing returns in training and plateaus in reasoning ability, but it seems to me R1 demonstrates exactly the opposite with a 100x reduction in training costs: there is a long way to go and loads of low-hanging fruit.
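(Back-of-envelope sketch, using only the figures claimed in these comments; neither the $5.5 million nor the 100x is verified here.)

    r1_cost = 5.5e6          # claimed R1 training cost, USD (from the comment above)
    efficiency_gain = 100    # claimed reduction in training cost

    # Spending the old budget level with the new recipe buys ~100x the training compute:
    old_style_budget = r1_cost * efficiency_gain
    print(f"${old_style_budget:,.0f}")   # $550,000,000 -- roughly the $500 million floated above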
In the grand scheme of things, I still think it doesn’t matter. The “American AI labs are years ahead” illusion is being challenged right now. If it’s even 80% as good and 30x cheaper, you’ll have a very simple choice when you’re making a product. And unlike EVs, this isn’t something you can easily block with import controls.
Question: Suppose you wanted to train an LLM using, say, 100 GPUs, but for some reason you could not obtain the GPUs. You could, however, obtain CPUs. Each CPU lacks the massive parallelism of a GPU, so it might have a throughput of, say, 1/100th the TFLOPS of the GPU.
Could you use 10000 CPUs and get a similar level of performance that 100 GPUs would have given?
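For a rough sense of why matching raw FLOPS on paper isn't the whole answer, here's a back-of-envelope sketch; every number in it (per-GPU TFLOPS, gradient size, link bandwidth) is a made-up illustration, not a measurement:

    # Naive throughput parity: 10,000 CPUs at 1/100th of a GPU's throughput
    # equals 100 GPUs on paper.
    gpu_tflops = 100.0               # assumed per-GPU throughput
    cpu_tflops = gpu_tflops / 100    # the 1/100th assumption from the question
    print(100 * gpu_tflops, 10_000 * cpu_tflops)   # both sides total 10,000 TFLOPS

    # The catch is synchronization: every optimizer step needs a gradient
    # all-reduce, and a commodity CPU cluster has far less interconnect
    # bandwidth (and 100x more nodes to coordinate) than a GPU pod.
    grad_bits = 10e9 * 8     # hypothetical 10 GB of gradients per step
    gpu_link_bps = 400e9     # assumed GPU-pod interconnect, bits/s (400 Gb/s class)
    cpu_link_bps = 10e9      # assumed commodity 10 Gb Ethernet, bits/s

    # A ring all-reduce moves roughly 2x the gradient volume per worker per step.
    print(2 * grad_bits / gpu_link_bps)   # ~0.4 s per sync on the GPU pod
    print(2 * grad_bits / cpu_link_bps)   # ~16 s per sync on the CPU cluster

So the arithmetic-only view says yes, but each step spends far longer waiting on the network, and that's before per-socket memory bandwidth and the practical pain of coordinating 10,000 boxes enter the picture.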
The US may ban the export of GPUs to China, but a GPU is not enough to build a training rig. You need memory and hundreds of smaller components. China has the ability to produce every component that goes into a computer. Can the US do the same?
Come on now. That's a super clickbait title. The article actually says the race is at a dead heat and LLM training for the same competency is getting cheaper, not that it's "over".
After reading the article I understand "over" to mean "there is no moat so there is no winner" rather than "it's over because entity X won and its competitors can't catch up"
I'd consider "over" to mean someone won, or that it's no longer relevant. After reading the article, it's neither of those. The race is very much on, but now it's clear that it will be competitive.
Nope. That's not going to make it over, obviously. The race is still on. Just because Marcus called a tie at the moment and thinks it will be difficult to create a lead doesn't mean they won't be racing every step of the way.
What world do you live in where races end just because the score is tied?
I agree, I don't think it's over just because another major power threw a bunch of similarly powered models in the ring.
I feel it doesn't really matter which country or company has which model open at the moment. Whichever company, open-source-based or closed, hits AGI or something much closer to it first wins, period. We're not there yet; we've got lovely conversationalists summarizing all our web-crawled answers back to us. It doesn't seem like throwing chips and compute at the problem is the answer, and stifling competition via hardware is just bailing water. I do think he's at least kinda right that it's going to take another paradigm shift, from LLMs, transformers, and other current models to some other idea that eventually does it. I don't expect it out of GPT-5.
The race being over has nothing to do with whether or not there has been a tie; it depends on whether or not people want to keep running. If AI models become easier to train, there will be less investor appetite for big players like OpenAI. Example: https://techcrunch.com/2025/01/27/nvidia-drops-600bn-off-its...
> The article actually says the race is at a dead heat and LLM training for the same competency is getting cheaper, not that it's "over".
Odd, that's exactly what I interpreted the title to mean. "Supremacy" means one side has a total and insurmountable lead and control. That's how lots of China hawks in the West talk about it, arguing that they must do everything to stop China from building their own ChatGPT. The article is a pretty good summary of why no one side will achieve supremacy. And if you have been following Gary Marcus, he's been arguing this for a while now.