Hacker News

Is this in academia?

Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia (and of how society compensates academics, monetarily and otherwise) as it is of the ability of Wall Street and Silicon Valley to treat former scientists better.



  > Is this in academia?
Yes and no. Industry AI research is currently tightly coupled with academic research. Most of the big papers you see are either directly from the big labs or in partnership with them. Not even universities like Stanford have sufficient compute to train GPT from scratch (maybe enough for DeepSeek). Here's Fei-Fei Li discussing the issue[0]. Stanford has something like 300 GPUs[1], and those have to be split across labs.

The thing is that there's always a pipeline. Academia does most of the low-level research, say TRL[2] 1-4, partnerships happen between 4-6, and industry takes over the rest (with some wiggle room on these numbers). Much of ML academic research right now is tuning large models made by big labs. This isn't low TRL. Additionally, a lot of research is rejected for not outperforming technologies that are already at TRL 5-7; see Mamba for a recent example. You could also point to KANs, which are probably around TRL 3.

  > Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia
Which is where I, again, both agree and disagree. It is not _just_ a symptom of the dysfunction of academia, but _also_ of industry. The reason I pointed out the grumpy researchers is that a lot of these people had been discussing techniques that DeepSeek used, long before they were used. DeepSeek looks like what happens when you set these people free. Which is my argument: we should do that. Scale Maximalists (also called "Bitter Lesson Maximalists", but I dislike the term) have been dominating ML research, and DeepSeek shows that scale isn't enough. So hopefully this will give the mathy people more weight. But then again, isn't the common way monopolies fall that they become too arrogant and incestuous?

So mostly, I agree; I'm just pointing out that there is a bit more subtlety, and I think we need to recognize that to make progress. There are a lot of physicists and mathy people who like ML and have been doing research in the area, but they are often pushed out because of the thinking I listed. Part of the success of the quant industry is recognizing that the strong math and modeling skills of physicists generalize pretty well: you go after people who understand that an equation describing a spring isn't only useful for springs, but for anything that oscillates. Understanding math at that level is very powerful, and boy are there a lot of people who want the opportunity to demonstrate this in ML; they just never get similar GPU access.
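To make the spring example concrete (a standard textbook sketch, not something stated in the thread): the same second-order ODE that governs a mass on a spring also governs, say, an LC circuit, once the variables are relabeled.

```latex
% Newton's second law for a mass m on a spring of stiffness k:
\[
  m\ddot{x} + kx = 0
  \qquad\Longrightarrow\qquad
  x(t) = A\cos(\omega t + \varphi),\quad \omega = \sqrt{k/m}.
\]
% The same equation with charge q in place of position x describes an
% LC circuit: L\ddot{q} + q/C = 0, giving \omega = 1/\sqrt{LC}.
% The structure of the equation, not the spring, is what carries over.
```

That relabeling is the whole point: a physicist who recognizes the structure can reuse the entire solution theory in a new domain.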

[0] https://www.ft.com/content/d5f91c27-3be8-454a-bea5-bb8ff2a85...

[1] https://archive.is/20241125132313/https://www.thewrap.com/un...

[2] https://en.wikipedia.org/wiki/Technology_readiness_level



