Your piece is well-written, but it embeds an important assumption that leads your conclusion to differ from Ed's.
> And how can TV retailers make money in this situation? Did they expect to keep charging $500 for a TV that’s now really worth $200, and pocket the $300 difference?
> Why, then, is Ed Zitron having such a hard time when it comes to LLM inference? It’s exactly the same situation!
The AI situation is not analogous to one where the TVs initially cost $450 to manufacture and the stores were selling them for $500, and then the manufacturing cost went down.
The equivalent TV analogy is that we're selling $600-cost TVs for $500, hoping that if people start buying them, the cost will drop to $200 so we can sell them for $300 at a profit. In that situation, if people keep choosing the $600-cost/$500-price unprofitable TVs, the existence of the $200-cost/$300-price profitable TVs that people aren't buying doesn't tell us anything about the market's future.
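To spell out the unit economics behind that analogy (using only the illustrative dollar figures above, which are not real data), a minimal sketch:

```python
# Unit economics of the two TVs in the analogy above.
# All figures are the illustrative ones from the analogy, not real data.

tvs = {
    "legacy_unprofitable": {"cost": 600, "price": 500},  # what people keep buying
    "cheap_profitable":    {"cost": 200, "price": 300},  # what people aren't buying
}

for name, tv in tvs.items():
    margin = tv["price"] - tv["cost"]
    print(f"{name}: margin per unit = ${margin}")

# legacy_unprofitable: margin per unit = $-100
# cheap_profitable:    margin per unit = $100
# If demand stays on the -$100/unit product, the existence of the
# +$100/unit product doesn't improve the seller's economics.
```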
---
In the AI scenario that prompts all the conversations about the "cost of inference", the reason that we care about the cost is that we believe that it's currently *ABOVE* what the product is being sold for, and that VC money is being used to subsidise the users to promote the product. The story is that as the cost drops, it will eventually be below the amount that users are willing to pay, and the companies will magically switch to being profitable.
In that scenario, anything which forces the cost above the revenue is a problem. This applies to customers choosing to switch to more expensive models, customers using more of the service (e.g. because reasoning models burn more tokens) while paying fixed rates, or customers remaining on free plans rather than switching to affordable, profitable paid plans.
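As a rough sketch of that break-even logic - the $20/month plan price is the one mentioned below, while the per-token serving cost and usage levels are purely assumed placeholders - the arithmetic looks something like this:

```python
# Hypothetical break-even check for one flat-rate subscriber.
# The serving cost and usage numbers are placeholder assumptions.

plan_price = 20.00             # $/month flat-rate plan
cost_per_million_tokens = 5.0  # assumed blended serving cost, $/1M tokens

def monthly_margin(tokens_used_millions: float) -> float:
    """Revenue minus serving cost for one subscriber in one month."""
    serving_cost = tokens_used_millions * cost_per_million_tokens
    return plan_price - serving_cost

for usage in (1, 4, 10):  # millions of tokens per month
    print(f"{usage}M tokens/month -> margin ${monthly_margin(usage):+.2f}")

# 1M tokens/month  -> margin $+15.00
# 4M tokens/month  -> margin $+0.00
# 10M tokens/month -> margin $-30.00
# Heavier usage at a fixed price pushes the margin negative, which is
# exactly the "cost above revenue" problem described above.
```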
The AI Hype group believes that the practical cost of providing inference services to users will drop enough that the $20/month users are profitable.
The AI Hype group's argument is that because the cost per token is coming down, we're on a trajectory to profitability.
The AI Bubble group believes that the practical cost of providing inference services to users is not falling fast enough.
Ed's argument is that despite the cost per token coming down, the cost per request is not coming down (because requests now require more advanced models or more tokens per request in order to be useful), so we are not on a trajectory to profitability.
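To illustrate the crux of that disagreement with made-up numbers (none of these token prices or request sizes come from Ed or from the original piece), a minimal sketch:

```python
# Cost per request = (tokens per request) x (cost per token).
# Illustrative, made-up numbers: the per-token price falls 4x,
# while tokens per request grow 6x (bigger models, reasoning traces).

generations = [
    # (label, cost per 1K tokens in $, tokens per request)
    ("year 0", 0.060, 2_000),
    ("year 1", 0.015, 12_000),
]

for label, cost_per_1k, tokens in generations:
    cost_per_request = cost_per_1k * tokens / 1_000
    print(f"{label}: ${cost_per_1k}/1K tokens, {tokens} tokens/request "
          f"-> ${cost_per_request:.2f}/request")

# year 0: $0.06/1K tokens, 2000 tokens/request -> $0.12/request
# year 1: $0.015/1K tokens, 12000 tokens/request -> $0.18/request
# The per-token price fell 4x, yet the per-request cost went *up*,
# which is the Bubble-side version of the trajectory.
```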
The AI Bubble group (of which I'm most likely a member) also believes that the current added value of AI is obscured by anthropomorphization (regular people WANT the AI to be as smart as a human being), insane levels of marketing, and FOMO among executives, shareholders, and capitalists - all hoping to get rid of the cost of labor.
Outside of coding, the current wave of AI is:
* a slightly more intuitive search but with much "harder" misfires - a huge business on its own but good luck doing that against Google (Google controls its entire LLM stack, top to bottom)
* intuitive audio/image/video editing - but probably a lot more costly than regular editing (due to misfires and general cost of (re-)generation) - and with rudimentary tooling, for now
* a risky way to generate all sorts of other content, aka AI slop
All of those current business models are probably billion-dollar industries right now, but are they $500 billion/year industries, enough to justify current spending? I think that's extremely unlikely.
I think LLM tech might well be generating $500 billion/year in revenue across the entire economy - but probably in 2035. Current investors are investing for 2026, not 2035.