> consider the fact that both Gemini and OpenAI got gold medal level performance
Yet ChatGPT 5 imagines API functions that are not there and cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub.
Which is why you run it in a coding agent loop using something like Codex CLI - then it doesn't matter if it imagines a non-existent function because it will correct itself when it tries to run the code.
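Roughly the shape of that loop, as a sketch (model name, prompts, and the retry count are illustrative; this is not how Codex CLI is actually implemented):

```python
# Minimal "generate -> run -> feed error back" loop. Assumes the openai package
# and an API key; "gpt-5" is just an example model name.
import subprocess
import tempfile
from openai import OpenAI

client = OpenAI()

def ask(messages):
    resp = client.chat.completions.create(model="gpt-5", messages=messages)
    return resp.choices[0].message.content

def run_python(code: str) -> subprocess.CompletedProcess:
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    return subprocess.run(["python", path], capture_output=True, text=True)

messages = [{"role": "user", "content":
             "Write a plain Python script (no markdown fences) that prints the 10th Fibonacci number."}]
for _ in range(5):
    code = ask(messages)
    result = run_python(code)
    if result.returncode == 0:
        print(result.stdout)
        break  # it ran; a hallucinated function would have raised an error here instead
    # feed the traceback back so the model can correct itself
    messages += [
        {"role": "assistant", "content": code},
        {"role": "user", "content": f"That raised an error:\n{result.stderr}\nPlease fix it."},
    ]
```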
Can you expand on "cannot figure out basic solutions even when pointed to the original source code of libraries on GitHub"? I have it do that all the time and it works really well for me (at least with modern "reasoning" models like GPT-5 and Claude 4.)
As a human, I sometimes write code that does not compile first try. This does not mean that I am stupid, only that I can make mistakes. And the development process has guardrails against me making mistakes, namely, running the compiler.
I feel they are mutually inclusive! I don’t think you can meaningfully create new things if you must always be 100% factually correct, because you might not know what correct is for the new thing.
True. But if the system is implemented by a country, this could be enforced through the legal system and insurance.
For example, when each transaction is done, both parties might keep a cryptographic proof which they are required to submit once they are online again.
Failing to submit could result in a small fine (to encourage submission), and double spending, which can then be detected, could result in a large fine (or even a prison sentence), for example.
There is, perhaps, a privacy issue, just like with blockchain. But it's not more of an issue than online transactions.
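A toy sketch of the proof-and-detection idea (not a real payment protocol; the note IDs, the Ed25519 signatures, the skipped signature verification, and the registry are all assumptions of mine):

```python
# Each offline spend is signed by the payer and submitted later; a registry can
# then detect double spending by spotting two signed spends of the same note.
from dataclasses import dataclass
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

@dataclass
class Transaction:
    note_id: str      # identifier of the digital note/coin being spent
    payer: bytes      # payer's public key (raw bytes)
    payee: bytes      # payee's public key (raw bytes)
    payer_sig: bytes  # payer's signature over (note_id, payee)

def sign_spend(payer_key: Ed25519PrivateKey, note_id: str, payee: bytes) -> Transaction:
    pub = payer_key.public_key().public_bytes(
        serialization.Encoding.Raw, serialization.PublicFormat.Raw)
    sig = payer_key.sign(f"{note_id}|{payee.hex()}".encode())
    return Transaction(note_id, pub, payee, sig)

class Registry:
    """Run by the state or a bank: collects proofs once parties are back online."""
    def __init__(self):
        self.seen = {}  # note_id -> first Transaction submitted for that note

    def submit(self, tx: Transaction) -> str:
        prior = self.seen.get(tx.note_id)
        if prior is not None and prior.payee != tx.payee:
            # Two signed spends of the same note to different payees:
            # cryptographic evidence of double spending -> the large fine.
            return "double spend detected"
        self.seen[tx.note_id] = tx
        return "ok"
```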
I didn't say you need a blockchain.
I just said a cryptographic protocol (mostly offline and unrelated to blockchain) would help to automatically and quickly detect and prove fraud.
The offline credit card system does not prove fraud; it just relies on insurance.
I don't completely agree. Brand value is huge. Product culture matters.
But say you're correct, and follow the reasoning from there: posit "All frontier model companies are in a red queen's race."
If it's a true red queen's race, then some firms (those with the worst capital structure / costs) will drop out. The remaining firms will trend toward 10%-ish net income - just over cost of capital, basically.
Do you think inference demand and spend will stay stable, or grow? Raw profits could increase from here: if inference demand grows 8x, then oAI, even as margins go down from 80% to 10%, would keep making $10bn or so a year in FCF at current spend; they'd decide whether they wanted that to go into R&D, just enjoy it, or acquire smaller competitors.
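Back-of-the-envelope (the revenue figure is a placeholder I made up; only the 80% to 10% margin shift and the 8x demand growth come from the argument above):

```python
current_revenue = 12.5e9   # hypothetical annual inference revenue
profit_now   = current_revenue * 0.80        # ~$10bn at 80% margins
profit_later = current_revenue * 8 * 0.10    # ~$10bn at 8x demand, 10% margins
print(profit_now, profit_later)              # same ballpark: 8x demand offsets 1/8th the margin
```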
Things you'd have to believe for it to be a true red queen's race:
* There is no liftoff - AGI and ASI will not happen; instead we'll just incrementally get logarithmically better.
* There is no efficiency edge possible for R&D teams to create/discover that would make for a training / inference breakaway in terms of economics
* All product delivery will become truly commoditized, and customers will not care what brand AI they are delivered
* The world's inference demand will not be a case of Jevons paradox as competition and innovation drive inference costs down, and therefore we are close to peak inference demand.
Anyway, based on my answers to the above questions, oAI seems like a nice bet, and I'd make it if I could. Even the most "inference doomerish" scenario (capital markets dry up, inference demand stabilizes, R&D progress stops) still leaves oAI in a very, very good position in the US, in my opinion.
The moat, imo, is mostly the tooling on top of the model. ChatGPT's thinking and deep research modes are still superior to the competition. But as the models themselves get more and more efficient to run, you won't necessarily need to rent them or rent a data center to run them. Alibaba's Qwen mixture of experts models are living proof that you can have GPT levels of raw inference on a gaming computer right now. How are these AI firms going to adapt once someone is able to run about 90% of raw OpenAI capability on a quad core laptop at 250-300 watts max power consumption?
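For concreteness, this is roughly what running such a model locally looks like with a quantized GGUF build and llama-cpp-python (the file name and settings are placeholders, not a specific recommended configuration):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="Qwen3-30B-A3B-Q4_K_M.gguf",  # placeholder: any local quantized GGUF build
    n_ctx=8192,
    n_gpu_layers=-1,  # offload whatever fits onto the consumer GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain why MoE models are cheap to run."}]
)
print(out["choices"][0]["message"]["content"])
```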
I think one answer is that they'll have moved farther up the chain; agent training is this year, agent-managing-agents training is next year. The bottom of the chain inference could be Qwen or whatever for certain tasks, but you're going to have a hard and delayed time getting the open models to manage this stuff.
Futures like that are why Anthropic and oAI put out stats like how long the agents can code unattended. The dream is "infinite time".
Huge brand moat. Consumers around the world equate AI with ChatGPT. That kind of recognition is an extremely difficult thing to pull off, and also hard to unseat as long as they play their cards right.
"Brand moat" is not an actual economic concept. Moats indicate how easy/hard it is to switch to a competitor. If OpenAI does something user-adversarial, it takes two seconds to switch to Anthropic/Gemini (the exception being Enterprise contracts/lock-in, which is exactly why AI companies prioritize that). The entire reason that there are race-to-the-bottom price wars among LLM companies is that it's trivial for most people to switch to whatever's cheapest.
Brand loyalty and users not having sufficient incentive by default to switch to a competitor is something else. OpenAI has lost a lot of money to ensure no such incentive forms.
Moats, as noted in Google's "We Have No Moat, And Neither Does OpenAI" memo that made the discussion of moats relevant in AI circles, have a specific economic definition.
Switching costs only make sense to talk about for fully online businesses. The "switching cost" for McDonalds depends heavily on whether there's a Burger King nearby. If there isn't then your "switching cost" might now be a 30 minute drive, which is very much a moat.
That's not entirely true. They have an 'infinite' product moat - no one can reproduce a Big Mac. Essentially every AI model is now 'the same' (cue debate on this). The only way they can build a moat is by adding features beyond the model that lock people in.
The concept of ‘moat’ comes out of marketing - it was a concept in marketing for decades before Warren Buffett coined the term economic moat. Brand moat had been part of marketing for years and is a fully recognized and researched concept. It’s even been researched with fMRIs.
You may not see it, but OpenAI’s brand has value. To a large portion of the less technical world, ChatGPT is AI.
Nokia's global market share was ~50% in smartphones back in 2007. Remember that?
Comparing "brand moat" in real-world restaurant vs online services where there's no actual barrier to changing service is silly. Doubly silly when they're free users, so they're not customers. (And then there are also end-users when OpenAI is bundled or embedded, e.g. dating/chatbot services).
McDonald's has lock-in and inertia through its franchisees occupying key real-estate locations, media and film tie-ins, promotions etc. Those are physical moats, way beyond a conceptual "brand moat" (without being able to see how Hamilton Wright Helmer's book characterizes those).
I wouldn’t necessarily say so.
I guess that's why they are trying to « pulse » people and « learn » from you instead of just providing decent, unbiased answers.
In Europe, most companies and governments are pushing for either Mistral or open-source models.
Most devs, who, if I understand correctly, are pretty much the only customers willing to pay $100+ a month, will switch in a matter of minutes if a better model comes along.
And they lose money on pretty much all usage.
To me, a company like Anthropic, which mostly focuses on a target audience and does research on bias, equity, and such (very leading research, but still), has a much better moat.
I cannot speak for your work, but for classification steps and data structuring it's quite accurate, and this is with regular testing. For mine and for folks in adjacent industries, LLMs are fantastic and adding a lot of value to our workflows. You're honestly holding on to this dead idea that LLM outputs are full of hallucinations. Throwaway account with throwaway comment.
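For concreteness, the sort of pipeline I mean: ask the model for JSON, validate it, and reject anything that doesn't parse (the model name, schema, and prompt below are made up for illustration):

```python
import json
from openai import OpenAI
from pydantic import BaseModel, ValidationError

class Ticket(BaseModel):
    category: str  # e.g. "billing", "bug", "feature_request"
    urgency: int   # 1 (low) to 5 (high)

client = OpenAI()

def structure(text: str) -> Ticket | None:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": "Classify the support ticket as JSON with fields: category, urgency (1-5)."},
            {"role": "user", "content": text},
        ],
    )
    try:
        return Ticket(**json.loads(resp.choices[0].message.content))
    except (json.JSONDecodeError, ValidationError):
        return None  # reject rather than pass unchecked output downstream
```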
It's "quite accurate" which is not acceptable for almost all relevant tasks is a business context. Somebody needs to manually check everything. Almost no time is saved.
Talking as someone who has built many small OpenAI integrations aka wrappers in business apps.
So I know you are trolling, but let's be real. Why would I share my business measurements with you on here? Of course my experience is an anecdote, and of course the measurements I use to determine accuracy are more in depth than "quite accurate". What I can say is that for structuring we beat humans on accuracy and do it extremely quickly. That LLM system runs at a lower cost than some of our more traditional systems.
If all you have done is build wrappers, I see you have barely scratched the engineering surface, so that explains it.