
Certainly the undisputed winners will be the very few firms with enough engineering resources and GPUs to train their own models (not just fine-tune), where those models increase the productivity of workers in their non-AI-related profit centers. After that we have the real question of what the future of open source LLMs will be, on the one hand, and, on the other, the question most relevant to this article: whether profitable “AI businesses” can be sustained over time, and of what sort. As Stratechery has analyzed, it is very possible that OpenAI turns out to be a very profitable B2C company with ad revenue in ChatGPT, unconcerned with its B2B sales or even the objective quality of its AI. Right now is an incredible time for AI: cheap Uber rides never qualitatively changed my life, but the current consumer access to AI models is truly incredible and I hope that only improves. However, even ignoring whatever happens on the regulatory front, I don’t think that is guaranteed at all.


This was definitely a theory that made people burn tons of money over the past couple of years, but I don’t think it holds water. These models are becoming obsolete so fast, and there are so many open ones, that I doubt anyone’s privately trained model can stay relevant for long.


The data is the moat.

(If you can train your internally deployed LLM on data none of your competitors have, that's an advantage).


It's not anymore. If the model is publicly accessible, its skills can be distilled by performing some API calls and recording input-output pairs. This scheme works so well it has become the main mode to prepare data for small models. Model skills leak.
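
A rough sketch of the "record input-output pairs" step, in Python. Everything here is an assumption for illustration: `fake_teacher` is a hypothetical stand-in for a call to a hosted model's API, and JSON Lines is just one common format for fine-tuning data.

```python
import json
from typing import Callable, Iterable


def collect_pairs(teacher: Callable[[str], str], prompts: Iterable[str]) -> list[dict]:
    """Query the teacher once per prompt and record the input-output pair."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(pairs: list[dict]) -> str:
    """Serialize the pairs as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(pair) for pair in pairs)


def fake_teacher(prompt: str) -> str:
    # Hypothetical stand-in for an API call to the publicly accessible model.
    return prompt.upper()


pairs = collect_pairs(fake_teacher, ["hello", "model skills leak"])
dataset = to_jsonl(pairs)
```

The resulting `dataset` string is what you would then feed to whatever fine-tuning setup the small model uses; that part is left out here.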


I agree, publicly deployed models seem to be easy to train from. I did say "internally deployed LLM" though. agentcoops said "...where the models in question increase the productivity of workers in their non-ai-related profit centers" above, and that's the bit I was thinking about. I think private models, either trained from scratch or fine-tuned, are going to be a big deal, though they won't make the PR splash that public models make.


The conclusion there seems to be that this approach just yields a model with the surface look and feel of GPT-3 or GPT-4 but without the depth, so the experience quickly becomes unsatisfactory once you go outside the fine-tuning dataset.


You may not need to train a model to make use of your data, though. Maybe a cheap fine-tune would work just as well. Maybe just having the data well indexed and/or part of the prompt context is good enough.
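
To make the "well indexed and part of the prompt context" option concrete, here is a toy sketch in Python. It assumes a word-overlap ranking as the "index" and made-up documents; a real setup would more likely use a search engine or embeddings, and all the names below are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by how many query words they share (a toy index)."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Put the best-matching documents into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in top_k(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Made-up private documents, for illustration only.
DOCS = [
    "Refund requests are handled by the billing team.",
    "The cafeteria opens at 8am.",
    "Billing disputes must be filed within 30 days.",
]
prompt = build_prompt("Who handles refund requests?", DOCS, k=1)
```

The point is only that no training happens: the private data stays in the index and reaches the model through the prompt.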


In that case, X.AI (powered by X/Twitter/Tesla data) and possibly Facebook, both closed and somewhat hard to crawl from the inside, have the largest moats.


I don’t think they necessarily will be allowed to train on their data unless they get explicit permission. They will try, but the way I see privacy regulations is that users will have to authorize specific uses of their data and not be surprised by any application.

This could be one of the more interesting privacy fights of the next decade.

I’m sure there are easy cynical takes about how they will just shrink-wrap the EULA, and maybe they will. But in a good privacy environment, users should never be surprised and should have control over how their data is used. And I think we’ve made some progress there.


> I don’t think they necessarily will be allowed to train on their data unless they get explicit permission. They will try, but the way I see privacy regulations is that users will have to authorize specific uses of their data and not be surprised by any application.

If there's one company that I don't think cares about user permissions or the law, it'd be Twitter.

The EU officially warned Elon about DSA fines and the response was less than serious.

https://www.cnn.com/2023/10/10/tech/x-europe-israel-misinfor...


China probably has the most comprehensive data on its users from a surveillance perspective.


Idk. These were trained on pretty public things like Wikipedia.



