llms-txt may be useful for responsible LLMs, but I'm skeptical it will reduce the problem of aggressive crawlers. The problematic crawlers already ignore robots.txt, spoof user-agents, and rotate proxies, so I'm not sure how llms-txt would help with any of that.
At this point, it's pretty clear that the AI scrapers won't be limited by any voluntary restrictions. ByteDance never seemed to abide by robots.txt, and I think at least some of the others didn't either.
- Humans tip humans as a lottery ticket for an experience (meet the creator) or a sweepstakes (free stuff).
- Agents tip humans because they know they'll need original online content in the long term to keep improving.
For the latter, frontier labs will need to fund their training/inference agents with a tip jar.
There's no guarantee, but I can see it happening given where things are moving.
We should add optional `tips` addresses in llms.txt files.
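Purely as a sketch of what I mean (this is not part of the llms.txt spec; the `Tips` section, its layout, and the addresses below are hypothetical placeholders):

```markdown
# Example Blog

> Independent writing on distributed systems and payments infrastructure.

## Tips
<!-- Hypothetical extension: addresses an agent (or its operator) could tip. -->
- BTC: bc1q-example-placeholder-address
- ETH: 0x0000000000000000000000000000000000000000
- Lightning: tips@example.com

## Docs
- [Archive](https://example.com/archive.md): Full post archive in markdown
```

Keeping it as just another markdown section would let existing llms.txt consumers ignore it safely while tipping-aware agents pick it up.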
We're also working on enabling this at Grove.city.
Human <-> Agent <-> Human tips won't cover every edge case, but they're a necessary and neutral happy medium.
Moving fast. Would love to share more with the community.
Wrote about it here: https://x.com/olshansky/status/2008282844624216293