American workers got uppity. Forgot their place. Started protesting company decisions and wouldn't return to office. Hiring may eventually come back but not any time soon. Workers need to be chastised first.
Ive got way more credit than he deserved. And he had to run all his ideas by Jobs. Once Jobs was gone we got to see Ive's true colors (it was garish pastels and a butterfly keyboard).
He has designed 4 consumer products that a good portion of humanity uses every day. By every measure he is the most successful product creator in the history of humanity; no other single product comes close in impact and quality. (Believe it or not, the Doritos Locos Taco is likely the closest fifth-place product.)
The arrogance on Hacker News is insane, as is the self-aggrandizement and the misunderstanding of how rare that is.
You have likely never achieved 1/1,000,000,000th of the scale or impact of this designer, and yet you make flippant remarks that betray your ignorance of the matter.
I really would like to understand what your thought process is here. This is quite literally like saying Michael Jordan was a pretty poor basketball player and claiming Jerry Reinsdorf was somehow the real reason he succeeded.
The big difference with comparing to sports is that millions of people can see with their own eyes how a player performs in the arena. No amount of motivated media can create a narrative of brilliance when bad performance is there for all to see.
In the case of Jony Ive or others like him, we simply do not know how many dozens or hundreds of very talented engineers and designers worked relentlessly under him so that he could give beautiful presentations in British English.
Another person who comes to mind is Marissa Mayer. A "brilliant executive" known for keeping the Google home page, visited by a billion people, clean. But we all know how great she turned out to be when she ended up at Yahoo.
> He has designed 4 consumer products that a good portion of humanity uses every day.
Yes, but how much of that was luck, and how much was extraordinary talent?
It's like saying "Donald Trump is really rich, ergo he must be a financial genius"... getting really rich isn't that hard if you're born into money and invest in New York real estate.
Now someone like Jobs, who had fairly working-class parents and founded a multi-billion dollar (now trillion dollar) company that radically changed the modern world: that, I would argue, is extraordinary talent.
While I don't personally have much of an opinion on Ive's skill as a designer, I understand the GP's point of view: any "good but not great" designer could have done what he did, and Ive was just lucky enough to win the lottery w.r.t. what company he worked for.
For a similar example, consider the case of Hollywood: there are plenty of actors as talented as Brad Pitt (or whatever big name you'd like to choose) who don't end up starring in massive blockbusters, not because they lack talent, but because they weren't quite as lucky to get that first big break, which led to more recognition and more job offers, all of which compounded into making him a proper movie star. Obviously Pitt is a really good actor, but part of his success is likely due to luck as much as acting talent: he has tons of talent, but others might have equal talent and less luck, and therefore be less successful and have fewer people influenced by their work.
To use a software metaphor, consider the relative popularity of FreeBSD and Linux. Both are good OSes, but Linux got "luckier" because they didn't have to deal with a lawsuit, which meant it got more attention, more features, which led to a compounding "Matthew effect" where it now has a far larger market share than FreeBSD, despite them originally having roughly the same 'quality'.
This take is so staggeringly naive I don't even know where to start. After the undisputed greatest run of products ever designed, you deign to call it luck.
Aside from your complete ignorance of the history at play (Ive refounded Apple with Jobs), you seem to not understand what a 'mediocre' designer is capable of, or how mind-bendingly hard it was to design the iMac, iPad, iPhone and Apple Watch.
I genuinely can't believe you could be so wild as to believe such a thing. It becomes frankly stupid to the point of being disrespectful of the work individuals put into their craft and the success they can find.
There is no person in the world outside of someone in this forum who would claim that somehow this was 'luck'.
HN has truly become one of the most toxically stupid places on the web.
The products were not conceived/designed by Ive. He was VP of industrial design only, with a team of people under him, such as Richard Howarth who seems to have been lead designer on the original iPhone and replaced Ive when he left.
Your take on crediting Ive with the success of Apple's product line would be exactly like crediting some designer at Nike with the success of their never ending line of sneakers.
If your theory of Ive's design genius being such a game changer was true, then why has Apple continued to flourish since he left 7 years ago? It seems pretty apparent that it's the brand/image established by Jobs that is successful, just as it's the Nike brand (bootstrapped by MJ & Nike Air) that propels Nike, not the magic of their designs.
You're taking Jack Valenti at face value. He said "we're here to protect the artists" because the artists were popular and the record labels were not. He was in the business of protecting the labels and screwing the artists and everyone knew it.
The artists were certainly making more money from the studios and record labels than they got from the authors of DeCSS, Napster, BitTorrent, The Pirate Bay, etc.
When Gillian Welch wrote "Everything is Free" in 2001, she wasn't complaining about the record companies, she was complaining about Napster.
> Q: Do you remember where you were when you wrote “Everything is Free”?
> A: I do. I remember exactly where I was and what was going on. It was when Napster was starting to decimate the traditional recording industry dynamic, the viability of making your livelihood [from] your art.
Most artists were making way more money off the fans (even those downloading music) via touring and merch sales, than they were making off of the labels from residuals. Most were not making anything from residuals.
Valenti was desperate to enlist musicians because people hated the labels and did not feel bad about stealing from them. But the vast majority of musicians were not willing to back the labels against the fans. The few he managed to enlist, like Metallica, were notable because they were exceptions. And the fact that they were already rich and already at the end of their careers was noted by many at the time.
In contrast you have, for instance, Courtney Love who wrote a widely-distributed essay about how she and most artists make almost nothing from record sales.
It's an interesting essay, and the TLC case does sound pretty egregious. But the premise is undermined by the fact that Love is worth an estimated $100M today, largely thanks to owning Nirvana's publishing rights, which she inherited from Kurt Cobain.
The folks fighting perpetual copyright were not fighting to make it possible for Disney to fire creatives. In fact they were fighting for the creatives to triumph over Disney.
Disney is all in because all their characters are entering the public domain over the next 5 years. They can't fight like it's 1998 because YouTube is now worth more than they are.
> In fact they were fighting for the creatives to triumph over Disney.
We were doing nothing of the sort. It was "Information wants to be free", not "we want to provide a perpetual job for a subset of white-collar workers".
Well I was in that cohort and none of us were thinking we were helping megacorps create the content slop machine from 1984.
Our concern was that corporations were expanding the definition of intellectual property to the extent where you couldn't make a movie or song or write a book as an individual without some corporation with a massive "IP" warchest coming after you and declaring it derivative. You couldn't write some software without a corporation with a massive repository of junk patents claiming you infringe.
We wanted to ensure that individual creators could continue to have a voice, and not get sued out of existence by an IP Legal/Industrial Complex that was forming, causing arms races between megacorps and SLAPPs against everyone else.
If we knew we were feeding a yet-to-be-invented slop machine that would allow megacorps to unemploy all the creatives, most of us would not have supported that.
And by the way Disney is all in on AI for the same reason they were all in on perpetual copyright. In the perpetual copyright world, having a massive library of content you no longer have to pay residuals on was a source of massive amounts of "free" revenue. You could just keep re-releasing and re-making stuff. You did not have to do the messy, expensive work of paying people to come up with really good new stuff.
In the AI world, the money-printing capital asset is the trained model that grinds out slop 24/7 and you, again, don't have to pay actual people to create anything new.
>If we knew we were feeding a yet-to-be-invented slop machine that would allow megacorps to unemploy all the creatives, most of us would not have supported that.
We have multiple Communist AIs that are on par with Western AI from 18 months ago and can run locally on 5-year-old hardware.
I have no idea what fever nightmare you live in, but the future is bright and only getting better.
Those people were trying to build a sharing/gift economy. They weren't able to keep bad actors out of their sharing economy. They are bitter that their utopian dreams got hijacked by self-dealers. Why is that wild?
It's highly debatable whether, in case of an information sharing/gift economy, the concept of "bad actors coming in and ruining it for everybody by taking without giving back" even makes sense.
The information is still there, as is the community that you've built, the joy that you get out of sharing the information, everything you've learned...
Why is any of that diminished, just because some people or entities that you dislike also got something out of it?
Attribution is seemingly a central part of an information sharing/gift economy, and especially of an information sharing/gift community. It is part of the trust that connects people, and without it the community falls apart, and with that the economy. AI by its very nature removes attribution.
Accuracy of information is a second critical aspect of information sharing and the communities built around it. Would Wikipedia as a community and resource work if some articles were just random words? If readers don't trust the site, and editors distrust each other, the community collapses and the value of the information is reduced. It might look like adding AI-generated articles would not harm other existing articles, or the joy that editors of the past had in writing them, but the harm is what happens after the community gets flooded by inaccurate information. The same goes for many other information sharing communities.
Source trust and gift attribution are two distinct concepts, I'd say. One happens to the detriment of the taker (or "thief", if that even makes sense, as per my original comment); the other harms the original "producer".
For the former, it is already very much in any AI company's best interest to preserve attribution to become and remain credible.
For the latter, I can't help but wonder whether a gift economy that needs to diligently bookkeep attribution really is one, and if this is the only practicable way to implement one in a given larger society/economy, I'd say this says something important about that society as well.
I make very heavy use of the sources that Gemini cites when I use it. I tend to use AI as sort of a mega search engine where I get a little bit of discussion, but if I care even a little bit about the topic, I end up reading the source material anyway.
This is incorrect. RAG preserves attribution. Training data doesn't, but it doesn't make sense to attribute that anyway, unless you want a list of every person who has ever lived.
It's diminished because the hard reality is that you need money to live.
The end result of major tech companies sweeping in, taking everyone's creative work, outcompeting the originals with AI derivatives, and telling every artist on the planet "fuck off, send a job application to McDonalds" is significantly less art.
Copyright was invented to prevent exactly this scenario.
Yes, which is why hackers and artists (at least those mainly publishing instead of mainly performing for a live audience) are ultimately not natural/inherent allies.
Hackers have usually drawn their funding from their (often lucrative) employment, which is what gave them the freedom to give away the products of their hacking for free.
One needs copyright to survive; the other sees it as a means to enforce openness at best (those in favor of copyleft) and as an obstacle to their pursuit (owning the full system, liberating all aspects of and information about it) at worst.
This rift was always visible if you knew where to look, but AI is definitely wedging it wide open.
Yes. There's a difference between walking a trail and maybe littering a few pieces of trash, and walking a trail while actively setting branches on fire.
One scenario is manageable enough to leave be, or for one or two volunteers to clean up. The fires get the entire trail closed down to everyone.
With some FOSS projects being bombarded by scraping traffic, redoing their PR systems, considering ways to limit contributors, and even going closed source, I don't think such a metaphor is an exaggeration.
> whether ... the concept of "bad actors coming in and ruining it for everybody by taking without giving back" even makes sense.
This is pretty clearly answered by the GPL: yes, it does, and this concept has been around since the very beginning.
> The information is still there
True
> as is the community that you've built
Untrue. At this point it's well understood that AI substitutes for many of the services that would once have afforded people a way to monetize their production for the community. Without the ability to make a living by doing so, even a small one, people will be limited to doing only what they can in the little free time they get outside of work.
That's the whole problem -- that AI, as it exists today, is taking away from the public, and hurting it at the same time. That's closer to robbery than it is to "sharing in the community".
There may be plenty of content out there, but everyone with any content on the internet is struggling to keep out AI crawlers that they never authorized. In many cases, people are having to do so just to protect their infrastructure from request spamming.
Since AI crawlers don't obey any consent markers denying access to content, it makes sense for content owners who don't want AI trained on their content to poison it if possible. It's possibly the only way to keep the AI crawlers away.
I don't think this traffic is actually coming from crawlers for training.
Think about it, why would a training scraper need to hit the same page hundreds of times a day? They only need to download it once.
I think this is LLMs doing web searches at runtime in response to user queries. There's no caching at this level, so similar queries by many different users could lead the LLM to request the same page many times.
Another possibility is that their crawling code is just that bad. A lot of these GenAI companies are dogfooding in their software development, and the quality can be seen whenever some of their code is released or leaked. It seems very plausible to me that their crawlers could simply be barely functional, buggy messes. They have unlimited venture capital to burn, so it doesn't really matter if they scrape a site 100,000 times a day when once would have been enough.
> It's possibly the only way to keep the AI crawlers away.
Unfortunately that won't work. If you've served them enough content to have a noticeable poisoning effect, then you've allowed all that load through your resources. It won't stop them coming either: for the most part they don't talk to each other, so even if you drive some away more will come; there is no collaborative list of good and bad places to scrape.
The only half-way useful answer to the load issue ATM is PoW tricks like Anubis, and they can inconvenience some of your target audience as well. They don't protect your content at all, once it is copied elsewhere for any reason it'll get scraped from there. For instance if you keep some OSS code off GitHub, and behind some sort of bot protection, to stop it ending up in CoPilot's dataset, someone may eventually fork it and push their version to GitHub anyway thereby nullifying your attempt.
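(For anyone unfamiliar with how those PoW tricks work, here is a minimal sketch of the general idea only, not Anubis's actual code: the server hands out a random nonce, the visitor's browser burns CPU finding a counter whose hash meets a difficulty target, and the server verifies the result cheaply. The difficulty value below is purely illustrative.)

    import hashlib
    import secrets

    DIFFICULTY = 4  # required leading hex zeros; illustrative, not Anubis's setting

    def make_challenge():
        # Server side: send a random nonce to the visiting browser.
        return secrets.token_hex(16)

    def solve(challenge):
        # Client side: brute-force a counter until the hash meets the target.
        counter = 0
        while True:
            digest = hashlib.sha256(f"{challenge}:{counter}".encode()).hexdigest()
            if digest.startswith("0" * DIFFICULTY):
                return counter
            counter += 1

    def verify(challenge, counter):
        # Server side: one hash to check; verifying is cheap, solving is not.
        digest = hashlib.sha256(f"{challenge}:{counter}".encode()).hexdigest()
        return digest.startswith("0" * DIFFICULTY)

That asymmetry is the whole point: verification costs one hash while solving costs many, so a scraper fleet hitting every page pays far more than a single human visitor, but it does nothing to stop your content being copied once it's fetched.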
My point is that if crawlers have to worry about poison, that may make them start to respect robots.txt or something. It's a bit like a "Beware of Dog" sign.
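For concreteness, the "sign" itself is just a robots.txt. Something along these lines is the polite-request version (the user-agent tokens are the ones the major AI crawlers publicly document, as far as I know; honoured only by well-behaved bots):

    # Ask AI training crawlers to stay out.
    User-agent: GPTBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    # Everyone else is still welcome.
    User-agent: *
    Disallow:

Of course that only works on crawlers that choose to honour it, which is exactly why a credible threat of poison might change the calculus.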
Unfortunately the use of the sign often highlights what the scrapers want most, so if they pay attention to it (rather than just completely ignoring it as most do now), it will be to go specifically where they are told not to.
The scrapers ideally want content that is original. Often content that is also new is more highly prized, but not as much as you might think⁰. This will only become more of a driver as the amount of LLM generated content that is out there to be mixed in increases, in order to limit the Habsburg problem they won't want too much regurgitated content in the training data.
Bad content from before LLM scraping became a resource problem¹ is highly unlikely to be marked in robots.txt, the same for content newly generated-by-an-LLM. People attempting to fend off scrapers and other bots with robots.txt entries are likely protecting the sort of content the scrapers actively want - original output that they've put some time into or code in a repo they don't want scraped (as scraping a repo is incredibly inefficient and resource heavy from the PoV of the repo owner).
I strongly suspect that the amount of desirable content behind robots.txt “blocks” is far too valuable to ignore despite the amount of poison content traps, or just things otherwise not worth the time scouring through, that might also be there. A “beware of the dog” sign is of no protection when the reader actively wants to see the doggies!
--------
[0] if scraping for training an LLM you don't want just new content, but you would prefer as much of your input data as possible to be as few steps as possible from original
[1] and a copying concern, though I'll avoid that discussion as it can get quite thorny and whichever side or fence you are on in that matter the resource consumption is objectively a problem all the same.
For clarification poisoning and slop are different concepts. Slop is the output of AI. Poisoning is making your content (that may otherwise be good content) fuck up in the internals of an LLM. Classic example is the nightshade attack on image generators.
One could imagine an open source project that doesn't want to be ingested by an LLM. They could try to put that in the license but of course the license won't be obeyed. Alternately, if they could alter the code such that the OSS project itself remains high quality, but if you try to train a coding LLM on it the LLM will output code full of SQL injection exploits (for instance) or maybe just bogus uncompilable stuff, then the LLM authors will suddenly have a reason to start respecting your license and excluding the code from their index.
I may be picking nits on nits here, but… If slop indicates content with no reasoning, then deliberate slop isn't slop as it is generated with both reason and purpose known to, and understood by, the creator. Though if someone deliberately uses a generative model to create slop that line of reasoning might eat itself…
My bet is many of these crawlers collect price matching, socio-political and other data.
It is curious how it gets decided that all spiders crawl for training. In fact the walled data is much more interesting, particularly Reddit, X, and FB data, where we still have indications that human, or at least accurate, data lives.
If you put something on the open web, as I see it, you only get so much say in what people do with it.
Yes, they can't publish it without attribution and/or compensation (copyright, at least currently, for better or worse). Yes, they shouldn't get to hammer your server with redundant brainless requests for thousands of copies of the same content that no human will ever read (abuse/DDOS prevention).
No, I don't think you get to decide what user agent your visitors are using, and whether that user agent will summarize or otherwise transform it, using LLMs, ad blockers, or 273 artisanal regular expressions enabling dark/bright/readable/pink mode.
> it makes sense for content owners who don't want AI trained on their content to poison it if possible. It's possibly the only way to keep the AI crawlers away.
How would that work? The crawler needs to, well, crawl your site to determine that it's full of slop. At that point, it's already incurred the cost to you.
I'm all for banning spammy, high-request-rate crawlers, but those you would detect via abusive request patterns, and that won't be influenced by tokens.
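By "abusive request patterns" I mean something as simple as a sliding-window counter per client. A rough sketch (the window and threshold numbers are made up, and the client key could be an IP, a subnet, or a user-agent/IP pair):

    import time
    from collections import defaultdict, deque

    WINDOW_SECONDS = 60
    MAX_REQUESTS = 300            # per client per window; illustrative threshold
    recent = defaultdict(deque)   # client key -> timestamps of recent requests

    def is_abusive(client_key, now=None):
        # Flag clients that hammer the site, regardless of what they fetch
        # or which user agent they claim to be.
        now = time.time() if now is None else now
        q = recent[client_key]
        q.append(now)
        while q and now - q[0] > WINDOW_SECONDS:
            q.popleft()
        return len(q) > MAX_REQUESTS

Content-level tricks like poisoning don't enter into it; the decision is purely about volume.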
I actually think this take is wrong... but the moment Travis Kalanick was a guest and claimed that he was on the verge of discovering new physics with the aid of ChatGPT was an eye-opening moment.