I'm sure this is impressive, but it's probably not the best test case given how many C compilers there are out there and how they presumably have been featured in the training data.
This is almost like asking me to invent a pathfinding algorithm when I've been taught Dijkstra's and A*.
It's a bit disappointing that people are still rehashing the same "it's in the training data" argument from three years ago. It's not like any LLM could regurgitate millions of LoC one-for-one from its training set... That's not how it works.
A pertinent quote from the article (which is a really nice read, I'd recommend reading it fully at least once):
> Previous Opus 4 models were barely capable of producing a functional compiler. Opus 4.5 was the first to cross a threshold that allowed it to produce a functional compiler which could pass large test suites, but it was still incapable of compiling any real large projects. My goal with Opus 4.6 was to again test the limits.
In this case it's not reproducing training data verbatim but it probably is using algorithms and data structures that were learned from existing C compilers. On one hand it's good to reuse existing knowledge but such knowledge won't be available if you ask Claude to develop novel software.
I wouldn't say I need to invent much that is strictly novel, though I often iterate on what exists and delve into novel-ish territory. That being said I'm definitely in a minority where I have the luxury/opportunity to work outside the monotony of average programming.
The part I find concerning is that I wouldn't be in the place I am today without spending a fair amount of time in that monotony, really delving in to understand it, and slowly pushing outside its boundary. If I were starting programming today, I can confidently say I would've given up.
They're very good at reiterating, that's true. The issue is that without the people outside of "most humans" there would be no code and no civilization. We'd still be sitting in trees. That is real intelligence.
"This AI can do 99.99%* of all human endeavours, but without that last 0.01% we'd still be in the trees", doesn't stop that 99.99% getting made redundant by the AI.
* vary as desired for your preference of argument, regarding how competent the AI actually is vs. how few people really show "true intelligence". Personally I think there's a big gap between them: paradigm-shifting inventiveness is necessarily rare, and AI can't fill in all the gaps under it yet. But I am very uncomfortable with how much AI can fill in for.
Here's a potentially more uncomfortable thought: if all the people through history with the potential for "true intelligence" had a tool that did 99% of everything, do you think they would've had the motivation to learn enough of that 99% to gain insight into the yet-undiscovered?
This is a good rebuttal to the "it was in the training data" argument - if that's how this stuff works, why couldn't Opus 4.5 or any of the other previous models achieve the same thing?
They couldn't do it because they weren't fine-tuned for multi-agent workflows, which basically means they were constrained by their context window.
How many agents did they use with previous Opus? 3?
You've chosen an argument that works against you, because they actually could do that if they were trained to.
Give them the same post-training (recipes/steering) and the same datasets, and voila, they'll be capable of the same thing. What do you think is happening there? Did Anthropic inject magic ponies?
Because for all those projects, the effective solution is to just use the existing implementation and not launder code through an LLM. We would rather see a stab at fixing CVEs or implementing features in open source projects. Like the wifi situation in FreeBSD.
LLMs can regurgitate almost all of the Harry Potter books, among others [0]. Clearly, these models can actually regurgitate large amounts of their training data, and reconstructing any gaps would be a lot less impressive than implementing the project truly from scratch.
(I'm not claiming this is what actually happened here, just pointing out that memorization is a lot more plausible/significant than you say)
The first thing I do when setting up a new Chrome instance is disable almost every API in its settings, including the Notifications API. You can always enable it later for a few selected websites (I'm using it for Telegram Web), and the rest of the websites will just be silently rejected.
I would bet money that 99+% of the population never goes into their web browser's settings. Unfortunately I don't know of any way to prove or disprove that.
You could also stop using Chrome and switch to Waterfox or some other Mozilla derivative.
If you have a site that "requires" Chrome, you can easily add a user-agent switcher extension to fool whatever JS nonsense claims to require Chrome.
It’s kind of a given that the average HN reader won’t have any problems with this (like, who is opening a browser without uBlock Origin on it?), but like 90% of the population are powerless against this literal cyberbullying; it’s really sad.
There are examples on YouTube of laughter tracks being removed and there are lots of awkward pauses, so I think you'd need to edit the video to cut the pauses out entirely.
Cutting the pauses will change the beats and rhythm of the scene, so you probably need to edit some of the voice lines and actual scenes too then. In the end, if you're not interested in the original performance and work, you might as well read the script instead and imagine it however you want, read it at the pace you want and so on.
No reason to try to avoid semantic search. It's dead easy to implement, works across languages to some extent, and the fuzziness is worth quite a lot.
You're realistically going to need chunks of some kind anyway to feed the LLM, and once you've got those it's just a few lines of code to get a basic persistent ChromaDB going.
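To make the "chunks plus similarity" idea concrete, here's a minimal sketch of that retrieval loop. It uses a toy bag-of-words embedding and hand-rolled cosine similarity purely for illustration; with ChromaDB you'd replace the embedding/storage/ranking parts with `chromadb.PersistentClient(...)`, `collection.add(...)` and `collection.query(...)`, and a real embedding model would do the heavy lifting. All function names here are made up for the example:

```python
from collections import Counter
from math import sqrt

def chunk(text: str, size: int = 40) -> list[str]:
    """Split text into fixed-size word chunks (real systems chunk smarter,
    e.g. by paragraph or token count with overlap)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    """Toy 'embedding': a sparse bag-of-words count vector. A real embedding
    model returns dense vectors that also capture synonyms and paraphrase."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(query: str, corpus: str, k: int = 2) -> list[str]:
    """Rank chunks by similarity to the query and return the best k,
    ready to paste into the LLM prompt as context."""
    chunks = chunk(corpus)
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The point of the sketch is that the whole pipeline is just chunk, embed, rank; swapping the toy vector math for a vector store is what makes it persistent and fuzzy enough to work across phrasings.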
If you touch the image when scrolling on mobile then it opens when you lift your finger. Then when you press the cross in the corner to close the image, the search button behind it is activated.
How can a serious company not notice these glaring issues in their websites?
Taiwanese companies still don't value good software engineering, so talented developers who know how to make money leave. This leaves enterprise darlings like Asus stuck with hiring lower tier talent for numbers that look good to accounting.
I use uBlock Origin (the full-fat version in Firefox, not the Lite version). It doesn't help, because the pop-ups aren't ads. There's one asking me if I wanna be spied on, one asking me to subscribe or sign in, and one huge one telling me that there's currently a discount on subscriptions.
I've got uBlock Origin on Firefox desktop too, and none of those show. Turn on more of the filter lists in the settings - especially the stuff in the "Cookie notices" and "Annoyances" sections.
I mean, that's a two-sided deal. "You watch ads, you read content". But that deal has been more and more broken by the ad networks and websites; a lot of sites are unnavigable without an adblocker now.
The days of plain text Google AdWords are long, long gone.
Only those who made the mistake of not using a content filter like uBlock Origin or something equally effective. I just visited the site and got neither pop-ups nor ads.
They are a respectable institution, they only lie about WMDs in order to justify war and death. That is why we need to protect their IP, to preserve their activities in the future.
Just like when they prematurely publish people's deaths: they write up all this stuff ahead of time, and use automatic stock monitoring to trigger first so they can gloat.