Exactly. I spent 20 years split between MS and Apple. Some of the best people I ever worked with were in QA. One guy in particular was an extremely talented engineer who simply didn't enjoy the canonical "coding" role; what he did enjoy was finding bugs and breaking things. ;-)
Really? The best people I worked with were never QA.
Moreover, the best QA people would almost always try to get out of QA - to shift into a better-respected and better-paid field.
I wish it weren't so (hence my username), but there is a definite class divide between devs and QA, and it shows up not just in the pay packets but also in who gets the boot in down times and who gets listened to. This definitely affects the quality of people.
I think it's overdue an overhaul much like the sysadmin->devops transition.
We have differing experiences, which shouldn't be surprising. My example explicitly referred to someone who was a good engineer who enjoyed the QA role.
This might have been an Apple/MS thing, but we always had very technical QA people on the dev tools team. For example, the QA lead for the C++ compiler had written their own compiler from scratch and was an amazing contributor.
In the Windows team (back before the test org was decimated) I saw the described "class divide". Anybody who was good enough would switch from SDET to SDE [disclaimer: obviously there were some isolated exceptions]. The test team produced reams of crappy test frameworks, each of which seemed like a "proving project" for its creators to show they could be competent SDEs. After the Great Decimation my dev team took ownership of many such frameworks and it was a total boondoggle; we wasted years trying (and mostly failing) to sort through the crappy test code.
This was all unfortunate, and I agree in principle with having a separate test org, but in Windows the culture unfortunately seemed to be built around testers as second-class software developers.
I spent most of my time working on Visual Studio (in the Boston time frame) so we got to interact with pretty much every team. I absolutely hated interacting with the Windows team. Everything was a fight for no reason.
As I said above, everyone has their own experiences but the QA folks I worked with at MS were fantastic.
Not sure if you're aware, but Dave Plummer now has a really good YT channel [0] where he talks about MS back in those days. It's a fun walk down memory lane.
> Really? The best people I worked with were never QA.
> Moreover, the best QAs would almost always try to be not QA - to shift into a better respected and better paid field.
That sort of seems circular. If they're not respected or paid well, of course most of the talented people would not want to remain in QA, and eventually you'd just have mediocre QA. That doesn't really give you any insight into whether high quality QA would be useful though.
(edit: I see now that's basically the point you're trying to make, so I guess we're in agreement)
std::future doesn't give you a state machine. You get the building blocks you have to assemble into one manually. Coroutines give you the same building blocks but let the compiler do the assembly, making the suspension points visible in the source while hiding the mechanical boilerplate.
This is why coroutine-based frameworks (e.g., C++20 coroutines with cppcoro) have largely superseded future-chaining for async state machine work — the generated code is often equivalent, but the source code is dramatically cleaner and closer to the synchronous equivalent.
(me: ex-Visual Studio dev who worked extensively on our C++ coroutine implementation)
It doesn't seem like a clear win to me. The only "assembly" required with std::future is creating the associated promise and using it to signal when that async step is done, and the upside is a nice readable linear flow, as well as ease of integration (just create a thread to run the state machine function if you want multiple in parallel).
With the coroutine approach using yield, doesn't that mean the caller needs to decide when to call it again? The std::future approach, by contrast, is event-driven: the promise being set signals that the state/step has completed.
You are describing a single async step, not a state machine. "Create a promise, set it when done", that's one state. A real async state machine has N states with transitions, branching, error handling, and cleanup between them.
> "The only 'assembly' required is creating the associated promise"
Again, that is only true for one step. For a state machine with N states you need explicit state enums or a long chain of .then() continuations. You also need to manage the shared state across continuations (normally on the heap), handle manual error propagation across each boundary, and deal with cancellation tokens.
You only get a "nice readable linear flow" using std::future when 1) using a blocking .get() on a thread, or 2) .then() chaining, which isn't "nice" by any means.
Lastly, you seem to be conflating a co_yield (generator, pull-based) with co_await (event-driven, push-based). With co_await, the coroutine is resumed by whoever completes the awaitable.
But what do I know... I only worked on implementing coroutines in cl.exe for 4 years. ;-)
I only mentioned co_yield() since that's what the article was (ab)using, although perhaps justifiably so. It seems the coroutine support was added to C++ in a very flexible way, but so low level as to be daunting/inconvenient to use. It needs to have more high level facilities (like Generators) built on top.
What I was thinking of as a state machine using std::future was a single-function state machine: use switch (state) to dispatch the state-specific async ops via std::future, wait for completion, then select the next state.
I don't even know how to respond to that. How in the world are you using C++ professionally if you think coroutines are "daunting"? No one uses C++ for its "convenience" factor. We use it for the power and control it affords.
> What I was thinking of as a state machine using std::future was a single-function state machine: use switch (state) to dispatch the state-specific async ops via std::future, wait for completion, then select the next state.
Uh huh. What about error propagation and all the other very real issues I mentioned that you are just ignoring? Why not just let the compiler do all the work the way it was spec'ed and implemented?
I get what you’re saying, but you kicked off this thread like an expert — even though you knew you were talking to someone who helped build the very thing you’re critiquing.
It’s pretty clear you’ve never built a production-grade async state machine.
C++ is designed to provide the plumbing, not the kitchen sink. It’s a language for building abstractions, not handing them to you — though in practice, there’s a rich ecosystem if you’d rather reuse than reinvent.
That flexibility comes at the cost of convenience, which is why most new engineers don’t start with C++.
What you call “intimidating,” I call powerful. If coroutines throw you off, you’re probably using the wrong language.
Last thought — when you run into someone who’s built the tools you rely on, ask them questions instead of trying to lecture them. I would have been more than happy to work through a pedagogical solution with you.
Uh huh. The person who gets confused by how co_await actually works and thinks that coroutines are "intimidating" wrote frameworks that I would have used to build our C++ compiler. Do you not understand that cl.exe doesn't use external frameworks? lmfao
Um... you might want to look at my profile. In addition to working at MS and Apple for two decades (where I touched everything from firmware to ring-0 and ring-3), I was on the team that created SoftICE [0]: the first commercial ring-0 debugger for Windows. I also created the automated deadlock detector for BoundsChecker [1], which requires an in-depth understanding of operating system internals.
> computer systems whose backend implementations you are blissfully ignorant of
I am extremely confident in my "backend" knowledge (of course, an actual systems engineer would never refer to their work as "backend").
You wrote a "C++ framework" that runs in the "backend" of a "computer system"? Do I have that right? Please let me know what it is so that I can decompile it and see how it was implemented!
I've been working on a utility that lets me "see through" app windows on macOS [1] (I was a dev on Apple's Xcode team and have a strong understanding of how to do this efficiently using private APIs).
I wondered how Claude Code would approach the problem. I fully expected it to do something most human engineers would do: brute-force with ScreenCaptureKit.
It almost instantly figured out that it didn't have to "see through" anything and (correctly) dismissed ScreenCaptureKit due to the performance overhead.
This obviously isn't a "frontier" type problem, but I was impressed that it came up with a novel solution.
Thanks! I've been doing a lot of work on a laptop screen (I normally work on an ultrawide) and got tired of constantly switching between windows to find the information I need.
I've also added the ability to create a picture-in-picture section of any application window, so you can move a window to the background while still seeing its important content.
Was it a novel solution for you or for everyone? Because that's a pretty big difference. A lot of stuff that's novel for me would be something someone had been doing for decades somewhere.
How confident are you that this knowledge was not part of the training data? Were there no Stack Overflow questions/replies with it, no tech forum posts, no private knowledge bases, etc.?
Not trying to diminish its results; just that one should always assume that LLMs have a rough memory of pretty much the whole of the internet/human knowledge. Google itself was very impressive back then in how it managed to dig out stuff that interested me (though it's no longer good at finding a single article with almost exact keywords...), and what makes LLMs especially great is that they combine that with some surface-level transformation to make that information fit the current, particular need.
Do you think AlphaGo is regurgitating human gameplay? No it’s not: it’s learning an optimal policy based on self-play. That is essentially what you’re seeing with agents. People have a very misguided understanding of the training process and the implications of RL in verifiable domains. That’s why coding agents will certainly reach superhuman performance. Straw/steel man depending on what you believe: “But they won’t be able to understand systems! But a good spec IS programming!” is also a bad take: agents absolutely can interact with humans, interpret vague desiderata, fill in the gaps, ask for direction. You are not going to need to write a spec the same way you need to today. It will be exactly like interacting with a very good programmer in EVERY sense of the word.
How does AlphaGo come into the picture? It works in a completely different way altogether.
I'm not saying that LLMs can't solve new-ish problems that aren't part of the training data, but they sure as hell didn't get some Apple-specific library call from divine revelation.
AlphaGo comes into the picture to explain that in fact coding agents in verifiable domains are absolutely trained in very similar ways.
It’s not magic: they can’t access information that’s not available. But they are not regurgitating or interpolating training data; that’s not what I’m saying. I’m saying there is a misconception, stemming from a limited understanding of how coding agents are trained, that they are somehow limited by what’s in the training data, or are poorly interpolating that space. This may be true for some domains but not for coding or mathematics. AlphaGo is the right mental model here: RL in verifiable domains means your gradient steps take you in directions that are not limited by the quality or content of the training data, which is used only because starting from scratch with RL is very inefficient. Human training data gives the models a more efficient starting point for RL.
Because you can't control what the content server is doing. SCK doesn't care if you only need a small section of a window: it performs multiple full window memory copies that aren't a problem for normal screen recorders... but for a utility like mine, the user needs to see the updated content in milliseconds.
Also, as I mentioned above, when using SCK, the user cannot minimize or maximize any "watched" window, which is, in most cases, a deal-breaker.
My solution runs at under 2% cpu utilization because I don't have to first receive the full window content. SCK was not designed for this use case at all.
It's been a while since I looked at this but I'm not entirely sure I agree with this. ScreenCaptureKit vends IOSurfaces which don't have copies besides the one that happens to fill the buffer during rendering. I'm not entirely sure what other options you have that are better besides maybe portal views.
I worked on the AVC team and built the original SCK Instruments plugin for performance monitoring. I'm assuming you aren't talking about ring-0 (which is where the performance hit occurs). That said, if you want your users to be able to minimize/maximize any "watched" window, ScreenCaptureKit is a non-starter. The OBS team has been asking Apple to remove that restriction for years.
Here's a more real-world scenario [0] where Seymour has to handle more than a single window. I can cycle through the entire z-order at 60 fps while capturing all of the content windows. In fact, Seymour can display the contents of minimized windows, which the content server doesn't even support natively. BTW, this quick demo was done using a debug build. The release build can run at < 4% cpu utilization with a dozen windows active and has full multi-monitor and multi-space support. Also, remember that SCK pays no attention to windows that are hidden; tracking them is something Seymour has to do constantly.
Here's something else you can't do with SCK: picture-in-picture windows [1] that can exist even when the source window is hidden. This is super helpful when watching builds or app logs on larger monitors. No more command+tabbing to babysit things.
Well, I'm not going to share either solution as this is actually a pretty useful utility that I plan on releasing, but the short answer is: 1) don't use ScreenCaptureKit, and 2) take advantage of what CGWindowListCreateImage() offers through the content server. This is a simple IPC mechanism that does not trigger all the SCK limitations (i.e., no multi-space or multi-desktop support). In fact, when using SCK, the user cannot even minimize the "watched" window.
Claude realized those issues right from the start.
One of the trickiest parts is tracking the window content while the window is moving - the content server doesn't, natively, provide that information.
No it didn't. Like I said... it may have gotten something that worked but there is no way Claude got it to work while supporting multi-spaces, multi-desktops, and using under 2% cpu utilization. My solution can display app window content even when those windows are minimized, which is not something the content server supports.
My point was that Claude realized all the SCK problems and came up with a solution that 99% of macOS devs wouldn't even know existed.
> it may have gotten something that worked but there is no way Claude got it to work while supporting multi-spaces, multi-desktops, and using under 2% cpu utilization.
Maybe, but that's the magic of LLMs - they can now one-shot or few-shot (N<10) you something good enough for a specific user. Like, not supporting multi-desktops is fine if one doesn't use them (and if that changes, few more prompts about this particular issue - now the user actually knows specifically what they need - should close the gap).
Do you believe my brief overview of the problem will help Claude identify the specific undocumented functions required for my solution? Is that how you think data gets fed back into models during training?
Yes. I don't think you appreciate just how much information your comments provide. You just told us (and Claude) what the interesting problems are, and confirmed both the existence of relevant undocumented functions, and that they are the right solution to those problems. What you didn't flag as interesting, and possible challenges you did not mention (such as these APIs being flaky, or restricted to Apple first-party use, or such) is even more telling.
Most hard problems are hard because of huge uncertainty around what's possible and how to get there. It's true for LLMs as much as it is for humans (and for the same reasons). Here, you gave solid answers to both, all but spelling out the solution.
ETA:
> Is that how you think data gets fed back into models during training?
No, one comment chain on a niche site is not enough.
It is, however, how the data gets fed into prompt, whether by user or autonomously (e.g. RAG).
> Yes. I don't think you appreciate just how much information your comments provide
Lol... no. You don't know how I solved the problem and you just read everything that Claude did.
Absolutely nothing in the key part of my solution uses a single public API (and there are thousands). And you think that Claude can just "figure that out" when my HN comments get fed back in during training?
I sincerely wish we'd see less /r/technology ridiculousness on HN.
I wonder how many 'ideas guys' will now think that with LLMs they can keep their precious to themselves while at the same bragging about them in online fora. Before they needed those pesky programmers negotiating for a slice of the pie, but this time it will be different.
Next up: copyright protection and/or patents on prompts. Mark my words.
I'm pretty sure a large fraction of the vibecoded stuff out there is from the "ideas guys." This time will be different because they'll find out very quickly whether their ideas are worth anything. The term "slop" substantially applies to the ideas themselves.
I don't think there will be copyright or patents on prompts per se, but I do think patents will become a lot more popular. With AI rewriting entire projects and products from scratch, copyright for software is meaningless, so patents are one of the very few moats left. Probably the only moat for the little guys.
Ex-Apple dev here. Just for clarity: you can't use local snapshot restore in macOS Recovery to downgrade to an entirely earlier macOS version (e.g., stable to beta or major version back). It rolls back data and user state within the same OS version due to Signed System Volume (SSV) protections.
I've seen people get pretty confused when trying to roll back this way.
Not sure I'd call myself a fan, but I was an engineer on the Xcode team for a decade. The answer to your question about coupling is "ease of testing and coherence".
Prior to Apple, I was a senior engineer on the dev tools team at Microsoft. We did the same exact thing wrt full-release testing and vendor hardware.
I'm not saying I agree with the way either company handles coupling, lock-in, etc. but if you don't think that the Windows UI is coupled to ring-0 you don't understand how it works.
> How do you know this place even exists without any information?
You want to find an antique book store in another state. How do you find it? You search the web. And what information bubbles to the top of the search results? Answer: businesses with websites.
If you are a business owner, you will lose customers without a website, because that is how most people will find you.
If I'm looking for a physical place I usually just look at Google maps. "Minneapolis antique bookstores." I'll look at pics, see if the vibe is cool, etc. Relying on Google SEO is a recipe for disaster in my experience because there's no guarantee that the bookstore is even in or near Minneapolis. Other people probably browse the web differently though.
I honestly would not expect an antique bookstore to have a website, unless they let you buy their books online.
> If I'm looking for a physical place I usually just look at Google maps.
Ah yes... I'm sure that is what 99% of people do. /s
You don't like reality... and that's fine. You do you. But most businesses do need a web presence if they want to be discovered by the majority of potential customers.
I'd literally bet my house that most people do a simple google search. No one goes to google maps as the first option when trying to find things unless they are in their car. (Well, except for you, of course)
In 2026, companies who want their customers to easily find them will have some type of web presence. I'm sorry that it is such a hardship for you.
I do a simple Google search, or whatever search happens to be the default on the browser I happen to be using. Then I click on Google Maps, or the platform's equivalent. I'm not gonna waste time on the sponsored results that may not even be near what I'm looking for.
This seems odd to me. I have never seen obfuscation techniques in first party Apple software - certainly not in Espresso or ANECompiler and overall nowhere at all except in media DRM components (FairPlay).
Apple are really the major OS company _without_ widespread use of a first party obfuscator; Microsoft have WarBird and Google have PairIP.
> Apple are really the major OS company _without_ widespread use of a first party obfuscator
You might want to look into techniques like control-flow flattening, mixed boolean–arithmetic transformations, opaque predicates, and dead code injection — Apple uses all of these. The absence of a publicly named obfuscator doesn’t mean Apple doesn’t apply these methods (at least during my time there).
Ever wonder why Apple stopped shipping system frameworks as individual .dylib files? Here’s a hint: early extraction tools couldn’t preserve selector information when pulling libraries from the shared cache, which made the resulting decompiled pseudocode unreadable.
I'm very familiar with CFG flattening and other obfuscation techniques, thanks.
That's interesting; I suppose I must not have touched the parts of the platform that use them, and I've touched a fair amount of the platform.
Again, I _have_ seen plenty of obfuscation techniques in DRM/FairPlay, but otherwise I have not, and again, I am entirely sure the ANE toolchain from CoreML down through Espresso and into AppleNeuralEngine.framework definitely does not employ anything I would call an obfuscation technique.
> Ever wonder why Apple stopped shipping system frameworks as individual .dylib files?
If the dyld cache was supposed to be an obfuscation tool, shipping the tools for it as open source was certainly... a choice. Also, the reason early tools couldn't preserve selector information was selector uniqueing, which was an obvious and dramatic performance improvement and explained fairly openly, for example - http://www.sealiesoftware.com/blog/archive/2009/09/01/objc_e... . If it was intended to be an obfuscation tool, again it was sort of a baffling one, and I just don't think this is true - everything about the dyld cache looks like a performance optimization and nothing about it looks like an obfuscator.
I’m still relatively new to HN, but I continue to find it fascinating when people share their perspectives on how things work internally. Before joining Apple, I was a senior engineer on the Visual Studio team at Microsoft, and it's amazing how often I bump into people who hold very strong yet incorrect assumptions about how systems are built and maintained.
> I suppose I must not have touched the parts of the platform that use them
It’s understandable not to have direct exposure to every component, given that a complete macOS build and its associated applications encompass tens of millions of lines of code. /s
That said, there’s an important distinction between making systems challenging for casual hackers to analyze and the much harder (if not impossible) goal of preventing skilled researchers from discovering how something works.
> Also, the reason early tools couldn't preserve selector information was selector uniqueing
That isn't even remotely how we were making things difficult back then.
I led the SGX team at Intel for a while, working on in-memory, homomorphic encryption. In that case, the encryption couldn’t be broken through software because the keys were physically fused into the CPU. Yet, a company in China ultimately managed to extract the keys by using lasers to remove layers of the CPU die until they could read the fuses directly.
I’ll wrap up by noting that Apple invests extraordinary effort into making the critical components exceptionally difficult to reverse-engineer. As with good obfuscation—much like good design or craftsmanship—the best work often goes unnoticed precisely because it’s done so well.
I'm done here - you go on believing whatever it is you believe...
I'm thoroughly enjoying this thread by the way, between someone who is clearly informed and educated in platform research, and pretty enthusiastic and interested in the field, and yourself - a deeply experienced engineer with truly novel contributions to the conversation that we don't often see.
Looking very forward to more of your insight/comments. Hopefully your NDA has expired on some topic that you can share in detail!
Thank you for your comment. I started this thread just as a simple "job well done" to the authors. I didn't expect to be told that my work doesn't exist. ;-)
No one ever notices plastic surgery when it is done well. The same can be true for obfuscation. But, as I indicated, no amount of obfuscation is foolproof when dealing with experienced, well-funded attackers. The best you can do is make their task annoying.
I was mostly joking. I am not from the US and not skilled enough to be worth the bother of creating a visa for when there are thousands of developers in the USA much more fit for this. But it is neat to see that the requirements are not as intense as I would've expected.
I've always felt a little odd saying, "Back in my day we had to understand the cpu, registers, etc." It's a true statement, but doesn't help in any way. Is that stuff still worth knowing, IMHO? Yes. Can you create incredibly useful code without that knowledge today? Absolutely.
There are some people who still know these things, and are able to use LLMs far more effectively than those who do not.
I've seen the following prediction by a few people and am starting to agree with it: software development (and possibly most knowledge work) will become like farming. A relatively smaller number of people will do with large machines what previously took armies of people. There will always be some people exploring the cutting edge of thought, and feeding their insights into the machine, just how I image there are biochemists and soil biology experts who produce knowledge to inform decisions made by the people running large farming operations.
I imagine this will lead to profound shifts in the world that we can hardly predict. If we don't blow ourselves up, perhaps space exploration and colonization will become possible.
I think it's more likely at this point that we turn the depleting quantities of exploitable resources on this planet into more and more data centers and squander any remaining opportunity at space exploration/colonization at scale.
> Can you create incredibly useful code without that knowledge today?
You could do that without that knowledge back in the day too; we've had languages higher-level than assembler forever.
It's just that the range of knowledge needed to maximize machine usage is far smaller now. Before, you had to know how to write a ton of optimizations; nowadays you have to know how to write your code so the compiler has an easy job optimizing it.
Before, you had to manage memory accesses yourself; nowadays, making sure you're not jumping across memory too much and being aware of how the cache works is enough.
Or more so - machines have gotten so fast, with so much disk and memory... that people can ship slopware filled with bloatware and the UX is almost as responsive as Windows 3.1 was.
I don't think it's odd. Sacrificing deep understanding, and delegating that responsibility to others is risky. In more concrete terms, if your livelihood depends on application development, you have concrete dependencies on platforms, frameworks, compilers, operating systems, and other abstractions that without which you might not be able to perform your job.
Fewer abstractions, deeper understanding, fewer dependencies on others. These concepts show up over and over and not just in software. It's about safety.