This seems kind of crazy. If LLMs are so stunningly good at finding vulnerabilities in code, then shouldn't the solution be to run an LLM against your code after you commit, and before you release it? Then you basically have pentesting harnesses all to yourself before going public. If an LLM can't find any flaws, then you are good to release that code.
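If you did wire that into a pre-release gate, a minimal sketch could look like the following. Everything here is illustrative: `scan_with_llm` is a hypothetical call, stubbed with a trivial pattern check, where a real gate would call an actual model API.

```python
from pathlib import Path

def scan_with_llm(source: str) -> list[str]:
    """Hypothetical LLM review call, stubbed with a trivial pattern check.

    A real gate would send the source to a model and parse its findings.
    """
    findings = []
    if "eval(" in source:
        findings.append("possible code injection via eval()")
    return findings

def release_gate(repo_root: str) -> bool:
    """Scan every Python file in the tree; return True only if no file
    produces a finding, i.e. the release is allowed to go out."""
    clean = True
    for path in Path(repo_root).rglob("*.py"):
        for finding in scan_with_llm(path.read_text(errors="ignore")):
            print(f"{path}: {finding}")
            clean = False
    return clean
```

The sketch only shows the shape of the gate; the open question raised downthread is whether one pass of such a scan tells you anything about what an attacker's many passes will find.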

A few years ago, I invoked Linus's Law in a classroom, and I was roundly debunked. Isn't it a shame that it's basically been fulfilled now with LLMs?

https://en.wikipedia.org/wiki/Linus%27s_law



After a release, attackers have effectively infinite time to throw an LLM against every line of your code - an LLM that only gets smarter and cheaper to run as time passes. In order to feel secure you’d need to do all the work you’d imagine an attacker would ever do, for every single release you ship.


> attackers have effectively infinite time

No, attackers are also rational economic actors. They don't attack software at random just for the aesthetic beauty of the process. They attack for bounty, for fame, for national interest, etc. Whatever the reason, it's not random, and thus they DO have a budget, both in time and money. They attack THIS project rather than another project because it's interesting to them. If it's not, they might move on to another project, but they certainly won't spend infinite time, precisely because they don't have infinite resources. IMHO it's much more interesting to consider the realistic arms race than theoretical scenarios that never take place.


The amount of time they will invest is proportional to how widely used / how high-value the target is. If your release is used by no one, then no one is going to attack it, but then it didn't matter anyway.


The first few times it's going to be expensive, but once everyone has level-set with intense scans of their codebases, "every single release" is actually not that big a deal, since you are not likely to be completely rebuilding your codebase every release.


You still have to account for the non-deterministic behavior of an LLM, when do you know you have exhausted its possible outcomes for any given piece of code?
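One practical answer is a saturation heuristic: run the scan repeatedly, union the findings, and stop once several consecutive runs add nothing new. A toy sketch (the scanner is a stub with an invented latent-flaw list; a real one would be a nondeterministic model call), which also shows why this never proves exhaustion:

```python
import random

def llm_scan_once(rng: random.Random) -> set[str]:
    """Stub for a nondeterministic LLM scan: each latent flaw is
    surfaced with only some probability per run."""
    latent = ["sqli-in-search", "path-traversal-upload", "xss-in-profile"]
    return {v for v in latent if rng.random() < 0.5}

def scan_until_saturated(max_quiet_runs: int = 5, seed: int = 0) -> set[str]:
    """Union findings across runs; stop after max_quiet_runs runs in a
    row add nothing new. A stopping heuristic, not a proof that the
    model's possible outputs are exhausted."""
    rng = random.Random(seed)
    found: set[str] = set()
    quiet = 0
    while quiet < max_quiet_runs:
        new = llm_scan_once(rng) - found
        if new:
            found |= new
            quiet = 0
        else:
            quiet += 1
    return found
```

The "quiet runs" threshold is exactly the non-determinism problem in miniature: however large you make it, a rarer finding can still be out there.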


I'm not sure. An innocuous one-line change like "bump version" can pull in a million new lines of code.


This assumes that the relationship between "LLM tokens spent" and "vulnerabilities found" doesn't plateau, though.


But so do you and all your users. What's your point?


No? I can't go and retroactively fix a bug in a version my users are already running; I need to release a new version.


As LLMs improve and adoption grows, maintaining a FOSS project is becoming more complex and more expensive in terms of time and manpower. That part is easy to understand.

It has also become a trend that LLM-assisted users generate more low-quality issues, dubious security reports, and noisy PRs, to the point where keeping the whole stack open source no longer feels worth it. Even if the real reason is monetization rather than security, I can still understand the decision.

I suspect we will see more of this from commercial products built around a FOSS core. The other failure mode is that maintainers stop treating security disclosures as something special and just handle them like ordinary bugs, as with libxml2. In that sense, Chromium moving toward a Rust-based XML library is also an interesting development.


Just use AI to fight AI; that's the only sensible way we can keep up. So if you're getting low-quality PRs, reports, etc., have LLMs filter them out. Once upon a time we used to drown in email spam, but it's now mostly a non-issue thanks to intelligent spam filters; the same needs to happen for open source projects. Use AI to fight AI.
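As a sketch of what such a filter's first pass could look like, here is a toy triage function. The scoring signals and labels are invented for illustration; a real filter would ask an LLM whether the report contains a reproducible PoC rather than score keywords.

```python
def triage_report(title: str, body: str) -> str:
    """Hypothetical first-pass triage of an incoming security report,
    scoring a few cheap signals the way early spam filters did."""
    score = 0
    text = body.lower()
    if len(body) < 200:
        score += 1   # too short to contain a real reproduction
    if "proof of concept" not in text and "poc" not in text:
        score += 1   # no PoC mentioned at all
    if any(w in text for w in ("urgent", "bounty", "immediately")):
        score += 1   # pressure language common in report slop
    return "needs-human-review" if score <= 1 else "auto-triaged-low"
```

Like spam filtering, the point is not to auto-reject anything outright but to rank the queue so maintainer attention goes to reports that at least look reproducible.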


In other words, have more money to pay than your enemy.

This game will end horribly.


LLMs really are stunningly good at finding vulnerabilities in code, which is why, with closed-source code, you can and probably will use them to make your code as secure as possible.

But you won't keep the doors open for others to use them against it.

So it is, unfortunately, understandable in a way...


I'm not a security expert, but can't closed-source applications be vulnerable and exploited too? I feel like using closed source as a defense just gives you a false sense of security.


Finding a vulnerability in a black box is drastically different from finding one in a white box. This isn’t about whether there is a vulnerability or not, but about the likelihood of it being found.


No it isn't. There is a tooling gap, and there is a skill gap, but both of those are being rapidly closed by both open and closed source projects.

LLMs, and tools built to use them, are violating a lot of assumptions these days.


It's a meaningful difference for SaaS. Most likely an attacker doesn't have access to your running binary let alone source code, and if they probe it like a pentester would it will be noisy and blocked/flagged by your WAF.


What gets labeled as obscurity here is one of the legitimate approaches to security, as long as you are able to keep the code secret. Your passwords and security keys are just random strings; the fact that they are hidden from everyone is what provides the security.


Decompile it and you are back to the level of security you started with. OpenSSH is open for a good reason. Please acknowledge your error. Are you AI?


How do you decompile a SaaS? They're a SaaS.

OTOH, their position seems to be that "many LLMs make all bugs shallow" is unhelpful, the same way "many eyes make all bugs shallow" was considered unhelpful.

What the open source economy genuinely seems to need, both to surface these latent vulns and to tamp down finding-slop, is a new https://bughook.github.com/your/repo/ that the big LLMs (Mythos, etc.) support. Mythos understands when it's been used to find a vuln, and a backend auto-reports verified findings that the git service can feed to a Dependabot-type tool.

Even better, price Mythos to cover running a background verifier that fetches the project and revalidates the issue before hitting that bughook.

Meanwhile, train it on these findings, so its future self doesn't create them.


Delaying attacks is a form of valid security.


You don't need the source; the LLM has the source. It's called the binary.


LLMs, like humans, can find vulnerabilities in black boxes. We already established 30 years ago that open source is usually more secure than closed source and that security by obscurity doesn't work.


I suspect that AI is a convenient excuse to go closed source. They have probably wanted to do that for years after leaning more commercial.


Every change introduces the possibility of a vulnerability being added to the system, and one would need to run the LLM scan across the entire code base. That gets very costly in an environment where you are making regular commits. Companies like GitHub already provide static-analysis scanning tools, and the cost is already high for them.
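One way to keep that cost bounded is to scan only the diff: re-run the expensive LLM pass over just the files touched since the last fully scanned revision. A minimal sketch using git (the scan step itself is left out; whether diff-scoped scanning catches cross-file vulnerabilities is exactly the caveat):

```python
import subprocess

def changed_files(repo: str, since_rev: str) -> list[str]:
    """List files touched between since_rev (the last fully scanned
    revision) and HEAD, so a per-commit LLM pass only pays for the
    diff rather than the whole tree."""
    out = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", since_rev, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]
```

The counterargument from elsewhere in the thread still applies: a one-line dependency bump shows up as a tiny diff while actually changing a huge amount of code, so diff scoping has to follow lockfile changes too.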


Might lead to a move away from continuous delivery back towards batched releases.


That’s a non-trivial cost for open source projects, which are commonly severely underfunded.


Cal.com is not a severely underfunded project, it raised around $32M of VC money.


It's not a "project" though; the business Cal.com Inc raised that VC money. Their open source repo did not raise the money.

Did they ever promise to keep their codebase FOSS forever, in a way that differs from what they're already doing over at cal.diy? If not, I don't see why it would be reasonable to expect them to spend a huge amount of money re-scanning on every single commit/deploy in order to keep their non-"DIY" product open source.


Attackers only need LLMs to be good at randomly finding one vulnerability, whereas service providers need them to be good at finding all such vulnerabilities.


Write simple code. Do what you said, which is a very good idea. Test LLM security against the compiler too.


It's entirely possible to address all the LLM-found issues and get an "all green" response, and have an attacker still find issues that your LLM did not. Either they used a different model, a different prompt, or spent more money than you did.

It's not a symmetric game, either. On defense, you have to get lucky every time - the attacker only has to get lucky once.
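The asymmetry can be made concrete with a toy probability model. Assume (this is an invented simplification, not a real measurement) that every independent scan finds a given latent vuln with probability p:

```python
def p_attacker_wins(p: float, defender_runs: int, attacker_runs: int) -> float:
    """For a single latent vuln, the probability that all the defender's
    pre-release scans missed it AND the attacker's scans surface it.

    Toy model: every scan is independent and finds the vuln with
    probability p."""
    defender_misses = (1 - p) ** defender_runs
    attacker_finds = 1 - (1 - p) ** attacker_runs
    return defender_misses * attacker_finds
```

With p = 0.1, 20 defender runs, and 200 attacker runs, the defender misses with probability 0.9^20 (roughly 0.12) while the attacker's find probability is nearly 1, so even a diligent defender leaves about a 12% window per vuln, and the attacker's budget advantage compounds over every release.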


> It's not a symmetric game, either. On defense, you have to get lucky every time - the attacker only has to get lucky once.

This! I love OSS but this argument seems to get overlooked in most of the comments here.


I mean, you should definitely have _some_ level of audit by LLMs before you ship, as part of the general PR process.

But you might need thousands of sessions to uncover some vulnerabilities, and you don't want to stop shipping changes because the security checks take hours to run.



