Hacker News | boris's comments

GCC (libstdc++), like all other major C++ standard library implementations (libc++, MSVC STL), implements the small object optimization for std::function, where a small enough callable is stored directly in std::function's own storage instead of on the heap. Across these implementations, you can rely on being able to capture two pointers without a dynamic allocation.
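
A minimal sketch of how one might verify this on a given toolchain (the allocation counter is purely illustrative):

    #include <cstdio>
    #include <cstdlib>
    #include <functional>
    #include <new>

    // Count global heap allocations so we can see whether the callable
    // ends up inline in the std::function or on the heap.
    static int allocations = 0;

    void* operator new(std::size_t n) {
        ++allocations;
        if (void* p = std::malloc(n)) return p;
        throw std::bad_alloc();
    }
    void operator delete(void* p) noexcept { std::free(p); }
    void operator delete(void* p, std::size_t) noexcept { std::free(p); }

    int main() {
        int a = 1, b = 2;
        int *pa = &a, *pb = &b;

        allocations = 0;
        std::function<int()> f = [pa, pb] { return *pa + *pb; }; // captures two pointers
        std::printf("heap allocations: %d\n", allocations);      // 0 if the SOO kicked in
        return f();
    }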


You would think so, but it actually doesn't. Last time I checked, libstdc++ could only optimize std::bind closures. A trivial test with a stateless lambda shows this is still the case in GCC 14 and 15. In fact, I can't even seem to trigger the library optimization with bind.

Unlike GCC 14, though, GCC 15 itself does seem to be able to optimize away the allocation (and the whole std::function) in trivial cases, independently of what the library does.
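
For what it's worth, a trivial case of the kind presumably meant here; whether the allocation actually disappears depends on the compiler and flags (GCC 15 reportedly can do it, per the above):

    #include <functional>

    // The std::function never escapes, so the optimizer is free to inline the
    // call, drop the allocation, or fold the whole thing to a constant.
    int doubled() {
        std::function<int(int)> f = [](int x) { return x * 2; }; // stateless lambda
        return f(21);
    }

    int main() { return doubled(); }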


The strangest thing about Flow is that its compiler is implemented in C#. So if you decide to use it in your C++ codebase, you now have a C#/.NET dependency, at least at build time.


It’s also funny because it’s a small, incomplete, incompatible subset of C++… It seems like a perfect LLVM/Clang rewriter case too: it would be easy to convert and be pure C++. Hell, even a Clang plugin to put the compile step into one process wouldn’t be awful. But looking at the rewrites, I wonder if there isn’t a terribly janky way to not need a compiler at all, at some runtime cost for contextual control-flow info.


Not even that: this should more or less be directly translatable to C++20 coroutines.
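
As a rough sketch of the shape that would take (the Task type below is a made-up minimal coroutine wrapper, not Flow's or anyone's actual API):

    #include <coroutine>
    #include <iostream>

    // Minimal fire-and-forget coroutine type, just enough to show the shape a
    // Flow-style ACTOR would take as a plain C++20 coroutine.
    struct Task {
        struct promise_type {
            Task get_return_object() { return {}; }
            std::suspend_never initial_suspend() { return {}; }
            std::suspend_never final_suspend() noexcept { return {}; }
            void return_void() {}
            void unhandled_exception() {}
        };
    };

    Task actorLike() {
        std::cout << "before suspend\n";
        co_await std::suspend_never{};  // a real actor would co_await a future here
        std::cout << "after resume\n";
    }

    int main() { actorLike(); }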


Also of course years older than them.


I wonder why that decision was made. I know why I, a C# developer, would make that decision, but why Apple?


The original developers (before Apple bought the company) used Visual Studio on Windows.


This entire codebase was acquired by Apple in a state of substantial completion, and relatively little has changed since then.


Someone knew C# and was good at parsers, would be my guess. It could have just as easily been Scala or something else.


I believe the point is if something is UB, like NULL pointer dereference, then the compiler can assume it can't happen and eliminate some other code paths based on that. And that, in turn, could be exploitable.
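
A textbook illustration (hypothetical code, not taken from any real vulnerability):

    // Because dereferencing a null pointer is UB, after the first line the
    // compiler may assume p != nullptr and delete the check below entirely,
    // turning an intended guard into dead code.
    int read_value(int *p) {
        int v = *p;           // UB if p == nullptr
        if (p == nullptr)     // may be optimized away
            return -1;
        return v;
    }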


Yes, that part was clear. The certainty of a vulnerability is worse than the possibility of a vulnerability, and most UB does not in fact produce vulnerabilities.


Most UB results in miscompilation of the intended code, by definition. Whether or not those miscompilations produce vulnerabilities is really hard to say, given the difficulty of finding them: you’d have to read the machine code carefully to spot the issue, and in C/C++ that could be basically anywhere in the codebase.

You stated explicitly that it isn’t, but the compiler optimizing away null pointer checks, or otherwise exploiting accidental UB, literally is a thing that has come up several times in known security vulnerabilities. Its probability of occurrence is lower than just crashing in your experience, but that doesn’t necessarily mean it’s not exploitable either - it could just mean it takes a more targeted attack to exploit, and thus your Bayesian prior for exploitability is incorrectly trained.


> by definition

But not in reality. For example, a signed overflow is most likely (but not always) compiled in a way that wraps, which is expected. A null pointer dereference is most likely (but not always) compiled in a way that segfaults, which is expected. A slightly less usual thing is that a loop is turned into an infinite one or an overflow check is elided. An extremely unusual and unexpected thing is that signed overflow directly causes your x64 program to crash. A thing that never happens is that demons fly out of your nose.

You can say "that's not expected because by definition you can't expect anything from undefined behaviour" but then you're merely playing a semantic game. You're also wrong, because I do expect that. You're also wrong, because undefined behaviour is still defined to not shoot demons out of your nose - that is a common misconception.

Undefined behaviour means the language specification makes no promises, but there are still other layers involved, which can make relevant promises. For example, my computer manufacturer promised not to put demon-nose hardware in my computer, therefore the compiler simply can't do that. And the x64 architecture does not trap on overflow, and while a compiler could add overflow traps, compiler writers are lazy like the rest of us and usually don't. And Linux forbids mapping the zero page.
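
For instance, the elided-overflow-check case mentioned above typically looks like this (a sketch; whether a given compiler actually folds it depends on flags):

    #include <climits>

    // Signed overflow is UB, so the optimizer may assume x + 1 > x always
    // holds for int and fold this function to 'return false'.
    bool next_would_overflow(int x) {
        return x + 1 < x;     // intended check; UB when x == INT_MAX
    }

    // The same check written without relying on overflow:
    bool next_would_overflow_portable(int x) {
        return x == INT_MAX;
    }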


What is the filesystem story in OpenBSD? Anything CoW/snapshot'able on the horizon?


As I understand it, it's supported by muxfs.

https://hackmd.io/EkYP__XaQRebEZDokSQNjA


What would be SQLite's equivalent to indexing starting from 1, not 0? Off the top of my head I can't think of anything that would go so much against the grain.


For me it's case insensitive LIKE.


column types are more like guidelines than rules
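
Both quirks are easy to see from the C API; a small sketch (assumes linking against sqlite3):

    #include <sqlite3.h>
    #include <cstdio>

    int main() {
        sqlite3 *db = nullptr;
        sqlite3_open(":memory:", &db);

        // 1. LIKE is case-insensitive for ASCII by default: this prints 1.
        sqlite3_stmt *stmt = nullptr;
        sqlite3_prepare_v2(db, "SELECT 'abc' LIKE 'ABC'", -1, &stmt, nullptr);
        sqlite3_step(stmt);
        std::printf("'abc' LIKE 'ABC' -> %d\n", sqlite3_column_int(stmt, 0));
        sqlite3_finalize(stmt);

        // 2. Declared column types are affinities, not constraints: a string
        //    goes into an INTEGER column without error (unless STRICT is used).
        sqlite3_exec(db,
                     "CREATE TABLE t (n INTEGER);"
                     "INSERT INTO t VALUES ('not a number');",
                     nullptr, nullptr, nullptr);
        sqlite3_prepare_v2(db, "SELECT typeof(n) FROM t", -1, &stmt, nullptr);
        sqlite3_step(stmt);
        std::printf("typeof(n) = %s\n",
                    reinterpret_cast<const char *>(sqlite3_column_text(stmt, 0)));
        sqlite3_finalize(stmt);

        sqlite3_close(db);
    }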


> We have been optimized very hard by evolution to be good at running, so there shouldn’t be any “easy” technologies that would make us dramatically faster or more efficient.

I wonder if these new shoes have the same effect on natural (i.e., non-paved) surfaces? Plus, they all look quite high off the ground (probably all those plates and foam need space), and that doesn't help with stability when running over rocks, etc.


Yes. The top marathon racing shoes are optimized for road running & hard surfaces like asphalt. Definitely not good for trails. They are indeed very tall (though there are limits for official competitions). Excellent lateral stability is essentially a non-goal, so they are not a good choice for volleyball or tennis either. So yeah, we run in a very different world than the one where our ancestors evolved...


There is a parallel with database transactions: it's great if you can do everything in a single database/transaction (atomic monorepo commit). But that only scales so far (on both dimensions: single database and single transaction). You can try distributed transactions (multiple coordinated commits) but that also has limits. The next step is eventual consistency, which would be equivalent to releasing a new version of the component while preserving the old one and with dependents eventually migrating to it at their own pace.


This is why, in microservice architectures, you try to have data stores encapsulated within each service, so the APIs can support some kind of backward compatibility path as you roll out changes to clients that interact with the service (presuming the DB migration has public API implications).


Doesn't that rely on the code being able to work in both states?

I mean, to use a different metaphor, an incremental rollout is all fine and dandy until the old code discovers that it cannot work with the state generated by the new code.


Yes, but depending on the code you’re working on that may be the case anyway even with a monorepo.

For example a web api that talks to a database but is deployed with more than one instance that will get rolling updates to the new version to avoid any downtime. There will be overlapping requests to both old and new code at the same time.

Or if you want to do a trial deployment of the new version to 10% of traffic for some period of time.

Or if it’s a mobile or desktop installed app that talks to a server where you have to handle people using the previous version well after you’ve rolled out an update.


Yes, it does.


Yes, I've seen this one in our logs. Quite obnoxious, but at least it identifies itself as a bot and, at least in our case (cgit host), does not generate much traffic. The bulk of our traffic comes from bots that pretend to be real browsers and that use a large number of IP addresses (mostly from Brazil and Asia in our case).

I've been playing cat and mouse trying to block them for the past week and here are a couple of observations/ideas, in case this is helpful to someone:

* As mentioned above, the bulk of the traffic comes from a large number of IPs, each issuing only a few requests a day, and they pretend to be real UAs.

* Most of them don't bother sending the referrer URL, but not all (some bots from Huawei Cloud do, but they currently don't generate much traffic).

* The first thing I tried was to throttle bandwidth for URLs that contain id= (which on a cgit instance generate the bulk of the bot traffic). So I set the bandwidth to 1Kb/s and thought surely most of the bots would not be willing to wait 10-20s to download the page. Surprise: they didn't care. They just waited and kept coming back.

* BTW, they also used keep-alive connections if offered. So another thing I did was disable keep-alive for the /cgit/ locations; without that, enough bots would routinely hog all the available connections.

* My current solution is to deny requests for all URLs containing id= unless they also contain a `notbot` parameter in the query string (which I suggest legitimate users add, via the custom error message served with the 403). I also currently only do this if the referrer is not present, but I may have to change that if the bots adapt. Overall, this helped with the load and freed up connections for legitimate users, but the bots didn't go away. They still send requests, get a 403, and keep coming back.

My conclusion from this experience is that you really only have two options: either do something ad hoc, very specific to your site (like the notbot query parameter), that whoever runs the bots won't bother adapting to, or employ someone with enough resources (like Cloudflare) to fight them for you. Using some "standard" solution (rate limiting, Anubis, etc.) is not going to work -- they have enough resources to eat up the cost and/or adapt.


Pick an obscure UA substring like MSIE 3.0 or HP-UX. Preemptively 403 these user agents (you'll create your own list). Later in the week you can circle back and distill these 403s down to problematic ASNs. Whack moles as necessary.


I've tracked bots that were stuck in a loop no legitimate user would ever get stuck in (basically by circularly following links long past the point of any results). I also decided to filter out what were bots for sure, and it was over a million unique IPs.


I (of course) use the djbwares descendant of Bernstein publicfile. I added a static GEMINI UCSPI-SSL tool to it a while back. One of the ideas that I took from the GEMINI specification and then applied to Bernstein's HTTP server was the prohibition on fragments in request URLs (which the Bernstein original allowed), which I extended to a prohibition on query parameters as well (which the Bernstein original also allowed) in both GEMINI and HTTP.

* https://geminiprotocol.net/docs/protocol-specification.gmi#r...

The reasoning for disallowing them in GEMINI applies pretty much as well to static HTTP service (which is what publicfile provides) as it does to static GEMINI service. Moreover, they did not actually work in Bernstein publicfile unless a site administrator went to extraordinary lengths to create multiple oddly named files (non-trivial to handle from a shell on a Unix or Linux-based system, because of the metacharacter) with every possible combination of query parameters, all naming the same file.

* https://jdebp.uk/Softwares/djbwares/guide/publicfile-securit...

* https://jdebp.uk/Softwares/djbwares/guide/commands/httpd.xml

* https://jdebp.uk/Softwares/djbwares/guide/commands/geminid.x...

Before I introduced this, attempted (and doomed to fail) exploits against weak CGI and PHP scripts were a large fraction of all of the file not found errors that httpd had been logging. These things were getting as far as hitting the filesystem and doing namei lookups. After I introduced this, they are rejected earlier in the transaction, without hitting the filesystem, when the requested URL is decomposed into its constituent parts.

Bernstein publicfile is rather late to this party, as there are over 2 decades of books on the subject of static sites versus dynamic sites (although in fairness it does pre-date all of them). But I can report that the wisdom when it comes to queries holds up even today, in 2025, and if anything a stronger position can be taken on them now.

To those running static sites, I recommend taking this good idea from GEMINI and applying it to query parameters as well.

Unless you are brave enough to actually attempt to provide query parameter support with static site tooling. (-:


The main reason you would attach a database and then jump through hoops like qualifying tables is to have transactions cover all the attached databases. If you don't need that, then you can just open separate connections to each database without needing to jump through any hoops. So the fact that WAL does not provide that is a big drawback.
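
To make that concrete, this is roughly what attaching buys you (a sketch using the SQLite C API; the file and table names are made up):

    #include <sqlite3.h>

    int main() {
        sqlite3 *db = nullptr;
        sqlite3_open("main.db", &db);
        sqlite3_exec(db, "ATTACH DATABASE 'other.db' AS other",
                     nullptr, nullptr, nullptr);

        // One transaction spanning both files. In rollback-journal mode this
        // commits (or rolls back) atomically across both databases; in WAL
        // mode the commit is only atomic within each database separately.
        sqlite3_exec(db,
                     "BEGIN;"
                     "CREATE TABLE IF NOT EXISTS main.a (x);"
                     "CREATE TABLE IF NOT EXISTS other.b (y);"
                     "INSERT INTO main.a VALUES (1);"
                     "INSERT INTO other.b VALUES (2);"
                     "COMMIT;",
                     nullptr, nullptr, nullptr);

        sqlite3_close(db);
    }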


> And because Electric syncs every change granularly, you are certain that the state of your local database is exactly the same as the server's.

I don't see how this certainty follows from "granularity" (whatever that means in this context). I believe that to have such certainty, one would need the synchronization to happen within a single transaction that spans both the client and server databases.


Correct - granular syncing alone doesn't guarantee consistency; you'd need either a distributed transaction protocol or a conflict resolution strategy with eventual consistency semantics.


I would say there is no certainty with eventual consistency, only hope.

