Hacker News

The reason Arc hurts is that it's global and atomic. Multiple cores competing to inc/dec a shared reference count is pretty much a worst case scenario for modern CPUs.

Under modern garbage collection, there's very little need to coordinate across cores, just some barriers every now and then to mark things safe.



True, but the impact depends on both the total core count the process is using and what you're doing with the Arc. I'm imagining Arc<RequestState> with low concurrency on each RequestState, so the impact of cache-line bouncing should be negligible.

If you're talking about Arc<GlobalStuff>, I'd probably use Box::leak instead. There are also a few crates that do epoch GC. I haven't really written super high request rate multicore stuff in Rust, but I have in C++, and there, for global-ish occasionally-updated config stuff, I used an epoch GC implementation on top of Linux rseq to totally eliminate the effect you're describing.
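To make the Box::leak idea concrete, here's a minimal sketch; `Config` and `max_conns` are made-up stand-ins, not anything from the thread:

```rust
struct Config {
    max_conns: usize,
}

// Leak the allocation at startup to get a &'static Config.
// The "leak" is intentional: global config lives for the whole
// process anyway, so nothing is lost by never freeing it.
fn leak_config() -> &'static Config {
    Box::leak(Box::new(Config { max_conns: 1024 }))
}

fn main() {
    let config = leak_config();

    // A &'static reference is Copy: handing it to each thread is free,
    // with no atomic inc/dec and no shared refcount cache line to bounce.
    let handles: Vec<_> = (0..4)
        .map(|_| std::thread::spawn(move || config.max_conns))
        .collect();

    for h in handles {
        assert_eq!(h.join().unwrap(), 1024);
    }
}
```

The tradeoff is that the config can never be freed or replaced, which is exactly why this only fits truly global, set-once state.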


The classic case where Arc hurts and GC really shines is Arc<Config>.

Arc turns otherwise read-only activity into writes.
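Concretely: every Arc::clone is an atomic increment on the shared refcount, and every drop is an atomic decrement, even when the caller only ever reads the payload. A minimal illustration (`Config` and its field are made-up names):

```rust
use std::sync::Arc;

struct Config {
    feature_on: bool, // stand-in field
}

fn main() {
    let config = Arc::new(Config { feature_on: true });

    // This "read" of the config is actually a write: Arc::clone does an
    // atomic increment on the shared refcount, and the later drop does
    // an atomic decrement. Under contention, that refcount's cache line
    // ping-pongs between cores.
    let per_request = Arc::clone(&config);
    assert_eq!(Arc::strong_count(&config), 2);
    assert!(per_request.feature_on);

    drop(per_request); // atomic decrement
    assert_eq!(Arc::strong_count(&config), 1);
}
```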


How many independent things do you need to access the config? A dozen? Then there are a dozen calls to increment the refcount at startup, but no later cost to read the object behind the Arc. So you may read your config a million times and the Arc changes nothing. Meanwhile, a GC will have to scan the Config structure every time a full GC runs, so there is an added cost.


The C++ server I mentioned had an "experiment config" that would be used to roll out changes (user-visible features, backend code path changes for data store migrations, etc.) incrementally, and it picked up config changes without a restart. Each request needed to grab the current experiment config once and hold onto it for a consistent view. This server reserved ~16 cores and had pretty high request rates, so Arc<Config> would indeed hit the sort of problem yencabulator is describing. And I imagine it'd get pretty bad if each server crossed NUMA node boundaries (although I recommend avoiding that if you can regardless).

In this case, the Linux rseq-based epoch GC worked perfectly. It is technically a form of garbage collection, but it's not all-or-nothing like Java-style or Boehm GC; it's just a library you use for a few heavily used, infrequently updated data structures.

btw, Arc<Config> doesn't really seem relevant to the discussion of scoped concurrency. Scoped concurrency can often replace or reduce the need for Arc<RequestState> but not Arc<Config>.


I'm not really understanding why you need to clone the Arc so many times. At most it seems that you'd do so once per request.


"So many times" = one increment + one decrement per request = 100k/sec maybe, bouncing cache lines across 16 cores. This is suboptimal but not world-ending.


Definitely suboptimal, no question.


In the trouble scenario, config may change at any time. A getter function has to return a freshly-incremented Arc<Config>, which lives for roughly one request.
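A minimal sketch of that getter, using RwLock<Arc<Config>> as a stand-in for however the current config is actually published (crates like arc-swap exist for exactly this; plain std is used here just to show the shape):

```rust
use std::sync::{Arc, RwLock};

struct Config {
    version: u64, // stand-in field
}

struct ConfigHolder {
    current: RwLock<Arc<Config>>,
}

impl ConfigHolder {
    fn new(initial: Config) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    // Called once per request: returns a freshly-incremented Arc that
    // gives the request a consistent snapshot for its whole lifetime.
    fn get(&self) -> Arc<Config> {
        Arc::clone(&self.current.read().unwrap())
    }

    // Called rarely, e.g. when a new config is rolled out.
    fn update(&self, next: Config) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}

fn main() {
    let holder = ConfigHolder::new(Config { version: 1 });
    let snapshot = holder.get();              // request grabs its view
    holder.update(Config { version: 2 });     // rollout happens mid-request
    assert_eq!(snapshot.version, 1);          // request keeps its snapshot
    assert_eq!(holder.get().version, 2);      // new requests see the update
}
```

This is where the inc/dec per request comes from: each get() bumps the refcount, and dropping the snapshot at the end of the request decrements it.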



