Hacker News

The reason Arc hurts is that it's global and atomic. Multiple cores competing to inc/dec a shared reference count is pretty much a worst case scenario for modern CPUs.

Under modern garbage collection, there's very little need to coordinate across cores, just some barriers every now and then to mark things safe.



True, but the impact depends on both the total core count the process is using and what you're doing with the Arc. I'm imagining Arc<RequestState> with low concurrency on each RequestState, so the impact of cache-line bouncing should be negligible.

If you're talking about Arc<GlobalStuff>, I'd probably use Box::leak instead. There are also a few crates that do epoch GC. I haven't really written super high request rate multicore stuff in Rust, but I have in C++, and there, for global-ish occasionally-updated config stuff, I used an epoch GC implementation on top of Linux rseq to totally eliminate the effect you're describing.
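To make the Box::leak idea concrete, here's a minimal sketch; `Config` and `max_conns` are made-up stand-ins, not anything from the thread:

```rust
struct Config {
    max_conns: usize,
}

// Leak the allocation at startup to get a &'static Config.
// The "leak" is intentional: global config lives for the whole
// process anyway, so nothing is lost by never freeing it.
fn leak_config() -> &'static Config {
    Box::leak(Box::new(Config { max_conns: 1024 }))
}

fn main() {
    let config = leak_config();

    // A &'static reference is Copy: handing it to each thread is free,
    // with no atomic inc/dec and no shared refcount cache line to bounce.
    let handles: Vec<_> = (0..4)
        .map(|_| std::thread::spawn(move || config.max_conns))
        .collect();

    for h in handles {
        assert_eq!(h.join().unwrap(), 1024);
    }
}
```

The tradeoff is that the config can never be freed or replaced, which is exactly why this only fits truly global, set-once state.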


The classic case where Arc hurts and GC really shines is Arc<Config>.

Arc turns otherwise read-only activity into writes.
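Concretely: every Arc::clone is an atomic increment on the shared refcount, and every drop is an atomic decrement, even when the caller only ever reads the payload. A minimal illustration (`Config` and its field are made-up names):

```rust
use std::sync::Arc;

struct Config {
    feature_on: bool, // stand-in field
}

fn main() {
    let config = Arc::new(Config { feature_on: true });

    // This "read" of the config is actually a write: Arc::clone does an
    // atomic increment on the shared refcount, and the later drop does
    // an atomic decrement. Under contention, that refcount's cache line
    // ping-pongs between cores.
    let per_request = Arc::clone(&config);
    assert_eq!(Arc::strong_count(&config), 2);
    assert!(per_request.feature_on);

    drop(per_request); // atomic decrement
    assert_eq!(Arc::strong_count(&config), 1);
}
```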


How many independent things do you need to access the config? A dozen? Then there are a dozen calls to increment the refcount at startup, but no later cost to read the object behind the Arc. So you may read your config a million times and the Arc changes nothing. Meanwhile, a GC will have to scan the Config structure every time a full GC runs, so there is an added cost.


The C++ server I mentioned had an "experiment config" that would be used to roll out changes (user-visible features, backend code path changes for data store migrations, etc.) incrementally, and it picked up config changes without a restart. Each request needed to grab the current experiment config once and hold onto it for a consistent view. This server reserved ~16 cores and had pretty high request rates, so Arc<Config> would indeed hit the sort of problem yencabulator is describing. And I imagine it'd get pretty bad if each server crossed NUMA node boundaries (although I recommend avoiding that if you can regardless).

In this case, the Linux rseq-based epoch GC worked perfectly. It is technically a form of garbage collection, but it's not all-or-nothing like Java-style or Boehm GC; it's just a library you use for a few heavily used, infrequently updated data structures.

btw, Arc<Config> doesn't really seem relevant to the discussion of scoped concurrency. Scoped concurrency can often replace or reduce the need for Arc<RequestState> but not Arc<Config>.


I'm not really understanding why you need to clone the Arc so many times. At most it seems that you'd do so once per request.


"So many times" = one increment + one decrement per request = 100k/sec maybe, bouncing cache lines across 16 cores. This is suboptimal but not world-ending.


Definitely suboptimal, no question.


In the trouble scenario, config may change at any time. A getter function has to return a freshly-incremented Arc<Config>, which lives for roughly one request.
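A minimal sketch of that getter, using RwLock<Arc<Config>> as a stand-in for however the current config is actually published (crates like arc-swap exist for exactly this; plain std is used here just to show the shape):

```rust
use std::sync::{Arc, RwLock};

struct Config {
    version: u64, // stand-in field
}

struct ConfigHolder {
    current: RwLock<Arc<Config>>,
}

impl ConfigHolder {
    fn new(initial: Config) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    // Called once per request: returns a freshly-incremented Arc that
    // gives the request a consistent snapshot for its whole lifetime.
    fn get(&self) -> Arc<Config> {
        Arc::clone(&self.current.read().unwrap())
    }

    // Called rarely, e.g. when a new config is rolled out.
    fn update(&self, next: Config) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}

fn main() {
    let holder = ConfigHolder::new(Config { version: 1 });
    let snapshot = holder.get();              // request grabs its view
    holder.update(Config { version: 2 });     // rollout happens mid-request
    assert_eq!(snapshot.version, 1);          // request keeps its snapshot
    assert_eq!(holder.get().version, 2);      // new requests see the update
}
```

This is where the inc/dec per request comes from: each get() bumps the refcount, and dropping the snapshot at the end of the request decrements it.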



