lsr0's comments

lsr0 · on July 5, 2020

These results aren't reproducible when both are compiled with -O3 (similar results with -O2):

  ojc_parse_str    1000000 entries in  709.300 msecs. ( 1409 iterations/msec)
  simdjson_parse   1000000 entries in  450.724 msecs. ( 2218 iterations/msec)

In those results simdjson is roughly 57% faster (2218 vs. 1409 iterations/msec).

Handing simdjson pre-padded input, so the parser doesn't have to copy the string into its own padded buffer on every call, gives us a further improvement:

  simdjson_parse   1000000 entries in  369.234 msecs. ( 2708 iterations/msec)

About 90% faster. Example changes:

   simd_parse(const char *str, int64_t iter) {
       simdjson::dom::parser parser;
       int64_t   dt;
  +    auto   padded = simdjson::padded_string(std::string_view(str));
       int64_t   start = clock_micro();
   
       for (int i = iter; 0 < i; i--) {
  -        simdjson::dom::element doc = parser.parse(str, strlen(str));
  +        simdjson::dom::element doc = parser.parse(padded);
       }
       dt = clock_micro() - start;

lsr0 · on Feb 2, 2016

That seems to be the case; I do wish they'd made this requirement clear up front. There is no getting around synchronised quiescence being a blocking event, but in this case they essentially hope that 1) you're already using a kind of scatter-gather thread model like the mentioned game example, which implies an iterative discrete world, 2) the set of threads (or tasks) interacting with the collection is simply bounded and synchronised, and/or 3) the performance is still better in aggregate even with infrequent world-blocking events.

tlipcon · on Feb 2, 2016

Typically QSBR algorithms don't require blocking the world, or even blocking any single thread. They just require each thread to periodically check in and run a bounded amount of code which amounts to "hey, I'm not currently looking at the map".

Some other background collector thread (which is going to actually delete removed objects) just has to wait until it sees every mutator thread cross a safepoint, at which point it knows that none of those threads could be hanging onto references that have been unlinked from the data structure.
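
A minimal sketch of the check-in scheme described above, assuming a fixed number of reader threads and a single thread that both retires and collects. `QsbrDomain`, `quiescent`, `retire`, and `collect` are hypothetical names made up for illustration, not the API of any real library, and the default sequentially consistent atomics are chosen for simplicity rather than speed:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical QSBR domain for a fixed set of reader threads.
// Assumption: retire() and collect() are only ever called from one thread.
class QsbrDomain {
public:
    explicit QsbrDomain(std::size_t num_threads) : seen_(num_threads) {
        for (auto &s : seen_) s.store(0);
    }

    // Reader thread `tid` calls this periodically at a point where it holds
    // no references into the shared structure ("I'm not looking at the map").
    // Bounded work: one load and one store, no waiting.
    void quiescent(std::size_t tid) {
        assert(tid < seen_.size());
        seen_[tid].store(epoch_.load());
    }

    // Called after an object has been unlinked from the structure.
    // The deleter is deferred until every reader has checked in.
    void retire(std::function<void()> deleter) {
        std::uint64_t e = epoch_.fetch_add(1) + 1;
        retired_.push_back({e, std::move(deleter)});
    }

    // Runs the deleters of all objects whose retirement epoch has been
    // observed by every reader. Returns how many were freed. Never waits:
    // objects that are not yet safe simply stay on the list.
    std::size_t collect() {
        std::uint64_t min_seen = epoch_.load();
        for (auto &s : seen_) min_seen = std::min(min_seen, s.load());

        std::size_t freed = 0;
        std::vector<Retired> still_pending;
        for (auto &r : retired_) {
            if (r.epoch <= min_seen) {
                r.deleter();
                ++freed;
            } else {
                still_pending.push_back(std::move(r));
            }
        }
        retired_.swap(still_pending);
        return freed;
    }

private:
    struct Retired {
        std::uint64_t epoch;
        std::function<void()> deleter;
    };

    std::atomic<std::uint64_t> epoch_{0};
    std::vector<std::atomic<std::uint64_t>> seen_;
    std::vector<Retired> retired_;
};
```

In this sketch the only thing that ever waits is the deferred deleter. A reader that stops calling `quiescent` stalls reclamation, so memory use grows, but no thread is blocked.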

I'd recommend reading some surveys of RCU and SMR algorithms if this stuff is interesting to you.

lsr0 · on July 12, 2014

I used to work in the same industry. We used Linux and GCC, so we could, and did, produce fully deterministic builds. Actually the output was fully deterministic disk images.

I did one iteration of the build system, mostly making it such that any host could build it deterministically. This was years ago, so it was just a chroot that started with a skeleton + GCC and procedurally built the things it needed to build the outputs. It was fairly straightforward: just an extremely short patch here and there, plus a 1000-line Xorg Makefile for staging Xorg builds. If I was doing it again I'd consider reusing a package manager, but each component's Makefile was pretty concise. My trusty sidekick was a script that xxd'd two files into pipes that it opened using vimdiff.

Build took an hour or so, however.

lsr0 · on April 7, 2014

Working on it. Thanks for your feedback.

(Disclaimer: Skype Engineer)

Nemcue · on April 7, 2014

Neat! Then perhaps you can provide some insight into why Skype ditched peer-to-peer in favour of centralisation?

stavros · on April 7, 2014

There was a pretty good writeup on that. Short answer: Mobile devices.

efraim · on April 8, 2014

Here it is: http://www.listbox.com/member/archive/247/2013/06/sort/time_...

agapos · on April 8, 2014

Intelligence Agencies also prefer Skype that way :D

lsr0 · on June 16, 2012

skype --dbpath=~/.otherskype