I think people who have trouble scaling any modern distributed data stack usually run into it because a) they don't have experts on the team, or b) they follow bad practices or stretch the tool past its intended use case. I once worked on a project where the ES cluster's performance kept degrading because the team kept increasing the number of fields. At some point they had more than 5k fields in a single document schema, even though the ES docs say that going over the default limit (1k) is not a good idea. I mean, if any of these big tech companies can manage clusters of hundreds of nodes for any of these data stacks, I'm sure your scaling issues aren't because of the tool.
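To make the field-count point concrete, here's a rough Python sketch of how you could sanity-check a mapping before it gets out of hand. It assumes you already have the index mapping as a plain dict (for example, pulled from the get-mapping API). The helper names and the sample mapping are made up for illustration, and the count only approximates how ES tallies fields against `index.mapping.total_fields.limit` (default 1000).

```python
# Hypothetical helpers (not part of any Elasticsearch client library).
DEFAULT_TOTAL_FIELDS_LIMIT = 1000  # default of index.mapping.total_fields.limit


def count_mapped_fields(mapping: dict) -> int:
    """Approximate the mapped field count: object fields, leaf fields, multi-fields."""
    total = 0
    for field_def in mapping.get("properties", {}).values():
        total += 1  # the field itself, whether object or leaf
        total += len(field_def.get("fields", {}))  # multi-fields such as .keyword
        total += count_mapped_fields(field_def)  # properties nested under an object
    return total


def exceeds_limit(mapping: dict, limit: int = DEFAULT_TOTAL_FIELDS_LIMIT) -> bool:
    """Return True when the approximate field count is above the limit."""
    return count_mapped_fields(mapping) > limit


# Made-up sample mapping, only for illustration.
example_mapping = {
    "properties": {
        "user": {
            "properties": {
                "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
                "age": {"type": "integer"},
            }
        },
        "message": {"type": "text"},
    }
}

print(count_mapped_fields(example_mapping), exceeds_limit(example_mapping))
```

Running something like this in CI against your mappings is one cheap way to notice a schema creeping toward the limit long before the cluster starts to suffer.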