This. Statistics is a very young field. The entire idea it rests on[1] is highly unintuitive, so it's no wonder it takes time.
----
[1]: Exchangeability, the fact that you can make useful progress in analysing a case by ignoring most of the information about that specific case and instead lumping it into a reference class of "similar" cases, where "similar" ultimately rests on subjective judgment except in a few special cases.
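For concreteness, the standard textbook formalisation of that footnote (just the usual definition, nothing specific to this thread): a sequence of random variables is exchangeable when its joint distribution is invariant under relabelling,

    (X_1, \ldots, X_n) \stackrel{d}{=} (X_{\pi(1)}, \ldots, X_{\pi(n)})
        \quad \text{for every permutation } \pi \text{ of } \{1, \ldots, n\},

which is exactly the "the individual labels carry no information the analysis will use" idea.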
I definitely agree that statistics is a young field, but I would not agree with your definition. It is naive, and statistics is much more than that. As HN readers who know me will attest, I frequently rant about how low-order approximations can lead to inaccurate results, or even results that point in the wrong direction. Statistics actually allows us to see this! There are many famous paradoxes that depend on exactly this. For simplicity's sake I'll reference Simpson's[0] and Berkson's[1], which happen because of inappropriate aggregation (inappropriate aggregation is one of the most common errors in statistical modeling, and not always easy to notice).
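To make the Simpson's case concrete, here is a minimal sketch with illustrative counts shaped like the classic kidney-stone example (the numbers are only for illustration): every subgroup favours treatment A, yet the naive aggregate favours B, because A was given mostly the hard cases.

    # Illustrative (recovered, total) counts per (treatment, severity) cell.
    cells = {
        ("A", "mild"):   (81, 87),
        ("A", "severe"): (192, 263),
        ("B", "mild"):   (234, 270),
        ("B", "severe"): (55, 80),
    }

    # Within every severity group, A has the higher recovery rate...
    for severity in ("mild", "severe"):
        a_rec, a_tot = cells[("A", severity)]
        b_rec, b_tot = cells[("B", severity)]
        print(severity, f"A={a_rec / a_tot:.2f}", f"B={b_rec / b_tot:.2f}")

    # ...but aggregated over severity the direction flips and B looks better,
    # because A was assigned far more of the severe cases.
    for treatment in ("A", "B"):
        rec = sum(cells[(treatment, s)][0] for s in ("mild", "severe"))
        tot = sum(cells[(treatment, s)][1] for s in ("mild", "severe"))
        print(treatment, f"overall={rec / tot:.2f}")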
Rather, statistics is about modeling things for which we have imperfect information and/or that are probabilistic in nature (yes, there are real probabilities). The point isn't to ignore information (we actually want as much as we can get, which is why we're in the information age and why statistics has exploded with innovation) but to recognize that, through observation and sampling, we can extract patterns that still allow us to make predictions without all the information. For example, statistics is commonly used in thermodynamics because there is imperfect information[2], in that we can't measure where each individual particle is at a given point in time. Instead we can aggregate information about the particles and then make aggregated predictions about future states. It is important to note that the predictions themselves are also probabilistic.
[2]: Thermodynamics also involves true probabilities, but I think the case is still motivating. We can also use statistics on purely deterministic processes with perfect information, but doing so is frequently computationally inefficient.
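A toy sketch of the aggregation point (everything here is made up, with an arbitrary velocity distribution rather than a physically calibrated one): we never observe the full microstate, but a modest random sample lets us estimate an aggregate quantity, and the estimate comes with a probability statement attached rather than a certainty.

    import random
    import statistics

    random.seed(0)

    # Hidden "microstate": one velocity per particle, which we cannot fully observe.
    n_particles = 1_000_000
    velocities = [random.gauss(0.0, 1.0) for _ in range(n_particles)]

    # What we can actually measure: a small random sample of particles.
    sample = random.sample(velocities, 1_000)
    kinetic = [0.5 * v * v for v in sample]   # per-particle kinetic energy, unit mass

    mean_ke = statistics.mean(kinetic)
    std_err = statistics.stdev(kinetic) / len(kinetic) ** 0.5

    # The aggregate prediction is itself probabilistic: an approximate 95%
    # interval for the population mean, not a claim about any single particle.
    print(f"estimated mean kinetic energy: {mean_ke:.3f} +/- {1.96 * std_err:.3f}")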
I agree with everything you say! But I would still argue that the things you describe (prediction of the partially known based on limited information) are built on probability theory (sort of applying it in reverse), which in turn is based on assigning cases to reference classes and treating them as exchangeable in the ways that matter for the analysis at hand.
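The formal bridge I have in mind here (my gloss, not anything you said) is de Finetti's representation theorem: an infinite exchangeable sequence of 0/1 outcomes behaves exactly as if it were i.i.d. draws with an unknown success probability that we average over,

    P(X_1 = x_1, \ldots, X_n = x_n)
        = \int_0^1 \theta^{\sum_i x_i} (1 - \theta)^{\,n - \sum_i x_i} \, d\mu(\theta),

so "lump into a reference class and treat as exchangeable" and "model with probabilities over an unknown parameter" end up being two descriptions of the same move.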