It's not my area of expertise, but I have seen other estimates suggesting that the median number of friends American adults report, especially men, is in the single digits.
So, regarding the "reasoning" models the article references: is it possible that their increased error rate vs. non-reasoning models is simply a function of the reasoning process introducing more tokens into context, and that because each such token may itself introduce wrong information, the risk of error compounds? Or, put more simply: generating more tokens at a fixed per-token error rate must, on average, produce more errors?
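The compounding intuition can be sketched with a toy model. Assuming (purely for illustration) that each generated token independently "goes wrong" with some fixed probability p, the chance that a chain of n tokens contains at least one error is 1 - (1 - p)^n, which grows quickly with chain length:

```python
def chain_error_prob(p: float, n: int) -> float:
    """Probability a chain of n tokens contains at least one error,
    under the (simplistic) assumption of independent per-token errors."""
    return 1 - (1 - p) ** n

# Even a 1% per-token error rate compounds fast as chains get longer.
for n in (10, 100, 1000):
    print(f"n={n}: P(>=1 error) = {chain_error_prob(0.01, n):.3f}")
```

Real models obviously violate the independence assumption, but the qualitative point stands: longer chains give more opportunities for a wrong turn.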
It's a symptom of asking the models to provide answers that are not exactly in the training set: the internal interpolation the models do probably runs into edge cases where, statistically, it goes down the wrong path.
This is exactly it: it's the result of RLVR (reinforcement learning with verifiable rewards), where we force the model to reason its way to an answer when that information isn't in its base training.