Ah, I'll write some stuff in a thread that nobody is going to read....
Survivorship issues are not intuitive. Let me try to give you a cooked-up case to make it clearer, but you'll need to actually think through the math and be willing to work through some Bayesian stats. The problem is that it takes considerable System 2 thinking, and most people simply don't want to think this hard.
So, here is our cooked-up case, which I think you'll see is applicable.
In a trial of 110 participants testing a medication intended to prevent depression, ten individuals drop out for reasons that are not recorded, and among the 100 who remain, ten are known to have become depressed. Under the assumption that dropouts have the same depression rate as completers (10 percent), we would attribute one depressed case to the ten dropouts (10 percent of 10) and nine non-depressed cases, yielding a total of eleven depressed participants out of 110.
This is what the authors of the study assumed, and they stated as much. So we have a depression rate of 11 out of 110.
Let's call this "Case A."
In contrast, under the worst-case assumption that every dropout did so because they became depressed, all ten dropouts would be counted as depressed in addition to the ten known depressed completers, for a total of twenty depressed participants out of 110.
Under this assumption, we have 20 out of the 110 suffering from depression.
Let's call this "Case B."
The problem is that you need outcomes for the entire trial population, dropouts included, to understand whether the anti-depression drug is effective, and you have just run that population through a survivorship screen. To take this a step further, you'll recognize the equal-rates assumption as a failure to apply Bayesian probability.
When you assume that dropouts have the same depression rate as completers, you effectively treat the probability that a participant who leaves the study is depressed—P(depressed | dropout)—as equal to the overall observed rate P(depressed) among those who stay.
In Bayesian terms, you ignore any information the fact of dropping out might carry. Formally, you set P(depressed | dropout) = P(depressed) = 10/100 = 10%, even though the act of dropping out could itself be evidence of depression (or of other factors).
A proper Bayesian approach would start with a prior probability for depression, then update it with the likelihood that depressed individuals drop out at a different rate than non-depressed ones. By assuming equal rates, you sidestep that update step—and any differential dropout risk—thereby discarding potentially informative evidence and underestimating the true depression rate if depressed participants are more prone to leave.
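To make the missing update step concrete, here is a small sketch. The prior is the completer rate from the example, but the dropout likelihoods are made-up values purely for illustration; the point is only that any differential dropout risk pulls P(depressed | dropout) away from the rate you assumed:

```python
# Bayes' theorem applied to the dropout question.
# The likelihood values below are illustrative assumptions, not data.

def p_depressed_given_dropout(prior, p_drop_if_depressed, p_drop_if_not):
    """P(depressed | dropout) via Bayes' theorem."""
    p_dropout = p_drop_if_depressed * prior + p_drop_if_not * (1 - prior)
    return p_drop_if_depressed * prior / p_dropout

prior = 0.10  # depression rate observed among completers

# Equal-rates assumption: dropping out carries no information at all
print(p_depressed_given_dropout(prior, 0.10, 0.10))   # 0.10

# If depressed participants are, say, three times as likely to drop out
print(p_depressed_given_dropout(prior, 0.30, 0.10))   # 0.25
```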
Of course, we have to layer this on top of the possibility that your antidepressant drug is not equally effective in the population that dropped out and in the one that stayed in. However, I don't think that is too much of a stretch. If it turns out that people drop out because they are depressed, or more depressed, while people who stay in are less depressed, we are clearly dealing with two different population samples. Our problem is not that we cannot formulate a case for pooling the two; the issue is that confirmation bias is at play here, which makes it extremely hard to drive through to valid results.
Now, intuitively you're probably thinking "but can it really make that much difference?"
I like to quote the famous green/blue cab problem from Kahneman and Tversky as an example.
Imagine a city where 85 percent of cabs are green and 15 percent are blue. A hit-and-run occurs at night, and a witness, who identifies cab colors correctly 80 percent of the time, says the fleeing cab was blue. Many people intuitively conclude there's an 80 percent chance it was blue. In reality, a Bayesian calculation shows the probability is only about 41 percent. Despite the witness's reliability, the overwhelming predominance of green cabs (the base rate) means a "blue" sighting is still more likely to be a misidentified green cab than a correctly identified blue one.
Ignoring that base rate leads us to dramatically overestimate the chance that the cab was blue.
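The cab arithmetic is short enough to check directly; here is the same calculation with the exact numbers from the problem:

```python
# Bayes' theorem with the cab problem's numbers.
p_blue, p_green = 0.15, 0.85        # base rates of each cab color
p_correct = 0.80                    # witness accuracy for either color

# P(witness says "blue") = correct blue calls + mistaken green calls
p_says_blue = p_correct * p_blue + (1 - p_correct) * p_green   # 0.12 + 0.17 = 0.29

# P(cab is blue | witness says "blue")
print(f"{p_correct * p_blue / p_says_blue:.1%}")               # 41.4%
```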
Wikipedia has a nice page on this called "Base rate fallacy." Also, "Why Most Published Research Findings Are False" by John Ioannidis is pretty mind-blowing.