>...the body of the article doesn’t describe a panel of physicians making predictions at all. The headline says “AI fares better than doctors,” but the text says the model outperformed “risk scores currently relied upon by doctors,” i.e., standard scoring tools clinicians use—not the judgments of the surgeons on the case or an outside panel.
This should generally be expected. An ML model does a certain amount of fitting, so the end results are probably better than fitting done by a human. The trade-off, to my understanding, is that you might not understand the algorithm it ends up using.
You need to ask whether you prefer a better black box or a weaker white box that you can understand and reason about. For many tasks a black box is fine. For this one, I wonder which I would prefer...
I get the feeling that this is one of those things where you s/AI/statistics/g. Doctors using a predictive statistical model trained on thousands of patients' worth of data faring better than doctors using the seat of their pants makes total sense.
I actually think ML models would excel here. Humans are famously bad at estimating and weighing risks and there's really only so much data a single human brain can store and draw conclusions from. Not to mention bias like female patients being chronically under-diagnosed by male doctors.
If you fed a mountain of surgery outcome data into an ML model, I imagine it'd be shockingly effective and (hopefully) less biased on sex and race.
It'd probably be helpful for initial diagnosis, but I'm less confident in that. Postop risk assessment is mostly straight statistics, and statistical inference is what ML models do. Diagnosis is a bit more subjective and complex, though it is in the same general domain.
The real trick is going to be conditioning doctors not to blindly trust the risk assessment model, though I would hope that it'd be accurate enough for that anyway.
The article mentions a previously existing risk scoring system, which was presumably already trying to deal with the problem of humans not being great at evaluating the risk.
"Fares Better" sounds unscientific and very much like clickbait.
In cases where the numbers suggest that the average treated person "Fares better" than barely over 50% of the control group, or when effects are inconsistent, readers may not interpret the effects as profound.
Providing real numbers that are easily understandable, rather than evocative descriptions, allows readers to form their own conclusions about the results.
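To put a number on "fares better than X% of the control group": a rough sketch, assuming both groups are roughly normal with equal variance and that the effect is reported as a standardized mean difference (Cohen's d). Under those assumptions the share of controls below the average treated person is the normal CDF evaluated at d. The function name is made up for illustration, and none of these d values come from the article.

```python
from statistics import NormalDist


def share_of_controls_below_average_treated(d: float) -> float:
    """Fraction of the control group that the average treated person outperforms.

    Assumes normal outcomes with equal variance in both groups, with d being
    the standardized mean difference (treated minus control).
    """
    return NormalDist().cdf(d)


if __name__ == "__main__":
    # Illustrative effect sizes only, not figures from the study.
    for d in (0.0, 0.05, 0.2, 0.5, 0.8):
        share = share_of_controls_below_average_treated(d)
        print(f"d = {d:.2f}: average treated person beats {share:.1%} of controls")
```

A small d lands just above 50%, which is the "barely over 50%" case where "fares better" is technically true but not very impressive.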
Human doctors have a tendency to underestimate their own complication rate, often because they are too delusional about their own capabilities. I've heard the same doctor say "this has never happened to me in my 20 years of doing surgery" twice, when a complication occurs during a surgical procedure.
1) No, machine learning performs better than typical "risk scores" such as the RCRI (it was not tested against doctors' clinical judgement).
2) Even so... so what? What we don't have is any reliable way to reduce surgical complications when the risk is elevated but the benefit still outweighs it.
For 2), I guess that if you know you will die with high probability, you will search for an alternative treatment, if one is possible at all (which might have side effects, but at least you're alive)?
It doesn't really work like that for the most part
If you actually need a really high risk surgery, you probably have a terrible prognosis without it
For instance, in the pivotal trial of transcatheter aortic valve replacement (TAVR) for aortic stenosis, the patients were deemed too high risk for surgery, so they got either nothing (well, medicine only, which doesn't really change anything for this condition) or TAVR. The medicine arm had 50% mortality (at 1 year, I think?) whereas the TAVR arm was "only" 30%!
Now that didn't mean all of those 30% of deaths were due to the procedure or even the aortic stenosis. I think that ran 10% or so (going off memory here). They just had so many other problems. For comparison, TAVR is now done in low-risk people, and I think the 1 year mortality is <3%.
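For what those recalled numbers would imply, here is a back-of-the-envelope sketch. It takes the 50% and 30% figures above at face value (they are from memory, not checked against the trial) and computes the usual absolute risk reduction, relative risk reduction, and number needed to treat. The function names are just for illustration.

```python
def absolute_risk_reduction(control_risk: float, treated_risk: float) -> float:
    """Difference in event rates between control and treated arms."""
    return control_risk - treated_risk


def relative_risk_reduction(control_risk: float, treated_risk: float) -> float:
    """Absolute risk reduction as a fraction of the control arm's risk."""
    return absolute_risk_reduction(control_risk, treated_risk) / control_risk


def number_needed_to_treat(control_risk: float, treated_risk: float) -> float:
    """Patients treated per one event avoided (1 / absolute risk reduction)."""
    arr = absolute_risk_reduction(control_risk, treated_risk)
    if arr <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined")
    return 1.0 / arr


if __name__ == "__main__":
    # Mortality figures as recalled in the comment above, not verified.
    medicine_only, tavr = 0.50, 0.30
    print(f"ARR: {absolute_risk_reduction(medicine_only, tavr):.0%}")
    print(f"RRR: {relative_risk_reduction(medicine_only, tavr):.0%}")
    print(f"NNT: {number_needed_to_treat(medicine_only, tavr):.1f}")
```

With those inputs that is a 20 percentage point absolute reduction, or roughly one death avoided per five patients treated, even though nearly a third of the treated arm still died.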
The things that go into making someone "high risk" in the STS (cardiac surgery) risk score are for the most part pretty obvious: your heart muscle is super weak (or you need a machine to keep going before surgery), you have kidney failure, prior strokes, combined heart problems, bad liver or lung disease, etc. You can calculate a score, but you can probably guess it from the door of the room.
The headline definitely evokes such an idea, but the detail in the article simply shows the machine learning system better augmenting the doctors' work
The strange thing is that this potentially life-saving tech will only collect dust, because AI in medicine is only good for papers and not for real-world usage. See all the other AI medicine advancements: same pattern. Medicine has a problem of not being willing to use modern tech to save lives.