Perhaps what has been missed is the difference between a single coin flip and a combination of coin flips?
Consider one startup.
f(x) = #fail
succeeds better than 50%, and
f(x) = #succeed
succeeds less than 50%. This is due to the nature of startups.
Sure, it's easy to get about 50% accuracy for one startup by flipping a coin:
f(x) = if rand(1) > 0.5 then #fail else #succeed
But consider the case of two companies A and B. There are now four outcomes:
A = #fail, B = #fail
A = #succeed, B = #fail
A = #fail, B = #succeed
A = #succeed, B = #succeed
If we flip a coin, we have to flip it twice. Our probability that two coin flips match the correct tuple is 25%, and bumping that up to 50% is a massive improvement.
Investors diversify their portfolios. In a portfolio of 100 startups there's probably a winner. Improving the selection of companies means reducing the number of portfolio companies a fund needs for a reasonable probability of a winner. More small yet successful funds make capital more efficient.
Better pruning of boolean search spaces has real value. Hence:
When predicting that a company will fail,
he adds, they’re right 88 percent of the time.
That's... not accurate. There are two success conditions (Fail, Fail) and (Succeed, Succeed), and two failure conditions (Succeed, Fail), (Fail, Succeed). If you flip two coins, the chance that they come up both heads is 25%, but the chance that they come up the same is 50%.
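The two-coin arithmetic above can be checked by enumerating the four equally likely outcomes. This is only a sketch: it assumes two fair, independent flips, and reading one flip as the guess and the other as the actual outcome is my interpretation, not something stated in the article.

```python
from itertools import product

# All equally likely results of two fair, independent coin flips.
outcomes = list(product(["H", "T"], repeat=2))

# Chance that both flips come up heads (one specific tuple).
p_both_heads = sum(1 for o in outcomes if o == ("H", "H")) / len(outcomes)

# Chance that the two flips agree, e.g. a guess matching an outcome.
p_same = sum(1 for a, b in outcomes if a == b) / len(outcomes)

print(p_both_heads, p_same)  # prints: 0.25 0.5
```

Matching one specific pair is 25%; the two flips merely agreeing is 50%.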
The odds of (#fail, #fail) for two startups (A, B) are much greater than 50%. When investing in two companies (#fail, #fail) is not #success.
At a 1% #success probability per company, the (#fail, #fail) rate is ~98%. The 1% #success rate is based on a knowledgeable person choosing A and independently choosing B. If that person can obtain information that lets them improve their selections to a 2% #success probability, they can reduce the total number of investments necessary to achieve any particular expected return on investment.
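A rough sketch of that arithmetic, assuming every investment succeeds independently with the same probability. The 50% target for "at least one winner" is a number I picked for illustration, and the function names are hypothetical.

```python
import math

def p_both_fail(p_success):
    """Probability that two independent picks both fail."""
    return (1 - p_success) ** 2

def investments_needed(p_success, target):
    """Smallest n with 1 - (1 - p_success)**n >= target.

    Assumes independent outcomes with identical success probability.
    """
    return math.ceil(math.log(1 - target) / math.log(1 - p_success))

print(round(p_both_fail(0.01), 4))    # (#fail, #fail) rate at 1% success
print(investments_needed(0.01, 0.5))  # portfolio size at 1% success
print(investments_needed(0.02, 0.5))  # portfolio size at 2% success
```

Under these assumptions, doubling the per-company success rate roughly halves the portfolio size needed for the same chance of a winner.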
Reducing the number of investments may improve the investor's ability to influence the outcome of each company in their portfolio, because the investor can allocate more time, energy, and resources to each one [presuming the investor brings business expertise to the table].
The article is claiming that the guy can predict which things are going to succeed and which are going to fail, not make them succeed or fail. His evidence for this is that 50% of the time, he's right (e.g. [success, success], [fail, fail] are both success conditions for him). For him to be adding any information to the system, he has to get it right more often than either choosing randomly or using a fixed zero-information strategy (always bet fail/always bet succeed). You explicitly said that the joint probability matters, but then miscalculated the joint probability of him guessing correctly by random chance.
I wrote this below, but several things are clear here:
- This isn't a quote and should be taken with a grain of salt. Oversimplification, poor wording, and basic misunderstanding on the part of the author are at fault.
- We don't know what the model's outputs are. If they are simply SUCCEED / FAIL, then yes, 50% correct is not very helpful (unless of course it is right more than 50% of the time on big winners). If the outputs are more granular (likelihood of success, expected ROI, etc.), then being "right" means a lot less and, to the extent that it does mean something, being right 50% of the time is much more helpful.
Imagine being right 50% of the time guessing about getting through airport security. If your guesses are "WILL" or "WON'T", then 50% is terrible. If your guesses are like "through in 23 min 53 sec" then 50% is incredible. If your guesses are like "70% chance of being through in 15-20 minutes", what does "right" mean?
To be fair, "wrong" is not describing everything. If it has a lot of false negatives (says "fail" when startup succeeds), but very few false positives, that would really be worth something.
Not really. What the article says is that the model would correctly predict, 50% of the time, whether a company will fail or not. That doesn't make sense, because 50% accuracy on a binary prediction (i.e. fail or not) is worth exactly nothing.
So maybe it's just bad or confusing wording in the article, the guy actually meant to say something else.
Yeah I'm going with confusing wording, I think he meant what I said above. Not to mention he could be referring to eliminating false positives, which is slightly different to finding true positives versus true negatives.
I think it's easier to relate to a coin flip if we use an "unfair coin".
90% of the time the coin flip returns tails (aka fail).
10% of the time it returns heads (aka win).
For a given coin flip, their algorithm can predict the results 50% of the time. At this point I don't remember the calculations off the top of my head, but it involves a Binomial distribution.
I would be incredibly interested to know how the model was cross-validated. It is utterly trivial to cherry-pick features post-facto to correctly predict winners; however, people who are unfamiliar with machine learning might not know this.
"He admits the models will never be perfect, but thinks that even a model that’s only right about 50 percent of the time could help investors and entrepreneurs avoid particularly bad ideas that, to the untrained eye, look like excellent opportunities."
He basically admits his model is no better than a monkey
Not necessarily, as the distribution of startup returns is highly skewed and you don't know the correlation of his model prediction vs relative return.
Ie, if his model consistently predicts success of those companies deep in the right tail (ie, your Facebooks and Twitters), but is less correct about just 'mildly' successful startups, the algorithm will greatly outperform a coin flip.
Eh, this isn't a statement about his own models, and it's obviously a gross oversimplification anyway. You should focus on some of the earlier statements, which give an indication that his models aren't just a "GOOD/BAD" classification. As a silly example, if my model which predicts age based on a photo was right 50% of the time, that's pretty good because there are more than two ages. If it predicted birthday and was right 50% of the time, that would be incredible.
Furthermore, the models should probably be described as forecasts and not predictions, and as such can't be right or wrong. Which is mainly just to emphasize that the statement is an oversimplification.
A coinflip is always 50% accurate, no matter how improbable the event you're trying to predict. If I flip a coin to predict whether Cthulhu will rise tomorrow, I have a 50% chance of getting it right.
I'm not sure having better predictors of what businesses are good is necessarily a good idea. Part of the attraction of Silicon Valley is that it takes some of the risk out of trying new things, even if they might be bad ideas. This culture of trying things leads us to find the occasional really good idea. If we sit around all day plugging our ideas into models to see if, statistically speaking, they will succeed, we won't find the really novel ideas that look bad but are actually good.
Retroactively fitting a model to prove history is not that difficult. Accurately predicting the future may be a bit trickier. Just ask technical stock traders.
Well it depends on proper cross-validation, sample size and of course predictability (i.e. 4 * P(Suc.|data) * P(Fail.|data)<<1). But online prediction is of course what will tell the long term reliability of any model.
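To make the predictability expression above concrete, here is a small sketch. The function name is hypothetical, and it assumes P(Fail.|data) = 1 - P(Suc.|data). The quantity is 1 when the data leaves the outcome a toss-up and approaches 0 as the data pins the outcome down.

```python
def outcome_uncertainty(p_success_given_data):
    """4 * P(success | data) * P(fail | data), assuming the two sum to 1."""
    p = p_success_given_data
    return 4 * p * (1 - p)

for p in (0.5, 0.9, 0.99):
    # Rounded for display only.
    print(p, round(outcome_uncertainty(p), 4))
```

A value far below 1 is the "<<1" regime in which the data says a lot about the outcome.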
He admits the models will never be perfect, but thinks
that even a model that’s only right about 50 percent of
the time could help investors and entrepreneurs avoid
particularly bad ideas
dunno why but I can't stop thinking that tossing a coin could achieve the same goal :)
It's not clear from the article which factor the 50% figure applies to. I doubt it's the overall odds ratio, but rather the likelihood given some priors. For example, given a prior that 90% of startups fail, what is the likelihood that this particular startup will succeed? If the prior is already factored in, then the model predicts no better than a coin flip.
The point made in the article is that investing at the odds of a coin flip would be better than investing with incorrect risk assumptions (ie. buying into "particularly bad ideas").
Assume startups fail 90% of the time. 50% of the time your coin comes up tails, and you claim the startup will fail. 50% of the time your coin comes up heads and you claim the startup will succeed.
You guess correctly 0.5 (the chance your coin comes up tails) * 0.9 (the chance the startup fails) + 0.5 (heads) * 0.1 (success) = 0.5 of the time.
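The same calculation as a quick sketch, with an "always say fail" strategy added for comparison. The function names are mine, and the base rates other than 90% are illustrative values, not figures from the article.

```python
def coin_flip_accuracy(p_fail):
    """Accuracy of guessing 'fail' on tails and 'succeed' on heads (fair coin)."""
    p_guess_fail = 0.5
    return p_guess_fail * p_fail + (1 - p_guess_fail) * (1 - p_fail)

def always_fail_accuracy(p_fail):
    """Accuracy of the zero-information strategy that always predicts 'fail'."""
    return p_fail

for p_fail in (0.5, 0.9, 0.99):
    # Rounded for display only.
    print(p_fail, round(coin_flip_accuracy(p_fail), 4), always_fail_accuracy(p_fail))
```

The coin stays at 50% whatever the failure rate is, while always predicting "fail" is right as often as startups fail. So at a 90% failure rate, a model that is right only 50% of the time on a plain succeed/fail call is doing worse than the trivial strategy.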
I wish this guy the best in improving his algorithm. If it really worked, it could do a lot of good. But the economy is so complex, I doubt he'll ever make the model more accurate than a coin toss. I predict the model will just make people overconfident in their investment decisions.
Investing is anti-inductive; if his algorithm actually starts to be used in investment decisions, people will keep gaming it until it is no longer a useful signal.
Misleading title but an interesting article nonetheless. The comments here really show how much statistics are misunderstood by the public, even by a more technical-minded crowd.
title: "The Man Who Knows Whether Any Startup Will Live or Die"
text: "He admits the models will never be perfect, but thinks that even a model that’s only right about 50 percent of the time could help investors and entrepreneurs..."
Which is not surprising... if the title was really true, this man would likely be richer than Warren Buffett at this point.
Yes. From the article: "According to Thurston’s own model, Growth Science’s own chance of survival following its current business model is about 69 percent. Adding the automated service would actually improve its chances, he says."
Hi Y'all, Thomas here (guy in article). Just want to start by saying (1) this is a very intelligent thread, and (2) I didn't write the article, was just interviewed for it. You never know what's going to be written, no matter what you say.
Here's how the models really play out.
We compare our accuracy against the 10 year survivorship benchmark of 25% (not the 5 year). When you look at small businesses, VC-backed, and corporate ventures (ex. new products coming out of companies), the 10 year survival rate is around 25%, plus or minus 10% depending on the industry.
Our models have made thousands of predictions for around nine years now - all the predictions were live, real-time and forward looking (no back-testing included here). From those predictions, around 3,400 have matured to date. That is, only around 3,400 of the results have happened - the businesses have either become big successes (ex. Uber) or failed. In our research, we have to actually wait for businesses to live or die to test our accuracy.
From the roughly 3,400 predictions that have matured, we were right 66% of the time when predicting survivors, and 88% of the time when predicting failures. When we scratched beneath the surface, we were really around 66% accurate in both cases (just most businesses fail, which is why gloomy predictions were 22% more accurate - just a function of dumb luck since most things die).
So we consider our algorithms to be 66% accurate, which is much more accurate than anything we're aware of in human history (remember, the baseline we're compared against is 25%).
If you do a statistical analysis (to make sure our predictions weren't just luck), the models maintained a statistically significant correlation with 99% confidence. There was less than 1 chance in over 500,000 that the results were a function of luck (definitely not a coin toss).
We've used these models in venture, and our performance puts us in the top 5% of all VC funds for our vintage years, so we've monetized these models effectively with real dollars and made considerable gains.
I hope this gives folks a better sense for how it works. There's been a very emotional backlash to the Wired article today (not accusing this thread, just thinking of some others) and it's weird because it's just basic scientific research. Pretty drab stuff on most days, but apparently offensive to some people. Not sure why. We're using statistics to improve venture and startup odds, just like stats have been used to improve just about every other field humans have ever taken seriously. Seems obvious that stats are similarly useful in the startup world, and my dream has always been to help more businesses use stats to succeed.
Anyway, definitely a lot more controversy and emotion than I would have expected. Otherwise pretty basic science, not claiming perfection, just striving for improvement, using data as best we can, etc. I hope at least some folks see this for what it is - nothing out of the ordinary in any other domain of science. Why should entrepreneurship be any different?
So if the baseline is 25% survive, then 75% fail, for a sum of 100%. How is it correct to compare that baseline to two numbers (66% and 88%) which don't sum to 100? Is failing different from not surviving?
1. The models predicted survival, and the business survived
2. Predicted survival, but the business failed
3. Predicted failure, but the business survived
4. Predicted failure, and the business failed
Here's how our results stacked up:
#1. 69% of observations (it wobbles around 66%, but was 69% in the last update)
#2. 31% of observations (notice #1 + #2 = 100%. Out of 100% of the times we predicted survival, we were right 69% and wrong 31%)
#3. 12%
#4. 88% (again, #3 + #4 = 100%. Out of all the times we predicted failure, we were right 88% of the time and wrong 12%).
Again, there's more luck when you predict failure, so it's really around 66% for both positives and negatives when you dig deeper into the stats.
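One way to read these four numbers against the 25% benchmark is the sketch below. It assumes a no-skill predictor whose calls are independent of the outcome, so its hit rate for each call equals the base rate (25% for "survive", 75% for "fail"). The variable names are mine, and the 69% and 88% are the figures quoted above.

```python
base_survival = 0.25      # 10-year survival benchmark quoted in the thread
precision_survive = 0.69  # right when predicting survival (#1)
precision_fail = 0.88     # right when predicting failure (#4)

# A no-skill predictor is right about "survive" 25% of the time and about
# "fail" 75% of the time, so compare each figure with its own baseline.
lift_survive = precision_survive / base_survival
lift_fail = precision_fail / (1 - base_survival)

print(round(lift_survive, 2), round(lift_fail, 2))
```

The 69% and 88% are conditional on different predictions, which is why they don't need to sum to 100%; within each prediction the right and wrong shares do (#1 + #2, #3 + #4).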
The output of his model is contrarian to what every accelerator and VC firm in the Valley says about investing. They believe the team is the most important factor of success. Thurston says it's 12%. We love pushing the envelope!
Ignoring the already-discussed issue of overfitting, I'm going to discuss something that surprises many, but shouldn't.
Using this process, he discovered some surprising things—most notably that a company’s team is only about 12 percent predictive of a company’s success.
I'm surprised that it's even that high. In the real world (e.g. outside of the VC-funded world) the team is important, because they're going to grow the company from scratch, and that requires getting more things right than wrong over 5+ years. On the other hand, if you're Snapchat, you have several investor-level people working to protect the company from its founder-quality problems and its own worst impulses, and the company will IPO or be bought before its cultural rot reaches a critical point.
The Valley's founder-quality problem creates a lot of awful corporate cultures, and it has dragged the status of engineers way down, and it's generally been bad for the world... but it doesn't kill businesses because the investors are able to keep enough of them on track to produce successes. The influence of VC "rocket fuel" and guidance is why a company can have terrible founders and still succeed.
There is a factor at play here, I think, and it's the age. Many investors are simply older and are more experienced individuals. I'm not even sure if their VC status matters here any more than their age.
(In a rather comical example, two youngsters want to put together another chat app with cool new smiley icons, then two VCs step in and ask: how are you going to monetize on smileys?)
One, investors should definitely be viewed as part of the team, and two, age should also be taken into account. I wonder if Growth Science considers this as well.
>> Using this process, he discovered some surprising things—most notably that a company’s team is only about 12 percent predictive of a company’s success. “You need to find a good team that won’t ruin the company, but hiring ‘rock stars’ isn’t that great,” he explains. The market the company is entering is far more important than who’s running the company.
Makes you wonder why we're paying these people so much money.
I wonder if they use Bayesian logic, because they should.
I also suspect that even though the article does not disclose a lot, the major factor at play in their model is the market sizes. I'm pretty sure their online version would heavily rely on the industry/segment you select. In other words it's a business-plan-looking-good approach which in today's rapidly changing world becomes less and less relevant. So I'll remain skeptical about it.
> He [...] thinks that even a model that’s only right about 50 percent of the time could help investors and entrepreneurs avoid particularly bad ideas
...does he have a bridge to sell me too? What am I missing?
(One can simply predict "always succeeds" and will be right half the time.)