Utility companies do not have redundancy for every part of their infrastructure either. Hence why severe weather or other unexpected failures can cause loss of power, internet or even running water.
Texas has had statewide power outages. Spain and Portugal suffered near-nationwide power outages last year. Many US states are heavily reliant on the same single source for water. And remember the discussions on here about Europe's reliance on Russian gas?
Then you have the XKCD sketch about how most software products are reliant on at least one piece of open source software that is maintained by a single person as a hobby.
Nobody likes a single point of failure but often the costs associated with mitigating that are much greater than the risks of having that point of failure.
> Hence why severe weather or other unexpected failures can cause loss of power, internet or even running water.
Not all utility companies have the same policies, but all have a resiliency plan to avoid blackout that is a bit more serious than "Just run it on AWS".
> Not all utility companies have the same policies, but all have a resiliency plan to avoid blackout that is a bit more serious than "Just run it on AWS".
You're arguing as if "run it on AWS" was a decision that didn't undergo the same kinds of risk assessment. As someone who's had to complete such processes (and in some companies, even define them), I can assure you that nobody of any competency runs stuff on AWS complacently.
In fact running stuff with resilience in AWS isn't even as simple as "just running it in AWS". There's a whole plethora of things to consider, and each with its own costs attached. As the meme goes "one does not simply just run something on AWS"
> nobody of any competency runs stuff on AWS complacently.
I agree with this. My point is simply that we, as an industry, are not a very competent bunch when it comes to risk management ; and that's especially true when compared to TSOs.
That doesn't mean nobody knows what they do in our industry or that shit never hits the fan elsewhere, but I would argue that it's an outlier behaviour, whereas it's the norm in more secure industries.
> As the meme goes "one does not simply just run something on AWS"
The meme has currency for a reason, unfortunately.
---
That being said, my original point was that utilities losing clients after a storm isn't the consequence of bad (or no) risk assessment ; it's the consequence of them setting up acceptable loss thresholds depending on the likelihood of an event happening, and making sure that the network as a whole can respect these SLOs while strictly respecting safety criteria.
Nobody was suggesting that loss of utilities is a result of bad risk management. We are saying that all competent businesses run risk management and for most businesses, the cost of AWS being down is less than the cost of going multi cloud.
This is particularly true when Amazon hand out credits like candy. So you just need to moan to your AWS account manager about the service interruption and you’ll be covered.
Texas has had statewide power outages. Spain and Portugal suffered near-nationwide power outages last year. Many US states are heavily reliant on the same single source for water. And remember the discussions on here about Europe's reliance on Russian gas?
Then you have the XKCD sketch about how most software products are reliant on at least one piece of open source software that is maintained by a single person as a hobby.
Nobody likes a single point of failure but often the costs associated with mitigating that are much greater than the risks of having that point of failure.
This is why "risk assessments" are a thing.