too idealistic. invariably some team (usually many teams) fails to properly gate some critical-path logic: they depend on some functional partition always being online, and then boom, a much larger blast radius than intended
then they fix it in the post-mortem, but the pattern just repeats. I have seen it so many times! it used to be much worse in the earlier days of the cloud, when VMs would go poof more often
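
for what "properly gating" could look like, here is a rough sketch, not anyone's real code. every name in it (`call_gated`, `fetch_recommendations`, `render_checkout`) is made up for illustration, and it assumes the dependency in the other partition is optional enough that an empty fallback is acceptable:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_gated(dependency: Callable[[], T], fallback: T) -> tuple[T, bool]:
    """Call a dependency in another partition; degrade instead of failing.

    Returns (value, degraded), where degraded is True if the fallback was used.
    """
    try:
        return dependency(), False
    except Exception:
        # The partition is down or unreachable: serve the fallback so the
        # critical path keeps working with reduced functionality.
        return fallback, True


def fetch_recommendations() -> list[str]:
    # Hypothetical call into a partition that happens to be offline.
    raise ConnectionError("recommendations partition unavailable")


def render_checkout(cart: list[str]) -> dict:
    # Checkout is the critical path; recommendations are nice-to-have,
    # so their outage must not take checkout down with them.
    recs, degraded = call_gated(fetch_recommendations, fallback=[])
    return {"cart": cart, "recommendations": recs, "degraded": degraded}
```

the point is only that the critical path decides up front what happens when the other partition is gone, instead of finding out during the outage. a real version would probably also want timeouts and narrower exception handling.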