> Initially, we were unable to add capacity to the metadata service because it was under such high load, preventing us from successfully making the requisite administrative requests.
It isn't about spinning up ec2 instances or provisioning hardware. It is about logically adding the capacity to the system. The metadata service is a storage service, so adding capacity necessitates data movement. There are a lot of things that need to happen to add capacity while maintaining data correctness and availability (mind at this point, it was still trying to fulfill all requests)
The very large 2017 AWS outage originated in s3. Maybe you're thinking about a different event?
https://share.google/HBaV4ZMpxPEpnDvU9