Hacker News

Some of the worst mistakes that I saw were from over-reaction in an active incident.

One of my programming mantras is "no black magic." If I don't understand why something works, then it's not done.

I take this same approach to an incident. If someone can't coherently identify why their suggestion will have an impact, I don't think they should do it. Now there may come a time that you need to just pull the trigger on something, but as I think back I'm not sure that was ever the case in the end.

It was wild to see the top brass—normally very cool and composed—start suggesting arbitrary potential fixes during an incident.



I have a similar mantra - "if you don't know why a fix worked, you may not have fixed it."

I'm willing to throw shit at the wall early in the triage process, but only low-impact, "simple" things. Stuff like:

have we tried clearing cache?

have we checked DNS resolver for errors?

have we restarted the server?

etc. I try to find the "dumb" problems before jumping to some wild fix. In one of the worst outages of my career, a team I was working for tried to do a full database restore, which had never been done in production, based on a guess. At 3am on a Saturday. I push back really hard on stuff like that.


That mantra reminds me of "Any problem that goes away by itself can just as easily come back by itself."


To quote Gene Kranz during the start of problems on Apollo 13: "Let's not make things worse by guessing."


> One of my programming mantras is "no black magic."

This. * 1000

I am very grateful that my earliest training was as an RF bench technician. It taught me how to find and fix really weird problems, without freaking out.

However, it does appear that this particular event may have been caused by reliance on a dependency that may not have merited that reliance.

I'm really, really careful about dependencies. That attitude wins me few friends in this crowd, but it's been a long time since I've had a 2AM freakout.


Wow, another RF/electronics tech turned SWE? How many of us do you think there are?


Not too many.

But debugging RF stuff is hairy AF, so it’s a great forge for making problem solvers.


> One of my programming mantras is "no black magic." If I don't understand why something works, then it's not done.

So many people nowadays don't care about knowing how things work; they just push shit and cross their fingers.


It's a good reminder that if things get bad, people will just start burning things to try and appease the gods.



