Google - Site Reliability Engineering
Poorly performing patterns are often merely symptoms of an underlying problem. Addressing symptoms may ease the pain, but it does little to ensure sustainability. For that we need to expose the problem’s root cause. This can be done using simple yet robust techniques called “root cause analyses.” While there are a great many to choose from, we ...
Tonianne DeMaria Barry • Personal Kanban: Mapping Work | Navigating Life
Don’t underestimate the value of digging into history to investigate some bugs
I’ve always been pretty good at debugging weird issues, with the usual toolkit of println and the debugger. So I never really looked at git much to figure out the history of a bug. But for some bugs it’s crucial.
I recently had an issue with my server where it was leaking ...
Marcus • Marcus' Blog
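The excerpt above does not say how the history was searched. One common approach, and the idea behind `git bisect`, is a binary search over the commits between a known-good and a known-bad version. The sketch below is only an illustration of that idea under stated assumptions: the commit names, the `first_bad_commit` helper, and the `is_bad` check are all hypothetical, and it assumes the bug stays present in every commit after the one that introduced it.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad(commit) is true.

    Assumptions (hypothetical setup, not taken from the quoted post):
    commits are ordered oldest to newest, the first commit is good,
    the last commit is bad, and the bug persists once introduced.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is known good, commits[hi] is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the bug was introduced at mid or earlier
        else:
            lo = mid  # the bug was introduced after mid
    return commits[hi]


# Hypothetical history in which the leak first appears in commit "c5".
history = [f"c{i}" for i in range(1, 9)]
leaky = {"c5", "c6", "c7", "c8"}
culprit = first_bad_commit(history, lambda c: c in leaky)
print(culprit)
```

In practice `is_bad` would build and run the code at that commit and check for the symptom (for example, growing memory use), which is the expensive step; the binary search keeps the number of such checks roughly logarithmic in the length of the history.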
When something breaks, troubleshooting the issue should be possible from only the metrics being collected. You should not be depending on log files, or looking at code, to deal with an outage.