Codify Infra Runbooks with Jupyter Style Notebook

Problem

Companies that care about uptime generally use PagerDuty & have an on-call rotation. Instructions for handling common incidents are kept in internal documents called runbooks. These are written in plain English & typically maintained on an internal wiki. First, let’s look at some challenges with runbooks in their current form:

  • You have to execute each step manually; nothing is automated
  • It takes real effort to get everybody on board with keeping the runbooks up to date
  • Unless they are well written, the instructions can be ambiguous or confusing to follow

Solution

To tackle some of these problems, I am proposing the use of a novel data science tool: Jupyter Notebooks. A notebook is a unique combination of markdown text, executable code, and output, all within a single document served in a browser. This combination of features fits the runbook need very well. Let’s look at an actual example. Here’s a before & after picture of a simple GitLab runbook converted into a Jupyter Notebook (kudos to GitLab for making their runbooks public).
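
To make the idea concrete, here is a minimal sketch of how a runbook could be laid out as a notebook, with each step becoming a markdown cell (the instruction) followed by a code cell (the command). It builds the notebook's JSON by hand with only the standard library. The helper names, the "disk space alert" runbook, and its two steps are hypothetical and exist purely for illustration; they are not taken from GitLab's runbooks.

```python
import json


def markdown_cell(text):
    """A notebook cell holding plain-English instructions."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """A notebook cell holding a command the on-call engineer can run."""
    return {
        "cell_type": "code",
        "execution_count": None,  # not run yet
        "metadata": {},
        "outputs": [],            # Jupyter fills this in after execution
        "source": source,
    }


def build_runbook(title, steps):
    """Turn (instruction, command) pairs into a notebook document."""
    cells = [markdown_cell(f"# {title}")]
    for instruction, command in steps:
        cells.append(markdown_cell(instruction))
        cells.append(code_cell(command))
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Hypothetical runbook steps; "!" runs a shell command from an IPython kernel.
steps = [
    ("Check free disk space on the host.", "!df -h"),
    ("List the largest directories under /var/log.", "!du -sh /var/log/* | sort -h | tail"),
]

runbook = build_runbook("Disk space alert (hypothetical)", steps)
notebook_json = json.dumps(runbook, indent=1)
```

Saving `notebook_json` to a file with an `.ipynb` extension should give a document that Jupyter can open, where each instruction sits directly above the command that carries it out and the command's output is recorded below it.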
