Companies that care about uptime generally use pager duty & have an on-call rotation. Instructions for handling common incidents are kept in internal documents called runbooks. These are written in plain English & typically maintained on an internal wiki. First, let’s look at some challenges with the current form of runbooks:
- You need to execute each step manually; nothing is automated
- It takes quite an effort to get everybody on board with keeping the runbooks up to date
- Unless they are well written, the instructions can be ambiguous or confusing to follow
To tackle some of these problems, I am proposing the use of a novel data science tool, Jupyter Notebooks. Notebooks are a unique combination of markdown text, executable code, and output, all within a single document served in a browser. This combination of features fits the runbook need very well. Let’s see it with an actual example. Here’s a before & after picture of a simple GitLab runbook converted into a Jupyter Notebook (kudos to GitLab for making their runbooks public).
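To make the idea concrete, here is a minimal sketch of what one executable runbook step could look like as a notebook code cell. It is not taken from the GitLab runbook; the step itself (a disk usage check), the function name `check_disk_usage`, and the 80% threshold are all assumptions chosen for illustration.

```python
import shutil


def check_disk_usage(path=".", warn_at_percent=80.0):
    """Hypothetical runbook step: report how full the disk holding `path` is."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_percent = usage.used / usage.total * 100
    # The threshold is an illustrative assumption, not a recommendation.
    status = "WARN" if used_percent >= warn_at_percent else "OK"
    return status, round(used_percent, 1)


# In a notebook, this output would appear right below the cell,
# next to the markdown text that explains the step.
status, used_percent = check_disk_usage(".")
print(f"[{status}] disk usage: {used_percent}%")
```

In a notebook, the markdown cell above this one would carry the plain-English instruction, so the explanation, the command, and its result stay together in one document.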
Continue reading “Codify Infra Runbooks with Jupyter Style Notebook”
Apache Kafka has grown a lot in functionality & reach in the last couple of years. It’s used in production by one third of the Fortune 500, including 7 of the top 10 global banks, 8 of the top 10 insurance companies, and 9 of the top 10 U.S. telecom companies [source].
This article gives you a quick tour of the core functionality offered by Kafka. I present a lot of examples to help you understand common usage patterns. Hopefully you’ll find some correlation with your own workflows & start leveraging the power of Kafka. Let’s start by looking at 2 core functionalities offered by Kafka.
Continue reading “Apache Kafka: Is It Right For You?”
Early last week I was working on a Python package that compresses numerical series into strings (and back). The package is now available on GitHub. It gets you around 80% compression. It is useful for storing or transmitting stock prices, monitoring data & other time series data in a compressed string format.
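To give a feel for how a numeric series can be turned into a compact string and back, here is a minimal sketch. It is not necessarily how the package works; it assumes one common approach (fixed-precision scaling, delta encoding, and 5-bit chunks mapped to printable ASCII), and the function names `compress` and `decompress` are hypothetical.

```python
def compress(series, precision=2):
    """Encode a list of numbers as a printable ASCII string (illustrative sketch)."""
    scale = 10 ** precision
    out = []
    prev = 0
    for value in series:
        scaled = round(value * scale)  # keep `precision` decimal places
        delta = scaled - prev          # neighbouring values are usually close
        prev = scaled
        # Zigzag: map signed deltas to non-negative integers.
        z = delta * 2 if delta >= 0 else -delta * 2 - 1
        # Emit 5 bits per character, low bits first; 0x20 flags "more follows".
        while z >= 0x20:
            out.append(chr((0x20 | (z & 0x1F)) + 63))
            z >>= 5
        out.append(chr(z + 63))
    return "".join(out)


def decompress(text, precision=2):
    """Invert compress(): rebuild the list of numbers from the string."""
    scale = 10 ** precision
    series = []
    prev = 0
    z = 0
    shift = 0
    for ch in text:
        chunk = ord(ch) - 63
        z |= (chunk & 0x1F) << shift
        shift += 5
        if chunk < 0x20:  # no continuation flag: the number is complete
            delta = z // 2 if z % 2 == 0 else -((z + 1) // 2)
            prev += delta
            series.append(prev / scale)
            z = 0
            shift = 0
    return series


prices = [145.78, 145.80, 145.75, 146.10]
encoded = compress(prices)
print(encoded, decompress(encoded))
```

The savings in a scheme like this come from the deltas: when consecutive values are close, each one needs only a character or two instead of its full decimal representation.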
Continue reading “Releasing NumCompress”
Estimating software development effort is tricky and can often be a point of contention in the team. I have seen teams spend an entire day planning a 2-week sprint, working out every detail of every task. After a few sprints they have some sense of their team’s “velocity” and can tell how much work the team can deliver in a sprint. And I have seen teams where developers huddle casually, take on major features, give out rough dates, and miss their deadlines by a lot.
I can’t propose a generic process for estimation, as it’s highly dependent on the nature of the business, team size, culture, etc. And frankly, I don’t even know what the best process looks like. Instead, I am going to share some common pitfalls and best practices that are applicable to any development team. This is a useful read for developers and product owners alike. Let’s dive in!
Continue reading “The Science of Software Estimation”
Elasticsearch is primarily known for its search capabilities, but it’s also very well suited for storage, aggregation, and querying of time series data. In this tutorial, we’ll learn how to use Elasticsearch to store simple metrics and visualize them with Kibana.
To summarize, we’ll generate dummy signup data with this script, ingest it into a locally running Elasticsearch, and use Kibana to visualize the data in different ways. For simplicity, we are not using Logstash in this tutorial, but you can easily configure the same data to be ingested through Logstash. Let’s dive in!
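As a rough illustration of the first two steps, the sketch below generates dummy signup events and formats them as a newline-delimited payload of the kind the Elasticsearch bulk API accepts. It is not the script referenced above; the index name `signups`, the field names, and the helper names are assumptions made for the example.

```python
import json
import random
from datetime import datetime, timedelta, timezone


def generate_signups(count, start, seed=0):
    """Create `count` dummy signup events spread over one week (hypothetical fields)."""
    rng = random.Random(seed)  # seeded so the dummy data is reproducible
    events = []
    for _ in range(count):
        offset = timedelta(minutes=rng.randint(0, 7 * 24 * 60))
        events.append({
            "@timestamp": (start + offset).isoformat(),
            "plan": rng.choice(["free", "pro", "enterprise"]),
            "referrer": rng.choice(["search", "ad", "friend"]),
        })
    return events


def to_bulk_payload(events, index="signups"):
    """Format events as NDJSON: one action line, then one document line, per event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # a bulk body must end with a newline


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
payload = to_bulk_payload(generate_signups(3, start))
print(payload)
```

A payload like this could then be sent to the `_bulk` endpoint of a local Elasticsearch instance with the `Content-Type: application/x-ndjson` header.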
Continue reading “Metric Visualization With Elasticsearch & Kibana”