I’m proud to announce a new open source project that we’ve been cooking up behind the scenes at Paperless Post. Graphiti is a powerful front end for the real-time graphing engine, Graphite. Graphiti provides a whole new way of creating and storing graphs. We’re using it day-to-day at Paperless as a way to monitor our infrastructure and product metrics. We’re especially excited because it’s the first part of our internal tools to be open sourced under the paperlesspost organization.
The importance of trends
There’s been a big push in the development community toward using metrics to drive your business, product, and development cycles. We’ve always been fans of this approach and have, as a team, gone through a number of iterations to improve our ability to track as much data as we can across our applications. Over time we’ve settled upon two distinct sets of data: accounting and trends.
Accounting data is not real-time, but it is precise. It’s generated nightly by some very awesome and complex SQL functions and collations by our venerable analytics lead, Vanessa Hurst. These numbers are hard facts and they give us the information we need to make important big decisions.
Trending data is the other end of the spectrum. It is real-time and imprecise. Its lack of precision is a feature; we don’t want the collection of these metrics to negatively impact the performance of an application. Instead of looking at numbers, when looking at trending information, you’re looking at deltas, the relative changes between numbers across time. When looking at trending data, you’re looking for sharp spikes or dips, wild fluctuations, steady increases or decreases over time. These deltas can guide debugging, optimization, and the general wellbeing and growth of your product.
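To make the idea of "looking at deltas" concrete, here is a minimal sketch in Ruby. It is not code from our dashboard or from Graphiti; the method names, the sample numbers, and the 50% threshold are all made up for illustration.

```ruby
# Hypothetical helpers (not part of Graphiti): turn raw samples into
# relative changes so spikes and dips stand out.

# Relative change between each pair of consecutive samples.
# Returns nil for a pair whose first value is zero (no meaningful ratio).
def relative_deltas(samples)
  samples.each_cons(2).map do |previous, current|
    previous.zero? ? nil : (current - previous).to_f / previous
  end
end

# Indexes of samples that moved by at least `threshold` (0.5 == 50%)
# relative to the sample before them. The threshold is an arbitrary choice.
def spike_indexes(samples, threshold = 0.5)
  relative_deltas(samples)
    .each_with_index
    .select { |delta, _index| delta && delta.abs >= threshold }
    .map { |_delta, index| index + 1 }
end

# Made-up per-minute login counts; only the jump from 98 to 210 is flagged.
logins_per_minute = [100, 104, 98, 210, 205]
puts spike_indexes(logins_per_minute).inspect
```

The absolute values hardly matter here. What you care about is that one sample roughly doubled compared to the one before it.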
A little over a year ago we hacked together a very simple trends dashboard that collects metrics data in Redis and displays some raw real-time numbers and some graphs that compare metric data over time. It’s been improved over time, but it lacks the speed and flexibility to handle a very large number of metrics. We wanted to track everything, and we wanted to be able to create graphs on the fly from any data we collected.
Graphite, created at Orbitz.com in 2006, fits our needs very closely. Etsy caught on to it early and released StatsD, a daemon for dumping data into Graphite. StatsD fit amazingly well into our existing tooling. With a simple Ruby wrapper and a lot of Paperless::Statsd.increment('logins') calls sprinkled throughout our code, we had auto-namespaced data headed to StatsD and then into Graphite – we were well on our way to graphing nirvana. We quickly grew the number of stats we collected by a couple orders of magnitude in only a few days. When we went to the Graphite UI to create graphs, well, we were a little disappointed. It’s not that Graphite is horrible (in fact it’s pretty amazing how detailed and flexible the composer is); it’s just that it wasn’t how we wanted to create graphs. Its lack of easy memory and sessions made it really hard to create and save complex graphs.
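For the curious, a wrapper like that can be tiny. The sketch below is not our actual Paperless::Statsd code. The TrendStats name, the namespace string, and the host are assumptions for illustration; the "name:value|c" counter format and UDP port 8125 are StatsD's defaults.

```ruby
require 'socket'

# Hypothetical stand-in for a wrapper like Paperless::Statsd.
module TrendStats
  class Client
    # The socket is injectable so the client can be exercised without a network.
    def initialize(host: '127.0.0.1', port: 8125, namespace: nil, socket: UDPSocket.new)
      @host = host
      @port = port
      @namespace = namespace
      @socket = socket
    end

    # StatsD counter payload: "<namespaced.name>:<delta>|c"
    def counter_payload(stat, delta = 1)
      "#{qualified(stat)}:#{delta}|c"
    end

    # Fire-and-forget counter increment over UDP.
    def increment(stat, delta = 1)
      send_payload(counter_payload(stat, delta))
    end

    private

    # Prefix the stat with the namespace (if any) so callers only pass 'logins'.
    def qualified(stat)
      [@namespace, stat].compact.join('.')
    end

    def send_payload(payload)
      @socket.send(payload, 0, @host, @port)
    rescue StandardError
      # Trend collection is best effort: a failed send must never break the app.
      nil
    end
  end
end

# Assumed namespace; in practice it might be derived from app and environment.
stats = TrendStats::Client.new(namespace: 'production.web')
puts stats.counter_payload('logins')
```

Swallowing errors in send_payload is deliberate. It's the "imprecise on purpose" trade-off from earlier: losing a data point is fine, slowing down or crashing a request is not.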
Graphite’s true power lies in its ability to collect data and render graphs. We realized that behind its clunky UI, there’s a very powerful URL API that allows you to compose any graph using simple query params. Talking to Aman over some beers, we realized we were both headed in a very similar direction. He shared a bit of code that GitHub uses for generating their graphs, and from that and some late-night hacking sessions, @mrb and I came up with Graphiti.
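To give a feel for that URL API, here is a rough sketch of composing a render URL in Ruby. This is not Graphiti's code. The helper name, the host, and the metric name are placeholders; target, from, width, height, and title are standard parameters of Graphite's /render endpoint.

```ruby
require 'uri'

# Hypothetical helper: build a Graphite /render URL from a list of targets
# and a hash of render options.
def graphite_render_url(base_url, targets, options = {})
  params = targets.map { |target| ['target', target] } +
           options.map { |key, value| [key.to_s, value.to_s] }
  "#{base_url.chomp('/')}/render?#{URI.encode_www_form(params)}"
end

# Placeholder host and metric name.
url = graphite_render_url(
  'http://graphite.example.com',
  ['stats.production.logins'],
  from: '-24hours', width: 800, height: 400, title: 'Logins'
)
puts url
```

Each target becomes its own target= parameter, so overlaying several metrics on one graph is just a matter of passing more entries in the array.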
Everyone needs a dashboard
The main problems we wanted to solve with Graphiti had to do with creating and manipulating graphs easily (in this case through a JSON format and an interactive editor). However, once we started, we came up with some additional ideas that are now key parts of the app.
Dashboards are arbitrary named collections of graphs. This allows dashboards to be really flexible collections for different use cases. Every member of our team can have their own personal named dashboard (even our product and management teams) and we can also have dashboards for different areas and applications (Resque, Rails, Card Creation, and so forth).
Snapshots are point-in-time captures of the graph images that are uploaded to S3 for posterity. This allows us to have shareable graphs whose URLs we can paste anywhere. These graphs don’t change with time – they stay the same as when they were pasted – and they are accessible from anywhere, most importantly from outside our VPN. We’re using this as a way to capture and paste graphs into Campfire and for remote debugging/monitoring.
Graphiti is definitely a work in progress. It’s probably a little buggy and there are some missing features that we want to add. We wanted to open source it, not only because we promised we would, but also because we think the community can help us make it better.
Also, watch this space and follow @paperlessdev, as we have some more posts about Graphiti and Graphite in the works.
If you’re interested in working on cool projects like this, or with our awesome team, we’re hiring in NYC and SF.