Statsdcc is a Statsd-compatible high performance multi-threaded network daemon written in C++. It aggregates stats and sends the results to backends, especially Graphite. We are proud to announce that we are opensourcing it today. Check out the code at https://github.com/wayfair/statsdcc.
At Wayfair we’re big believers in “measure anything, measure everything,” as the “Statsd is reborn in node.js at Etsy” announcement put it. We do application performance monitoring with the opensource tools Graphite, Grafana, the ELK stack (Elastic Search/Logstash/Kibana), and some homegrown tools. Until recently we had been using Flickr/Etsy’s 2nd-generation, node.js-based Statsd to collect metrics for Graphite. As the volume of these metrics increased, we noticed inconsistencies in the data, and realized that some metrics were being dropped. Long story short, we tried some architectural changes, scaling Statsd and Carbon horizontally (details below), but as the operational complexity of that increased, we began to wonder why we needed so many boxes. We found a bottleneck in the way Statsd buffers and flushes data to Carbon, and we decided we needed a different version.
There are already quite a few alternative Statsd implementations available, but none of them really came close to meeting all of our needs. Brubeck by github is one that we found interesting, because it promised high throughput. Unfortunately, it was released after we had Statsdcc implemented and were ready to put it into production. At that point, we had no reason to take Brubeck and extend it to support the features we needed. However, we borrowed the idea of integrating a webserver to view application health from Brubeck. Statsdcc and Brubeck try to solve similar problems. I would recommend checking out all these implementations and picking the one that best fits your needs.
If you’re interested in what we tried before starting to hack the C++, read on.
Attempts at Horizontal Scaling with Statsd:
Statsd performs aggregations on incoming metrics and sends the aggregates to a Carbon process, which in turn saves received metrics to a Whisper database.
To scale, we use multiple Statsd/Carbon chains. Each chain goes to a different disk. Proxy daemons hash metric names to determine which chain to use. Which proxy daemon is chosen depends on round robin DNS. Consistent hashing ensures metric names are well balanced.
The diagram below depicts the architecture.
A year ago we noticed that a certain set of metrics were being dropped, resulting in inconsistent monitoring data. We realized that this was due to a maxing out of UDP receive buffers on Statsd. So we tried adding more Statsd processes with increased UDP buffer sizes.
However, adding a new process is complicated. When a new Statsd instance is added, consistent hashing by a reverse proxy will re-route some metrics to the new process, resulting in duplicate files on different Carbon nodes for the same metric – one for the old data and one for the new data. To save space, and for Graphite to show all data, the old Whisper data files should be merged into the new ones.
In the end we were unhappy with how much traffic an individual node could handle. We discovered that the problem was a design decision in Statsd, where the same thread is responsible for both buffering incoming metrics and performing aggregations on them at every flush interval. When computing aggregations, the thread stops listening for incoming metrics, which are stored in the UDP buffer. As the rate of metrics increases, the UDP buffer overflows and drops metrics. We use single-threaded, event-looping frameworks in a few places (Node.js-based daemons for a couple of things, Python-based gunicorn+gevent for several), and we have seen this type of problem before. The event loops don’t help you when you have a blocking IO operation that can bring processing to a halt. Sometimes we work around or solve such problems within the event-loop paradigm, and sometimes we take a completely different approach.
After finding the actual root cause, we decided to rewrite Statsd as a multi-threaded application with a focus on effective use of socket-IO and CPU cycles.
Statsdcc is an alternative implementation for Statsd written in C++ for high performance. In Statsdcc, one or more server threads actively listen for incoming metrics. Server threads distribute incoming metrics among multiple workers using the formula worker = hash(metric name) % #workers. Worker threads read from their dedicated queues and update their ledgers until signaled to flush by a clock thread. Upon receiving this signal, the worker threads hand off their ledgers to short-lived flush threads, and continue with new ledgers until the next signal. To avoid lock contention and to pass metrics faster between server and worker threads, boost’s lock-free queues are used.
We have not gotten rid of consistent hashing as we did not want to lose the ability to scale horizontally. However, to solve the scaling problem in our previous architecture, where adding a new process required cleanup on the Carbon end, we moved consistent hashing from proxies to aggregators. The proxies distribute the incoming metrics among multiple aggregators using the formula aggregator = hash(metric name) % #aggregators. Each aggregator then sends the metric aggregation to its respective Carbon process by using the consistent hash. The difference from the previous architecture is that the Carbon process has more TCP connections open, one with each aggregator. However, unlike Statsd, instead of reopening connection on each flush, Statsdcc reuses established TCP connections, thereby avoiding the overhead of a TCP handshake. The diagram below describes the current architecture.
Statsdcc can handle up to 10 times more load (up to 400,000 metrics/sec) than Etsy’s Statsd. Only one instance of the Statsdcc aggregator handles all our production traffic, in contrast to the previous 12 Statsd instances. Statsdcc has been used in production for about 7 months. We hope more people will find Statsdcc as useful as we have at Wayfair.