At HomeAdvisor, we typically experience surges in traffic after major weather events, as homeowners and businesses alike need additional help with snow removal, handyman repairs, etc. But until recently, we never had a good way to visualize this increase in activity on our platform. At the micro level we could monitor core server performance, but metrics like CPU and memory utilization don’t provide any context as to what is happening. And while application metrics like API requests and database queries give us a little more insight, they typically require some level of aggregation or translation to really understand them. In other words, looking at server and application metrics typically shows you the symptoms of the problem without providing any diagnosis.
In this post we’ll discuss our homegrown real-time data visualization suite, which is based on Kafka, MongoDB, Node.js, and D3.js. This platform helps us visualize key business metrics and data in real time from any web browser. For example, whenever we experience a surge in traffic, the platform quickly shows us the source of service requests, which in turn can be correlated with real-world events like severe weather. This type of insight would never be obtainable through traditional server or application metrics.
One of the things we love about this tool is that it is the result of a UI developer’s project from one of our semi-annual Fedex Fridays (i.e. hackathons). Throughout the year, we take a break from our normal coding efforts and work with any new technology of our choosing. This gives the entire development team a chance to experiment with new tools and products, and the output of those projects is often incorporated into our development or production stacks. Our real-time data visualization suite is a perfect example of a product going from idea to implementation in short order. And while it’s still very much a work in progress, it’s grown substantially and has become a flexible platform for hosting a number of different data visualizations.
Considerations for Real-Time Data Visualization
When we started to think about real-time data visualization, there were two primary concerns:
- Our data is complex.
- Our data never stops.
To the first point, a typical service request contains not only the core data about the request itself (task, description, questions/answers, etc.) but also links to other data such as the consumer and service professional. In our standard production system, this data is spread across a number of tables in a relational database. We can only optimize the queries and tables so much before they become a bottleneck, especially for a real-time data visualization tool where latency can be painfully obvious.
And to the second point, we get service requests around the clock, every day of the year. Unlike our development team, our servers never get a holiday or time off. Our consumers and service pros cover four time zones, and with our growing suite of products and apps, there is always data flowing into our system. Any visualization platform we built had to keep up with that flow of complex data, from after-hours lulls to surges and everything in between.
Building the Framework
As mentioned earlier, there are several pieces that make up the visualization framework:
- Kafka: Messaging component that provides real-time updates from production services.
- MongoDB: High performance data store.
- Node.js: Used as an API framework, web server, and Kafka consumer.
- D3.js: A data-binding and visualization framework.
At a high level, the interaction of these components can be summarized in the graphic below. After that we’ll go a little more in depth into the role of each product and why it was chosen.
We already use Kafka heavily in our architecture, so this was an obvious choice for us. It’s the core of our microservice messaging infrastructure and we already had a number of topics that contained the data we wanted to display. Kafka is designed from the ground up such that adding new consumers to the cluster is straightforward and very low risk. Additionally, while the main component of the visualization suite is a live feed of service request data, we also allow queries and aggregation of historical data. This requires a database that can keep up with the evolving data formats contained in the Kafka messages.
Choosing MongoDB for data storage gave us a great deal of flexibility. Because it is a document store, no rigid schema was needed, so new or changed message formats can be stored without a migration. MongoDB is also an excellent tool for working with the time-series data that we are storing and analyzing. Though we currently run this entire application on a very modestly powered VM, MongoDB can be scaled vertically and horizontally to suit future needs. It was also very appealing that Mongo-Connector can be used to seamlessly synchronize collections across multiple MongoDB, Solr, or ElasticSearch clusters for additional functionality and performance.
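To make the time-series use case concrete, here is a minimal sketch of the kind of aggregation a historical view might run: counting documents per hour and per source. The field names (`createdAt`, `source`), the collection name in the usage note, and the helper `hourlyCountsPipeline` are assumptions for illustration only; they are not taken from our actual schema.

```javascript
// Hypothetical helper: builds a MongoDB aggregation pipeline that counts
// documents per hour and per source since a given Date.
// Assumes each document has a `createdAt` Date and a `source` string.
function hourlyCountsPipeline(since) {
  return [
    // Keep only documents created at or after `since`.
    { $match: { createdAt: { $gte: since } } },
    // Bucket by hour (as a formatted string) and by source, counting each bucket.
    {
      $group: {
        _id: {
          hour: { $dateToString: { format: '%Y-%m-%dT%H:00', date: '$createdAt' } },
          source: '$source',
        },
        count: { $sum: 1 },
      },
    },
    // Return buckets in chronological order.
    { $sort: { '_id.hour': 1 } },
  ];
}
```

With the official Node.js driver, a pipeline like this could be run as `db.collection('serviceRequests').aggregate(hourlyCountsPipeline(since)).toArray()`, where the collection name is again only a placeholder.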
The visualization suite is powered by a number of single-purpose Node.js applications. The first component is simply our Kafka consumer. This application uses librdkafka (a C implementation of the Apache Kafka protocol) and kafkacat (a non-JVM consumer/producer for Kafka) and is largely configuration-based. This configuration instructs the app to listen to specific topics and to post that data periodically as “batches” (arrays of JSON) to our RESTful “Stats API”.
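The batching step described above can be sketched roughly as follows. This is not the consumer's actual code: `createBatcher`, `maxSize`, and `send` are hypothetical names, and the sketch leaves out the Kafka client and the HTTP call so that it stays self-contained. It assumes each message value is a JSON string.

```javascript
// Hypothetical sketch: buffer consumed messages per topic and hand them off
// as arrays of parsed JSON. `send(topic, batch)` stands in for whatever posts
// the batch to the Stats API (for example, an HTTP POST).
function createBatcher({ maxSize, send }) {
  const buffers = new Map(); // topic -> array of parsed messages

  function flush(topic) {
    const batch = buffers.get(topic);
    if (!batch || batch.length === 0) return; // nothing to send
    buffers.set(topic, []); // start a fresh buffer before handing off
    send(topic, batch);
  }

  return {
    // Call once per consumed message.
    add(topic, value) {
      let parsed;
      try {
        parsed = JSON.parse(value);
      } catch {
        return; // skip malformed payloads in this sketch
      }
      if (!buffers.has(topic)) buffers.set(topic, []);
      const batch = buffers.get(topic);
      batch.push(parsed);
      if (batch.length >= maxSize) flush(topic); // size-based flush
    },
    // Call on a timer so partial batches are still posted periodically.
    flushAll() {
      for (const topic of buffers.keys()) flush(topic);
    },
  };
}
```

In a real consumer, `add` would be called from the Kafka message handler and `flushAll` from something like `setInterval`, so a quiet topic still delivers its data within a bounded delay.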
The Stats API is built atop Express.js and was also created for this project. It provides very generic and multi-use access to MongoDB. Data can be inserted, retrieved, queried, counted, and aggregated via these API endpoints. There are no collection-specific endpoints. Instead, the desired collection name is included as a part of each API URL. This setup allows us to quickly begin storing new Kafka events in MongoDB by simply adding them to the Kafka-consumer application’s configuration. When necessary, collection-specific data manipulation can be performed via configurable Express.js middleware that is run prior to the actual database writes. This middleware is used to transform the data, perform running calculations, and communicate “live” data to browsers via WebSockets.
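As an illustration of the middleware idea, the sketch below uses the standard Express middleware signature `(req, res, next)` but has no dependency on Express itself, so it can be read and run on its own. The factory name `createCollectionMiddleware`, the `transforms` map, and the `broadcast` callback are all hypothetical, as is the assumption that the collection name arrives as `req.params.collection` and that the request body is an already-parsed array of documents.

```javascript
// Hypothetical sketch of collection-specific middleware for a generic
// "insert into :collection" endpoint.
// `transforms` is a Map of collection name -> function(doc) returning a new doc.
// `broadcast(name, docs)` stands in for pushing live data to browsers
// (for example over WebSockets).
function createCollectionMiddleware({ transforms, broadcast }) {
  return function collectionMiddleware(req, res, next) {
    if (!Array.isArray(req.body)) {
      // Express convention: passing an error to next() skips to error handling.
      return next(new Error('expected a JSON array of documents'));
    }
    const name = req.params.collection;
    const transform = transforms.get(name);
    if (transform) {
      req.body = req.body.map(transform); // reshape before the database write
      broadcast(name, req.body); // share the transformed batch with live clients
    }
    next(); // continue to the generic insert handler
  };
}
```

In an Express app, middleware like this would sit between a JSON body parser and the generic insert handler on a parameterized route such as `/:collection`. That route shape is an assumption made for this example. Collections with no registered transform pass through untouched, which is what keeps the endpoints generic.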
The video below shows the finished product. The main display area shows service requests being submitted in real time by displaying a “ping” on an overlay of the United States. The location of the animation represents where the consumer is located, while the color of the animation represents the source and/or type of the service request (website, customer care team, booked appointment, etc.).
Extending the Visualization Platform
Though the live service request data is certainly the marquee component, what we’ve really done is create a flexible framework for building any visualization we can imagine. Not only do we have the live service request map, but many other visualizations have been developed as well. With our flexible Stats API and the wide range of D3 visualization plugins, it’s pretty straightforward to add new visualizations. Some examples of other pages we have developed include a graph of time series data from system load tests and a heat map of service request data over an interval of time, both of which are shown below.
At HomeAdvisor we take a lot of pride in our ability to innovate and solve problems. When it came to visualizing system load and correlating it with real-world events, we didn’t have any single tool that could paint a clear picture. What started as a single developer’s side project has now turned into a stable and flexible visualization platform. Not only can we visualize one of our key business metrics (service requests) in real time, but we also have a framework that allows any developer to tap into a wealth of data and create new visualizations with relative ease. And with our Spring Fedex Friday just a month away, we’re looking forward to creating and sharing some more enhancements to both this platform and others.