Microservices are currently the “in” thing in software development, and for good reason. They help us build more loosely coupled, highly cohesive, modular systems that work well in a continuous deployment model. Deploying a small change to a single microservice is much less risky than deploying a huge monolithic application with several weeks’ or months’ worth of changes. At HomeAdvisor, we jumped on the microservice bandwagon a couple of years ago and have found it to be a (mostly) positive experiment.
One question many struggle with, including us, is this: What, exactly, does a microservice look like? That is, what are the important things to keep in mind when writing a microservice?
In our time working with this architectural pattern, we’ve learned a thing or two about what makes a good microservice. We’ve identified at least four main characteristics that all microservices should have: they have a single responsibility, share nothing, are monitored, and run as a cluster. In this post we’ll explore these in a little more depth.
Many times, the question is asked, “How big should a microservice be?” You may have heard an answer like, “No bigger than can fit in your head”. Perhaps you’ve seen answers that include heuristics around lines of code or number of classes.
The best answer I’ve seen? “As big as it needs to be.”
A microservice should have a single responsibility. That is, it does one thing, regardless of how “big” it gets. A microservice encapsulates a bounded domain context. It is responsible for a particular domain concept within your overall application.
Figuring out what the domain boundaries are can be difficult. In fact, you probably won’t get it 100% right the first time. Building microservices should be an iterative process: add functionality as needed, and combine services or split them apart when it makes sense to do so.
One way to think about this was popularized by “Uncle” Bob Martin: Things that change together belong together. If you find that every time you want to change some functionality, you have to touch the same two or three microservices, that is probably an indication that the domain boundaries were incorrectly drawn. The functionality of those services should be combined into a common place.
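One rough way to spot this signal is to count how often pairs of services are touched by the same change. The sketch below assumes you can export a change history as lists of service names; the function name, the service names, and the history itself are hypothetical and only illustrate the idea.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(changes):
    """Count how often each pair of services is touched by the same change."""
    counts = Counter()
    for services in changes:
        # Sort and de-duplicate so ("billing", "orders") is always one key.
        for pair in combinations(sorted(set(services)), 2):
            counts[pair] += 1
    return counts


# Hypothetical history: each entry lists the services one feature touched.
history = [
    ["orders", "billing"],
    ["orders", "billing", "search"],
    ["search"],
    ["billing", "orders"],
]

counts = co_change_counts(history)
for pair, times in counts.most_common():
    print(pair, times)
```

A pair that sits at the top of this list release after release is a candidate for merging, or at least for a second look at where the boundary was drawn.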
If you think about a well-designed class in an object-oriented model, it encapsulates a particular concept by defining private and public data and methods. External users of the class only see the public methods. Developers are free to fiddle with the private stuff as long as the contract defined by the public members is maintained.
Consider microservices as really big classes. They have a public API in the form of REST endpoints, message queues, etc., as well as internal state. Anything inside the microservice may be changed or updated, as long as those public contracts are maintained.
This means that a microservice should take a “share nothing” philosophy. This applies to its implementation as well as its data.
As already mentioned, a microservice’s private implementation should be free to change at any time. If a microservice relies on a shared library for its functionality, that becomes harder to enforce. I’m not talking about utility libraries with common string manipulation functions, collection helpers, and the like, but libraries containing domain-specific business logic.
Libraries that contain domain logic should not be shared with other microservices and applications. Doing so makes it harder to update the internal logic of a microservice without affecting other applications. Even if the library is versioned appropriately, you could still end up in a state where some applications are running an outdated version of the business logic. And even if the logic itself is unchanged, performance enhancements may not get rolled out to every application using that library.
Again, if multiple applications have to change at the same time, that’s usually a sign that the functionality should be pulled into a common place.
Not only should a microservice keep its implementation private, it should keep its data store private. How data is stored is an implementation detail. It’s not usually relevant to the public APIs it exposes.
In many organizations, ours included, the software has grown up around a single, shared database. This makes sense, as most software projects start out as a more monolithic application and grow from there. It can be expensive to set up a database and maintain it, even if you’re using an open source database. As a result all the functionality tends to use this central database, if for no other reason than “because it’s there.”
However, as you move into a microservice world, the domain boundaries become clearer. Which service owns what data is now an important consideration. With all the data in a single database, it’s very tempting to just join to the table that contains the information you need, rather than calling out to another microservice. But doing so just couples the two services in a way that may not have originally been intended.
In an ideal scenario, a microservice should be free to swap out its data store, with none of its clients the wiser. Maybe a microservice is fairly write-heavy, so moving from a relational database to Cassandra makes sense. As long as the public contract is maintained, it shouldn’t matter. But if other services are depending on those same database tables, it becomes harder to make that transition.
Even if you’re not changing database providers, making changes to the existing schema (adding columns, rearranging the table relationships, etc.) becomes harder and requires more coordination between services if they share a database table.
Think about it this way: if you share your database with other applications, that database schema has now become part of the public API for that service. Any changes to that data store must be backwards compatible with existing clients.
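To make the “swap out the data store” idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `ProfileService`, `ProfileStore`, and the two in-memory stores are hypothetical names, and the stores merely stand in for, say, a relational database and a write-optimized store. The point is only that clients see the service’s public methods and never the storage behind them.

```python
from typing import Optional, Protocol


class ProfileStore(Protocol):
    """Private storage contract; never exposed to clients of the service."""

    def save(self, profile_id: str, name: str) -> None: ...
    def load(self, profile_id: str) -> Optional[str]: ...


class DictStore:
    """Hypothetical stand-in for the original data store."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, profile_id: str, name: str) -> None:
        self._rows[profile_id] = name

    def load(self, profile_id: str) -> Optional[str]:
        return self._rows.get(profile_id)


class AppendOnlyLogStore:
    """Hypothetical replacement with a different layout: latest write wins."""

    def __init__(self) -> None:
        self._log = []

    def save(self, profile_id: str, name: str) -> None:
        self._log.append((profile_id, name))

    def load(self, profile_id: str) -> Optional[str]:
        for pid, name in reversed(self._log):
            if pid == profile_id:
                return name
        return None


class ProfileService:
    """The public contract: clients depend on these methods and nothing else."""

    def __init__(self, store: ProfileStore) -> None:
        self._store = store

    def rename(self, profile_id: str, name: str) -> dict:
        self._store.save(profile_id, name)
        return {"id": profile_id, "name": name}

    def get(self, profile_id: str) -> Optional[dict]:
        name = self._store.load(profile_id)
        return None if name is None else {"id": profile_id, "name": name}


# The same client code works against either store.
for store in (DictStore(), AppendOnlyLogStore()):
    service = ProfileService(store)
    service.rename("42", "Ada")
    service.rename("42", "Grace")
    print(type(store).__name__, service.get("42"))
```

If another application read the underlying rows directly instead of calling `get`, the second store could not be introduced without breaking it, which is exactly the coupling described above.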
We started our journey towards microservices with only one or two microservices. As of this writing, we have close to 150 microservice instances running in production. Keeping track of a handful of microservices and how well they’re doing is one thing; understanding the state and health of 150+ is quite another. Netflix, which has pioneered much of the work around microservices, is reported to have 500+ microservices running.
This means that your monitoring systems must evolve. I once heard someone say, “If it isn’t monitored, it doesn’t exist.” This is especially true with microservices. Without proper monitoring, it would be very easy for a sick microservice to go unnoticed in the crowd of applications.
At HomeAdvisor, we leverage our service discovery mechanism, along with the Dropwizard health check framework, to watch the health of all our applications. Our systems can find where microservices are running, determine if we have enough of them, ask about their health status, and report problems to Nagios so that the appropriate on-call people are notified.
In addition, log aggregation systems like Sumo Logic and Application Performance Monitoring (APM) software such as AppDynamics allow us to see the health of our services over time, look for trends and outliers, and dig into issues that appear.
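The health-checking loop described above (find the instances, check that there are enough, ask each one how it feels, report problems) can be sketched in a few lines of Python. This is not our actual tooling: the registry dictionary stands in for a service discovery lookup, each health check is a plain callable rather than an HTTP call, and `find_problems` is a hypothetical name. A real setup would hand the resulting messages to an alerting system.

```python
from typing import Callable, Dict, List

HealthCheck = Callable[[], bool]


def find_problems(
    registry: Dict[str, List[HealthCheck]],
    minimum: Dict[str, int],
) -> List[str]:
    """Return one message per service with too few healthy instances."""
    problems = []
    for service, required in minimum.items():
        instances = registry.get(service, [])
        healthy = 0
        for check in instances:
            try:
                if check():
                    healthy += 1
            except Exception:
                # A check that blows up counts as an unhealthy instance.
                pass
        if healthy < required:
            problems.append(
                f"{service}: {healthy} healthy of {len(instances)} running, "
                f"need {required}"
            )
    return problems


def broken_check() -> bool:
    raise RuntimeError("connection refused")


# Hypothetical snapshot of what service discovery might return.
registry = {
    "orders": [lambda: True, lambda: False],
    "search": [lambda: True, lambda: True],
    "email": [broken_check],
}
minimum = {"orders": 2, "search": 2, "email": 1, "billing": 1}

problems = find_problems(registry, minimum)
for message in problems:
    print(message)
```

Note that `billing` is flagged even though it has no instances registered at all. Checking against an expected minimum, rather than only polling what discovery returns, is what keeps a missing service from silently disappearing in the crowd.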
Even though microservices keep their data private, they like to run in packs. Having a single point of failure in your overall system isn’t usually a good idea, so we spin up multiple copies of our applications. If one instance goes down, we have others to pick up the slack and serve traffic until our monitoring system can alert someone and get the missing instance restarted. Running multiple copies also means that if there is an increase in traffic, we can add instances to support the additional load. We may even choose to replicate data between the instances so that they’re not all reliant on a single data store.
The implication here is that microservices should be written with the assumption that more than one will be running. If there’s an operation that should only run once, such as a scheduled task, the members of the microservice cluster will need to coordinate among themselves using something like Apache ZooKeeper.
This is easy to forget during development, where there’s usually just the one instance running on the developer’s laptop.
For example, suppose your microservice runs a daily report and emails the results to a distribution list. In development, this works fine. The report is run and the email is received. Test passed! Commit to master, we’re done! But then the microservice hits an environment (hopefully a test environment) and everyone gets three copies of the report! This scenario is quite likely unless the microservice is built with the understanding that there will be more than one running.
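One way to avoid the triple email is to have each instance try to claim the task before running it, so that only the first claim succeeds. The sketch below is a toy version of that idea: `InMemoryCoordinator` is a hypothetical, in-process stand-in for a real coordination service such as ZooKeeper, and the task key, instance names, and date are made up for illustration.

```python
import threading


class InMemoryCoordinator:
    """Hypothetical stand-in for a shared coordination service."""

    def __init__(self) -> None:
        self._claims = {}
        self._lock = threading.Lock()

    def try_claim(self, task_key: str, instance_id: str) -> bool:
        """Atomically claim task_key; only the first caller succeeds."""
        with self._lock:
            if task_key in self._claims:
                return False
            self._claims[task_key] = instance_id
            return True


def run_daily_report(coordinator, instance_id, day, send_email) -> bool:
    """Send the report only if this instance wins the claim for the day."""
    if coordinator.try_claim(f"daily-report/{day}", instance_id):
        send_email(f"Report for {day} (sent by {instance_id})")
        return True
    return False


# Three instances of the same service all wake up for the same schedule.
coordinator = InMemoryCoordinator()
sent = []
results = [
    run_daily_report(coordinator, f"instance-{i}", "2024-01-15", sent.append)
    for i in range(3)
]
print(results)
print(sent)
```

Because the claim lives in process memory here, this only demonstrates the logic. In a real cluster the claim has to live in something every instance can reach, and you also need to decide what happens if the winning instance dies before the email goes out.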
Microservices are a great architectural pattern when you’re ready for them. As with any architectural pattern, there are some important things to keep in mind, and there is much to learn. While we’re still learning ourselves, we’ve discovered some of the essential microservice characteristics. Some follow well-known patterns, such as having a single responsibility and encapsulation. Others, while not new, are much more important when working in the world of microservices, such as monitoring and clustering of services.
Hopefully this will help as you too work to implement microservices in your own organization.