Design and Implementation of Microservices by @samnewman (@NDC

SOAP or HTTP-services?

HTTP-services (sometimes referred to as REST services, even though they don't follow the required REST constraints) benefit a lot more from HTTP infrastructure than SOAP services do. SOAP services tend to be easier to get started with, however, because of rigorous frameworks in most languages and on most platforms. Sam argued that RPC gets you started quicker - bigger initial bang for the buck - but that you pay for it later with greater complexity, as you need to address scaling, and that the reverse is true for HTTP-services. I agree.

"Don't let the data model affect your API or Service Model" was his next piece of advice. "Try to go the first few iterations without any persistence at all" (or write to a local, plain text file). This is to hammer out your API and service interaction before locking yourself down to a storage model.
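To make that advice concrete, here is a minimal sketch of a service that is designed against a tiny storage interface and backed by a plain dictionary. All names here (Order, OrderStore, InMemoryOrderStore, OrderService) are my own, made up for illustration; they are not from Sam's course.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Order:
    # Hypothetical domain object, used only for this illustration.
    order_id: str
    item: str
    quantity: int


class OrderStore(Protocol):
    """What the service needs from storage, and nothing more."""

    def save(self, order: Order) -> None: ...

    def get(self, order_id: str) -> Optional[Order]: ...


class InMemoryOrderStore:
    """Throwaway store for the first iterations: a dict, no database."""

    def __init__(self) -> None:
        self._orders: Dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order

    def get(self, order_id: str) -> Optional[Order]:
        return self._orders.get(order_id)


class OrderService:
    """The API under design; it only knows the OrderStore interface."""

    def __init__(self, store: OrderStore) -> None:
        self._store = store

    def place_order(self, order_id: str, item: str, quantity: int) -> Order:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        order = Order(order_id, item, quantity)
        self._store.save(order)
        return order

    def find_order(self, order_id: str) -> Optional[Order]:
        return self._store.get(order_id)
```

Once the API has settled, a database-backed class with the same two methods could replace InMemoryOrderStore without touching OrderService.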

Splitting Services

Why would you want to split your services? Maybe your service has grown out of hand and there's a separate team who could be responsible for a part of its functionality. Here's how it's done:

Find the functionality that you want to split out
If you're on the .NET platform, there's a tool for that! - NDepend (which has reached version 6 at the time of writing this post).
Group functionality into modules (namespaces in .NET land)
For these refactorings, I use Resharper.
Extract your modules on some form of drawing surface and draw relationships between them.
Question dependencies - do your code dependencies make sense in the real world (i.e. are they applicable in your domain)? Should they belong together?
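If you would rather start from data than from a drawing, the "question dependencies" step can be roughed out in code. The sketch below assumes you have already exported a map of module name to the modules it depends on (from NDepend or by hand); the function name and the sample module names are my own invention. It only flags pairs of modules that depend on each other, which are usually the first relationships worth questioning.

```python
from typing import Dict, List, Set, Tuple


def mutual_dependencies(deps: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return sorted pairs of modules that depend on each other."""
    pairs: Set[Tuple[str, str]] = set()
    for module, targets in deps.items():
        for target in targets:
            # A pair is suspicious when the dependency goes both ways.
            if module != target and module in deps.get(target, set()):
                first, second = sorted((module, target))
                pairs.add((first, second))
    return sorted(pairs)


# Hypothetical module graph, for illustration only.
example = {
    "finance": {"customer"},
    "warehouse": {"customer", "finance"},
    "customer": {"finance"},
}
print(mutual_dependencies(example))
```

A two-way dependency does not have to be wrong, but it is a good place to ask whether the two modules really are separate things in your domain.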

What about databases?

Separate your database (by introducing views and abstracting access through stored procedures) before splitting out your service (this is the equivalent of the "group functionality into modules" step for your code). Move all shared repository code into each of your modules, eliminating any common data access layer you might have from a previous world.

Sam recommended that we read the book Refactoring Databases: Evolutionary Database Design and look at SchemaSpy to visualize our relational dependencies.

When extracting your service, don't be afraid of increasing the number of RPC calls and database operations: Yes, we are making our operations a bit slower, but are we making them too slow? To answer that question, we need to start measuring our execution times. The benefits we are after here are not related to application performance, but to modularity, autonomy and service orientation.
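Measuring does not need heavy tooling to begin with. As a minimal sketch, the context manager below records wall-clock durations per named operation using only the standard library; the names measured and timings are my own, and a real system would more likely ship these numbers to a metrics service.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Collected durations in seconds, keyed by operation name.
timings: Dict[str, List[float]] = {}


@contextmanager
def measured(name: str) -> Iterator[None]:
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)


# Usage: wrap the call you suspect has become slower after the split.
with measured("load_customer"):
    time.sleep(0.01)  # stand-in for an RPC call or database operation

print(timings["load_customer"])
```

Comparing these numbers before and after extracting a service tells you whether "a bit slower" has become "too slow".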

Many of us have reference data in our databases - things like country codes. Do we update this data often enough to warrant database storage? If not, consider creating an enum representation of the data in your code instead. If you have a larger amount of reference data, introduce its own information service.
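As an illustration of the enum approach, here is a small sketch with three country codes. The selection of countries and the parse_country helper are my own; the point is only that a lookup table that rarely changes can live in code.

```python
from enum import Enum


class Country(Enum):
    # A deliberately tiny subset, for illustration only.
    SWEDEN = "SE"
    NORWAY = "NO"
    DENMARK = "DK"


def parse_country(code: str) -> Country:
    """Turn a two-letter code into a Country; raises ValueError if unknown."""
    return Country(code.strip().upper())


print(parse_country("se").name)
```

The trade-off is that adding a country now requires a deployment, which is fine as long as the data really is that stable.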

Next, Sam showed us a diagram similar to the one below, where two services were acting on the same database record:

"What does this diagram tell us?", he asked the class. We need to reify by identifying and introducing the missing concept - Customer - letting the finance service maintain its finance records (referring to the customer by its Customer Id) and letting the warehouse service maintain its warehousing records, again referring to the Customer by its Id:

Reification is the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language. A computable/addressable object — a resource — is created in a system as a proxy for a non computable/addressable object. By means of reification, something that was previously implicit, unexpressed, and possibly inexpressible is explicitly formulated and made available to conceptual (logical or computational) manipulation. -- Wikipedia
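In code, the reified model could look roughly like the sketch below. The field names and the customer_overview helper are assumptions of mine, not something Sam showed; what matters is that Customer is now an explicit concept and that the finance and warehouse records only carry its Id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # The newly introduced concept, owned by its own service.
    customer_id: str
    name: str


@dataclass(frozen=True)
class FinanceRecord:
    # Owned by the finance service; refers to the customer by Id only.
    customer_id: str
    outstanding_balance: int


@dataclass(frozen=True)
class WarehouseRecord:
    # Owned by the warehouse service; refers to the customer by Id only.
    customer_id: str
    pending_shipments: int


def customer_overview(
    customer: Customer, finance: FinanceRecord, warehouse: WarehouseRecord
) -> str:
    """Combine data from the three owners, joined on the Customer Id."""
    if not (customer.customer_id == finance.customer_id == warehouse.customer_id):
        raise ValueError("records belong to different customers")
    return (
        f"{customer.name}: balance {finance.outstanding_balance}, "
        f"{warehouse.pending_shipments} shipment(s) pending"
    )
```

Neither service needs to touch the other's record any more; anything that wants the combined picture asks each owner and joins on the Id.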

As you go down the path to find services to extract, you are going to find some parts that are more integrated than others. Do yourself a favour and do not start with the most intricate parts. Start small:
Draw conceptual boundaries on a whiteboard, then address your code structure (refactor your code into modules), then start looking at service boundaries. Look for stateless things, things with low coupling.

Strangler pattern

Sometimes, it's easier to just write a new service than to move its code. In chapter 4 of Sam's book - Building Microservices - when talking about integration patterns, he describes using the Strangler Application Pattern to "capture and intercept calls to the old system" allowing you to "decide if you route these calls to existing, legacy code, or direct them to new code you may have written. This allows you to replace functionality over time without requiring a big bang rewrite".
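A stripped-down way to picture the pattern is a router that sits in front of the old system and sends each call either to legacy code or to new code, based on a list of routes that have been migrated so far. Everything in this sketch (StranglerRouter, the handler functions, the paths) is hypothetical and only meant to show the interception idea, not how the book implements it.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_system(path: str) -> str:
    # Stand-in for the existing, legacy code.
    return f"legacy handled {path}"


def new_invoice_service(path: str) -> str:
    # Stand-in for new code written to replace one slice of the old system.
    return f"new service handled {path}"


class StranglerRouter:
    """Intercepts every call and decides where it should go."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Route all paths starting with prefix to the new handler."""
        self._migrated[prefix] = handler

    def handle(self, path: str) -> str:
        # The longest matching prefix wins; unmatched calls go to legacy.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](path)
        return self._legacy(path)


router = StranglerRouter(legacy_system)
router.migrate("/invoices", new_invoice_service)
print(router.handle("/invoices/42"))
print(router.handle("/orders/1"))
```

Each call to migrate moves one more slice of functionality over, which is what lets the replacement happen over time instead of in one big rewrite.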

Continuous Integration

To achieve continuous integration, we need to

Check in code to our main branch at least once a day
Run all our tests as we check in
When the build breaks, it's your team's top priority to fix it.

Warning: By keeping the server and client in the same software repository, you may encourage breaking changes, since they stay invisible when you check in the change to both sides of the network boundary in one go.

It's of paramount importance that you build your artifact once (as opposed to building it once for test and then once more for production). If you don't, you cannot be 100% sure it's the same code in the various environments (debug vs. release builds, compiler flags and more). Move your created artifacts through your environments as they mature. To accomplish this, we need to keep our configuration separate from our build.
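One common way to keep configuration out of the build is to let the artifact read its settings from environment variables at startup. The sketch below assumes that approach; the variable names (APP_ENVIRONMENT, APP_DATABASE_URL, APP_LOG_LEVEL) and the Settings class are made up for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    environment: str
    database_url: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment; the artifact itself never changes."""
    return Settings(
        environment=env["APP_ENVIRONMENT"],    # required, KeyError if missing
        database_url=env["APP_DATABASE_URL"],  # required, KeyError if missing
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
    )


# The same code, configured for two environments (values are illustrative).
test_settings = load_settings(
    {"APP_ENVIRONMENT": "test", "APP_DATABASE_URL": "sqlite:///test.db"}
)
prod_settings = load_settings(
    {
        "APP_ENVIRONMENT": "production",
        "APP_DATABASE_URL": "postgresql://db.internal/app",
        "APP_LOG_LEVEL": "WARNING",
    }
)
print(test_settings.log_level, prod_settings.log_level)
```

With this split, the exact artifact that passed the tests is the one that reaches production; only the values handed to it differ.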

Packaging our artifacts as OS-specific packages (RPM, MSI ...) introduces a very comfortable abstraction layer which benefits from a lot of existing, specially adapted tools.

When we have graduated from OS-specific packages, Sam recommends looking into custom images, such as Amazon Machine Images, pre-configured virtual machines or containers, such as Docker containers.

Packer is a tool for creating machine and container images for multiple platforms from a single source configuration.
Vagrant is a tool for building complete development environments.

The immutable server pattern can be implemented with containers, since you'll be removing direct access to the server. If you need to make a change to your environment - running Docker - you change the Dockerfile and re-create the environment from that file. Thus, an always up-to-date description of your environment will be safely stored in your source control.

As with any technology, we will benefit highly from understanding how it really works before we utilize it. As of today, there are some security considerations we need to be aware of when going into container land. For example, in his post on Security Risks and Benefits of Docker Application Containers, product management director at NCR Corp and security enthusiast, Lenny Zeltser says that

As of this writing, Docker does not provide each container with its own user namespace, which means that it offers no user ID isolation. A process running in the container with UID 1000, for example, will have the privileges of UID 1000 on the underlying host as well.

He also mentions that "Docker aims to address this limitation, so that a container’s root user could be mapped to a non-root user on the host", which means that there are workarounds, but that you need to know about them if you are going to stay productive with this tool.

Sam mentioned that 30% of the currently "trusted" packages in the Docker Hub (the public Docker container image repository) contained critical security flaws the last time he checked. Thus, he urged us to "think twice before reusing public packages".

Boot2Docker

The Docker Engine uses Linux-specific kernel features, so to run it on Windows or OS X we need to use a lightweight virtual machine (VM). The Boot2Docker application includes a VirtualBox Virtual Machine (VM), Docker itself, and the Boot2Docker management tool. You use the Windows/OS X Docker Client to control the virtualized Docker Engine to build, run, and manage Docker containers.

Although you will be using a Docker client for your specific OS, the docker engine hosting the containers will still be running on Linux. Until the Docker engine for Windows is developed, you can launch only Linux containers from your Windows machine.

By dedicating one service to one host, we can independently assign hardware resources, decide on patch levels and runtimes, the latter ensuring that we can upgrade runtimes on services that require new features, while letting other services maintain their stability by not updating theirs.

Terraform and Azure Resource Manager

Sam recommended that we take a look at Terraform, which pitches itself as providing "a common configuration to launch infrastructure — from physical and virtual servers to email and DNS providers. Once launched, Terraform safely and efficiently changes infrastructure as the configuration is evolved." -- terraform.io. From its description, it sounds similar to Azure Resource Manager. In fact, it might've been an inspiration, seeing as ARM was quite recently announced.

Service location - DNS

Talking across our system when it's separated into individual service nodes requires some form of strategy. One strategy could be to use convention-based DNS records to, say, identify the inventory service in the development environment by inventory-dev.mycompany.com. However, changes to DNS records can be slow to take effect (with a default Time To Live value of 300 seconds, i.e. 5 minutes), they are often hard to configure and, most despicably in this case, they are often cached - in multiple locations! Sam mentioned that the JVM has a DNS cache of its own. Luckily, I couldn't find anything that said that the .NET Framework did the same.
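The naming convention itself is simple enough to capture in a helper. This is a minimal sketch that builds names on the service-environment.domain pattern from the example above; the function name and the rough validation are my own assumptions.

```python
def service_hostname(
    service: str, environment: str, domain: str = "mycompany.com"
) -> str:
    """Build a host name following the <service>-<environment>.<domain> convention."""
    for part in (service, environment):
        # Rough sanity check: letters, digits and hyphens only.
        if not part or not part.replace("-", "").isalnum():
            raise ValueError(f"invalid name part: {part!r}")
    return f"{service}-{environment}.{domain}".lower()


print(service_hostname("inventory", "dev"))
```

Consumers then only need to know the service name and which environment they run in; the caching and TTL concerns described above still apply to the records behind those names.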

Service discovery

In a service discovery environment, services register themselves against a known endpoint during startup. This registration service could handle subscriptions too, so that it could update consumers whenever a new service registers itself with a given name. One tool that does this, and provides configuration management on the side, is Consul. It is a DNS server in itself and offers a REST API that can be utilized to read endpoints and configuration settings, such as DB connection strings. With it (or your own configuration service), you could publish (turn on) feature flags to a percentage of your system for A/B-testing. For more information in this general area, Sam recommended that we look into Aphyr's blog.
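To illustrate the register-and-subscribe idea, here is a toy, in-process registry. It is not how Consul works internally and it does not use Consul's API; the class, its methods and the endpoint address are all invented for this sketch.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber receives the service name and the newly registered endpoint.
Subscriber = Callable[[str, str], None]


class ServiceRegistry:
    """Toy registry: services register by name, consumers subscribe by name."""

    def __init__(self) -> None:
        self._endpoints: DefaultDict[str, List[str]] = defaultdict(list)
        self._subscribers: DefaultDict[str, List[Subscriber]] = defaultdict(list)

    def register(self, name: str, endpoint: str) -> None:
        """Called by a service during startup."""
        if endpoint in self._endpoints[name]:
            return  # already known, nothing to announce
        self._endpoints[name].append(endpoint)
        for notify in self._subscribers[name]:
            notify(name, endpoint)

    def subscribe(self, name: str, callback: Subscriber) -> None:
        """Called by a consumer that wants to hear about new endpoints."""
        self._subscribers[name].append(callback)

    def lookup(self, name: str) -> List[str]:
        """Return a copy of the endpoints currently registered under name."""
        return list(self._endpoints.get(name, []))


registry = ServiceRegistry()
registry.subscribe("inventory", lambda name, endpoint: print("new:", name, endpoint))
registry.register("inventory", "http://10.0.0.5:8080")
print(registry.lookup("inventory"))
```

A real registry also has to deal with health checks, services that disappear and network partitions, which is where a tool like Consul earns its keep.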

Continue to part 3...
