Making the Most Out of Microservices with Service Mesh


In this article, we talk with Andrew Jenkins, Lead Architect at Aspen Mesh, about moving from monolithic apps to microservices and cut through some of the hype around service mesh for managing microservice architectures. For more on service mesh, consider attending KubeCon + CloudNativeCon EU, May 2-4, 2018 in Copenhagen, Denmark.

1. Microservices are solving many of the problems companies face with monolithic architectures. Where do you see the greatest value?

Jenkins: To me, it’s about minimizing time-to-user-impact. The shift to virtualization and then cloud was all about reducing the complexity associated with all the infrastructure for supporting an app, so that you can flexibly allocate servers and storage and so on. But that shift didn’t necessarily change the apps we build. Now that we’ve got flexible infrastructure, we should build flexible apps to take full advantage of it.

Microservices are those flexible apps: build small, single-purpose blocks and build them rapidly so you can get them into end users’ hands quickly. Organizations can use this to test against real user requirements and build iteratively.

2. As enterprises make the move from monolithic apps to microservices the benefits are clear, but what are some of the challenges companies are running into as they make the move?

Jenkins: Shifting to microservices doesn’t by itself eliminate complexity.  The complexity in any one microservice is small but there is complexity across the entire system.  Fundamentally, companies want to know which service is talking to which, about what, on behalf of whom, and then be able to control that communication with policy.

3. How are organizations attempting to address these challenges?

Jenkins: Some companies add this visibility and policy piece into every application that they build, from day one.  This is especially common when a company invests in custom tooling, workflows, deployment managers and CD pipelines.  Also we find these are usually companies that orient themselves around a few languages and write nearly everything they run themselves.

If your app stack is polyglot and a combination of new development and migrating existing applications, it’s harder to justify adding these pieces to every app individually.  Apps from different teams and externally-developed apps raise this bar more. One approach is to treat those non-conforming apps separately – putting them behind a policy-enforcing proxy or treating them as more of a black box from a visibility perspective.  But, if you don’t have to make this separation, if there’s instead an easy way to get that native-style policy and visibility for any app in any language, then you can see the advantage there. A service mesh is one approach for this.

4. There is a lot of hype around service mesh as the ultimate solution to manage microservice architectures. Your thoughts?

Jenkins: Yeah, it is definitely climbing the hype cycle curve.  It’s not going to be perfect for every situation. If you already have microservices and you feel like you’ve got really good control and visibility, you’ve got a good developer workflow dialed in, then you don’t need to rip everything out and cram in a service mesh tomorrow.  I’d suggest you might still want to understand what’s inside since it may be helpful when your team tackles new languages or environments.

I think we should understand how service mesh consolidates common functionality into a consistent layer. We all love to keep our code DRY (don’t repeat yourself). We know that two look-alike implementations are never quite the same. If you can leverage a service mesh to get one implementation of, say, retry logic that works across your entire infrastructure, that really simplifies things for developers, operators, everyone who works with that system. I bet no one on your team wants to write yet another copy of the retry loop, and especially no one wants to debug the subtle differences between the one written in Go and the one written in Python.
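To make the retry point concrete, here is a minimal sketch of the kind of loop each team ends up writing by hand when there is no shared layer. It is an illustration only, not any service mesh's API: the function name `call_with_retries`, its parameters, and the choice of `ConnectionError` as the retryable failure are all assumptions.

```python
import random
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); retry on ConnectionError with exponential backoff and jitter.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure to the caller
            # The base delay doubles on each attempt; random jitter spreads
            # out clients that would otherwise retry at the same moment.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))
```

Small choices such as which errors count as retryable, how much jitter to add, and when to give up are exactly where a Go copy and a Python copy tend to drift apart, which is the argument for having one implementation in a shared layer.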

5. As the number of services to monitor increases, each of these is highly likely to:

– Use different technologies / languages

– Live on a different machine / container

– Have its own version control

How does a service mesh address these disparities?

Jenkins: Service mesh’s first promise is to do the same thing (that visibility and control piece) for microservices written in any language, for any application stack.  Next, when you think about different containers talking to each other, there’s a lot that could be relevant at that layer that a service mesh could help with. For instance, do you believe in securing each individual running container rather than perimeter (firewall) security?  Then use a service mesh to provide mTLS from container to container.

I’m also seeing that version control differences are the manifestation of deeper application lifecycle differences.  So this team uses such-and-such version control, an extensive qualification phase and careful upgrade strategy because they’re providing one of the most core services that everyone relies on.  Another team working on a brand new prototype service has a different policy but you for sure want to ensure they’re not writing to the production database, say. Fitting their “square peg workflow” into your “round hole process” isn’t the right thing.

You can use a service mesh to graft these different apps and services into the system in a way that’s appropriate for them.  Now obviously you want to use some judgement and not make bespoke pegs for every single little microservice but we’re hearing a lot of interest in service mesh to help smooth out the differences between these lifecycles and expectations.  Again, it’s all about providing that rapid iterability but without giving up the visibility and control.

6. Control plane vs data plane: where does service mesh provide value for each?

Jenkins: It’s remarkable how easy it is to start making a web service today.  You can fit the code in a tweet. This isn’t a real web service, though.  To make it resilient and scalable you need to add some stuff to the data plane of the app.  It needs to do TLS, and it needs to retry failures, and it needs to only accept requests from this service but not that one, and it needs to check the user’s authentication, and so on.  A service mesh can help you get that data plane functionality without having to add code to the app.

Also, since that’s now in the data plane layer, there’s an ability to upgrade and enhance that layer without modifying the application.
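As a rough sketch of what "without having to add code to the app" means, the example below wraps an unchanged handler with one of the checks mentioned above: accept requests from this service but not that one. A real mesh would enforce this in a proxy outside the process and would take the caller identity from an mTLS certificate. Here the identity is just a dictionary field, and `with_caller_policy`, `app_handler`, and the service names are hypothetical.

```python
def with_caller_policy(handler, allowed_callers):
    """Wrap handler so only requests from allowed caller identities reach it."""
    allowed = frozenset(allowed_callers)

    def guarded(request):
        # Assumption: the caller identity arrives as a plain field. In a mesh
        # it would be derived from the peer's mTLS certificate by the proxy.
        caller = request.get("caller")
        if caller not in allowed:
            return {"status": 403, "body": "caller not allowed"}
        return handler(request)

    return guarded


def app_handler(request):
    # The application code stays unaware of the policy wrapped around it.
    return {"status": 200, "body": "hello " + request["path"]}


# Only the (hypothetical) "checkout" service may call this app.
guarded_app = with_caller_policy(app_handler, allowed_callers={"checkout"})
```

Because the policy lives in the wrapper rather than in `app_handler`, the allow-list can be changed or the check upgraded without touching the application itself.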

A service mesh brings consistency to the control plane for your microservices.  Container orchestration systems like Kubernetes provide a common way of describing what containers you want running.  It’s not that you can’t run containers without them, it’s that once you’re beyond running a handful of containers, you want a consistent way to run them all.  Service mesh is like that, for the communication between containers.

7. The buzzword around service mesh is “observability”. Can you share a bit on the real world benefits observability provides?

Jenkins: We’ve talked to one team that told us about a time they spent hours on the phone trying to solve some issue that spanned lots of services and components. They had collected lots of data from each service, and they knew the answer was in that sea of data somewhere. But they spent so much time translating between each snapshot of the information. They didn’t have confidence that each step in that translation was correct; after all, if they understood what was going on, they would have engineered out the problem in the first place. On top of this, it isn’t always clear where to start looking.

What they asked for was one view – all the information across services collected in one place, and the most important information for their issue right at the top.  Again, service mesh is not a panacea, and I won’t promise that you’ll never have to look at a log file again. But my goal would be that once this team has a service mesh, they are always confident that they’ve got good observations on what went into and out of every microservice, and the service mesh has already pointed them in the right direction.
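The "one view" idea can be sketched as a small aggregation: take request records observed at every hop, group them by which service called which, and put the worst-behaving hop at the top. The record fields (`source`, `destination`, `status`) and the function name `summarize_edges` are assumptions made for this example, not the output format of any particular mesh.

```python
from collections import defaultdict


def summarize_edges(records):
    """Group request records by (source, destination) and rank by error rate."""
    totals = defaultdict(lambda: {"requests": 0, "errors": 0})
    for rec in records:
        edge = totals[(rec["source"], rec["destination"])]
        edge["requests"] += 1
        if rec["status"] >= 500:  # count server-side failures as errors
            edge["errors"] += 1
    summary = [
        {
            "source": src,
            "destination": dst,
            "requests": t["requests"],
            "error_rate": t["errors"] / t["requests"],
        }
        for (src, dst), t in totals.items()
    ]
    # Worst edges first, so the most suspicious hop is at the top.
    summary.sort(key=lambda row: row["error_rate"], reverse=True)
    return summary
```

Since every record has the same shape regardless of which language the service was written in, nobody has to translate between per-service snapshots before starting to look for the fault.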

To me, observability is about more than just collecting a lot of datapoints.  This is about getting the smart brains applied to the true fault in the system as quickly as possible.

8. What do you see for the future of service mesh?

Jenkins: I think that the various implementations are providing a compelling toolbox of policies and components.  I’m glad that we’re leveraging lessons learned from pioneers of microservices in building this common service mesh layer.

The next step is going to be choosing how to use that toolbox to solve problems.  Organizations are going to want some consistency in what policies get deployed: The challenge will be to combine the interests of app devs, InfoSec, and platform teams so that all their policy comes together in the service mesh.

On a bit of a technical nuance, we’ve seen service meshes that leverage what’s called a sidecar model for integration and service meshes that do not. A sidecar feels natural for an app enhancement layer, but we’re not used to that for layers that we consider infrastructure.

Once we write our apps from day one to rely on this service mesh, we’ll have the opportunity for fine-grained but high-level control over applications.  Every app will have advanced retry logic, security, visibility, etc. built in from day one. First, that’s going to change the way we develop and test applications.  I think it’s also going to open doors for cross-application policies we haven’t thought of yet.

Learn more at KubeCon + CloudNativeCon Europe, coming up May 2-4 in Copenhagen, Denmark.