OpenTracing: Microservices in Plain View

By Ben Sigelman (@el_bhs), OpenTracing co-author

Those building microservices at scale understand the role and importance of distributed tracing: after all, it’s the most direct way to understand how and why complex systems misbehave. When we deployed Dapper at Google in 2005, it was like someone finally turned the lights on: everything from ordinary programming errors to broken caches to bad network hardware to unknown dependencies came into plain view.

A screenshot illustrating the multi-process trace of a production workflow

Everyone running a complex distributed system deserves — no, needs — this sort of insight into their own software. So why don’t they already have it?

The problem is that distributed tracing has long harbored a dirty secret: the necessary source code instrumentation has been complex, fragile, and difficult to maintain.

This is the problem that OpenTracing solves. Through standard, consistent APIs in many languages (Java, JavaScript, Go, Python, C#, and others), the OpenTracing project gives developers clean, declarative, testable, and vendor-neutral instrumentation. Three constituencies care about OpenTracing:

  1. Application developers want the flexibility to choose or swap out a tracing system without touching their instrumentation. They also need the instrumentation in their web framework to be compatible with the instrumentation in their RPC system or database client.

  2. Open source package developers need to make their code visible to tracing systems, but they have no way of knowing which tracing system the containing process happens to use. Moreover, for services and RPC frameworks, there’s no way to know specifically how the tracing system needs to serialize data in-band with application requests.

  3. Tracing vendors can’t instrument the world N times over; by using OpenTracing, they can achieve coverage across a wide swath of both open source and proprietary code in one fell swoop.
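
To make the three points above concrete, here is a minimal sketch of what vendor-neutral instrumentation can look like. It is an illustration only, not the actual OpenTracing API: the `Tracer` interface, the `NoopTracer` and `RecordingTracer` classes, the `client_call` and `server_handle` functions, and the `x-trace-id` header are all hypothetical names chosen for this example. The instrumented functions depend only on the interface, so the tracing system behind it can be swapped without touching them, and the trace context travels in-band with the request.

```python
from abc import ABC, abstractmethod
from contextlib import contextmanager
import uuid


class Tracer(ABC):
    """Hypothetical vendor-neutral interface (not the real OpenTracing API)."""

    @abstractmethod
    def start_span(self, operation, parent_trace_id=None):
        """Return a context manager that yields the span's trace id."""

    def inject(self, trace_id, carrier):
        # Serialize trace context in-band; the header name is an assumption.
        carrier["x-trace-id"] = trace_id

    def extract(self, carrier):
        # Recover trace context from an incoming request, if any.
        return carrier.get("x-trace-id")


class NoopTracer(Tracer):
    """Used when the containing process has no tracing system configured."""

    @contextmanager
    def start_span(self, operation, parent_trace_id=None):
        yield parent_trace_id or "noop"


class RecordingTracer(Tracer):
    """Stand-in for a vendor implementation; keeps finished spans in memory."""

    def __init__(self):
        self.finished = []

    @contextmanager
    def start_span(self, operation, parent_trace_id=None):
        trace_id = parent_trace_id or uuid.uuid4().hex
        try:
            yield trace_id
        finally:
            # A span is recorded when it finishes, so inner spans land first.
            self.finished.append((trace_id, operation))


# Instrumented code below never names a concrete tracing system.
def client_call(tracer, headers):
    with tracer.start_span("client.request") as trace_id:
        tracer.inject(trace_id, headers)
        return server_handle(tracer, headers)


def server_handle(tracer, headers):
    parent = tracer.extract(headers)
    with tracer.start_span("server.handle", parent_trace_id=parent):
        return "ok"
```

Calling `client_call(RecordingTracer(), {})` and `client_call(NoopTracer(), {})` runs the same instrumented code against two different backends. In a real deployment the client and server would be separate processes, and the carrier would be the headers of an actual RPC or HTTP request.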

As OpenTracing gains traction with each constituency above, it becomes more valuable for the others, fostering a virtuous cycle. We have seen this at play with application developers adding instrumentation for their important library dependencies, and with community members building adapters from OpenTracing to tracing systems like Zipkin in their favorite languages.

Last week, the OpenTracing project joined the Cloud Native Computing Foundation (CNCF). We respect and identify with the CNCF charter, and of course it’s nice for OpenTracing to have a comfortable – and durable – home within The Linux Foundation; however, the most exciting aspect of our CNCF incubation is the possibility for collaboration with other projects that are formally or informally aligned with the CNCF.

To date, OpenTracing has focused on standards for explicit software instrumentation: this is important work and it will continue. That said, as OpenTracing grows, we hope to work with others in the CNCF ecosystem to standardize mechanisms for tracing beyond the realm of explicit instrumentation. With sufficient time and effort, we will be able to trace through container-packaged deployments with little to no source code modification and with vendor neutrality. We couldn’t be more excited about that vision, and by working within the CNCF we believe we’ll get there faster.

This article originally appeared on Cloud Native Computing Foundation