When team members collect data without being able to see what their teammates see, no one’s view is truly accurate. In this picture, everyone is blindfolded and feeling around to work out what is in front of them. One thinks the creature is a spear, another thinks it is a tree trunk, and another a rope. As long as they cannot observe what the others can, data fidelity is poor.
Why unified collection matters
In the ITOps world, infrastructure engineers should not need to change application code just because data is being sent to several disparate backends. As a system grows, particularly in federated models, standardizing telemetry becomes very important.
What if you want distributed tracing data to be analyzed somewhere other than the tracing backend? What if your team wants to route logging data to a security analyst in a format other than a flat CLI in the logging backend? People want histograms, charts, and graphs, with machine intelligence analyzing and parsing this data. This is only possible with a unified telemetry standard that is vendor-agnostic.
On the horizon, right now, is the widespread adoption of the OpenTelemetry (OTel) standard. It is the first concrete effort to establish a unified collection method for logs, metrics, and traces (including the spans that make up each trace).
What is a telemetry pipeline?
Telemetry pipelines route data (logs, metrics, and traces) from where it is generated to wherever it needs to go, which is often several disparate backends. This also intersects with how that routing is expressed in monitoring. Telemetry pipelines not only filter this data but also enrich it with metadata that the various backends can use (such as Kubernetes container information, region information, GeoIP information, and other logging and trace data).
In the image above, you see an example of a simple telemetry pipeline and its corresponding components. An extremely basic setup has a receiver, a buffer, and an exporter. The receiver is how you get data into the collector, at the start of a telemetry pipeline; the buffer holds that data, and the exporter sends it on to a backend. Our distro supports a wide variety of receivers and data formats.
Typically, pipelines chain buffers and processors; each processor takes data from a buffer and analyzes or transforms it. Our OTel distro also supports dozens of processors, and the list is ever-growing.
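To make routing and enrichment concrete, here is a minimal Python sketch. It is not how any particular collector works; the backend names, the routing table, and the metadata values are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical metadata; a real pipeline would discover this at runtime.
RESOURCE_METADATA = {"k8s.container.name": "checkout", "cloud.region": "us-east-1"}

# Hypothetical routing table: signal type -> names of the backends that want it.
ROUTES = {
    "log": ["logging-backend", "security-backend"],
    "metric": ["metrics-backend"],
    "trace": ["tracing-backend"],
}


def enrich(record, metadata):
    """Return a copy of the record with metadata merged into its attributes.

    Attributes already present on the record win over the added metadata.
    """
    attributes = {**metadata, **record.get("attributes", {})}
    return {**record, "attributes": attributes}


def route(records, routes, metadata):
    """Enrich each record and group it under every backend that should get it."""
    outbound = defaultdict(list)
    for record in records:
        enriched = enrich(record, metadata)
        for backend in routes.get(record["signal"], []):
            outbound[backend].append(enriched)
    return dict(outbound)


records = [
    {"signal": "log", "body": "user login failed", "attributes": {"user": "alice"}},
    {"signal": "metric", "name": "http.requests", "value": 42},
]
outbound = route(records, ROUTES, RESOURCE_METADATA)
for backend, batch in sorted(outbound.items()):
    print(backend, len(batch))
```

Here the same log record reaches both the logging backend and the security backend, and the application that produced it never had to know about either one.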
Many developers want to automate compliance checks, for example, on deployments. They use software that takes data from exported logs and goes through various compliance and security checks (like GDPR and HIPAA) via chains of buffers and processors before finally exporting a result. As you can see, a pipeline can get extremely complex, but typically keeps the pattern of receiver, buffer, processor, and exporter for every type of backend.
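The receiver, buffer, processor, and exporter pattern can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the collector's implementation: the `Pipeline` class, the `drop_debug` and `redact_emails` processors, and the simple email pattern are all invented for illustration, with the redaction step standing in for a compliance check.

```python
import re
from collections import deque

# Deliberately simple pattern for illustration; real PII detection is harder.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class Pipeline:
    """Toy pipeline: receive -> buffer -> chained processors -> exporter."""

    def __init__(self, processors, exporter, batch_size=2):
        self.buffer = deque()
        self.processors = processors
        self.exporter = exporter
        self.batch_size = batch_size

    def receive(self, record):
        """Receiver: accept a record and flush once the buffer is full."""
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Drain the buffer through every processor, then export the batch."""
        batch = []
        while self.buffer:
            record = self.buffer.popleft()
            for process in self.processors:
                record = process(record)
                if record is None:  # a processor dropped the record
                    break
            if record is not None:
                batch.append(record)
        if batch:
            self.exporter(batch)


def drop_debug(record):
    """Filter processor: discard DEBUG records."""
    return None if record.get("level") == "DEBUG" else record


def redact_emails(record):
    """Compliance-style processor: mask email addresses in the body."""
    return {**record, "body": EMAIL.sub("[REDACTED]", record["body"])}


exported = []  # stands in for a backend
pipeline = Pipeline([drop_debug, redact_emails], exported.extend)
pipeline.receive({"level": "DEBUG", "body": "cache warm"})
pipeline.receive({"level": "INFO", "body": "signup by bob@example.com"})
pipeline.flush()
print(exported)
```

Adding another check means appending another function to the processor list; the receiver and exporter stay untouched, which is why the pattern holds up as pipelines grow.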
Learn more about components (receivers, processors, exporters) included in our OTel Distro. The highlighted telemetry pipeline components are delivered by Sumo Logic, and the ones with an asterisk are upstream OpenTelemetry components with additional contributions from us.
What is OpenTelemetry?
OpenTelemetry (usually called OTel) is an open, vendor-agnostic standard for data collection. As of the publication of this article, OTel is the second most active Cloud Native Computing Foundation (CNCF) project, after Kubernetes, a sign of its importance in improving open-source data collection, portability, control, and support. Moreover, the prevalence and size of this project signal the success it is experiencing as the premier industry-wide standard for software instrumentation (and vendor-neutral APIs).
OTel came from merging the OpenCensus (originally from Google) and OpenTracing (from CNCF) projects. OTel captures traces, metrics, logs, and other application telemetry data and sends them to any backend of your choice.
For more information, check out this history on OTel from the CNCF.
Data collection without OTel:
This image represents observability without the OTel standard in place (Observability 1.0). Here we see a hodgepodge of vendor-provided collection, instrumentation, and backends.
There are a lot of moving parts at each layer. Additionally, because different vendors provide different instrumentation libraries, switching from one to another can be problematic. Organizations might have already invested time in writing integrations against disparate APIs with unique configurations, and that work would need to be redone when switching between libraries.
What a headache! Users are left with options like running a lot more ingress/egress or reworking their architecture every time they choose to work with a new library or agent.
Wouldn’t it be much easier if there were a standard that vendors adhered to, one that allowed for a more unified collection method?
Data collection with OTel:
I know that a unified collection methodology is a unicorn and the community is not fully there today, but with OTel, we at Sumo Logic are well on our way to the diagram on this page. When we discovered the OpenTelemetry standard, we knew that we would want to standardize all of our collection on this new, broadly adopted standard to benefit our customers.
We planned a two-step journey to our new collection strategy. For us:
1) Observability 1.0 is defined by a world without unified collection, and without using the OTel standard.
2) Observability 2.0 is defined by using the OpenTelemetry standard (with OTLP, for both instrumentation libraries and collection agents). Overall, we have not fully moved from 1.0 to 2.0 yet, but we are much closer to getting there as a community thanks to the OpenTelemetry standard.
If you recall, the prior diagram requires a separate library and agent for each of the signals (Observability 1.0). This is not the case when using OpenTelemetry (Observability 2.0), where almost everything except the backend is streamlined and provided by the initiative. There are common libraries and collectors. The vendors can provide their own distributions, but they all share the same core code.
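The benefit of common libraries can be illustrated with a small Python sketch. This is not the OpenTelemetry API; `Tracer`, `SpanExporter`, and both exporters are hypothetical names used only to show the idea that instrumentation is written once against a shared interface while the destination stays swappable.

```python
import time
from contextlib import contextmanager
from typing import Protocol


class SpanExporter(Protocol):
    """Hypothetical shared interface that every backend exporter implements."""

    def export(self, span: dict) -> None: ...


class InMemoryExporter:
    """Keeps finished spans in a list (stands in for one backend)."""

    def __init__(self):
        self.spans = []

    def export(self, span):
        self.spans.append(span)


class ConsoleExporter:
    """Prints finished spans (stands in for a different backend)."""

    def export(self, span):
        print(f"{span['name']} took {span['duration_ns']} ns")


class Tracer:
    """Instrumentation entry point that depends only on the shared interface."""

    def __init__(self, exporter: SpanExporter):
        self._exporter = exporter

    @contextmanager
    def span(self, name):
        start = time.monotonic_ns()
        try:
            yield
        finally:
            duration = time.monotonic_ns() - start
            self._exporter.export({"name": name, "duration_ns": duration})


def handle_request(tracer):
    """Application code: instrumented once, unaware of the backend in use."""
    with tracer.span("handle_request"):
        return sum(range(1000))


memory = InMemoryExporter()
handle_request(Tracer(memory))             # send spans to one backend
handle_request(Tracer(ConsoleExporter()))  # switch backends, same app code
```

Switching backends changes one constructor argument and leaves `handle_request` alone, which is the property the Observability 1.0 picture lacks.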
How does Sumo innovate with OTel?
Sumo Logic is betting our future on OpenTelemetry. We were one of the first observability vendors to support OTel tracing. Our Real User Monitoring (RUM) platform is an OTel JS distribution and fully compatible with the standard. We ensure that both spans and logs are transmitted using the OpenTelemetry Protocol (OTLP). Also, our platform supports easily ingesting telemetry from a variety of existing tools (such as Telegraf, Prometheus, and Jaeger).
We not only run a unified, supported agent for collecting metrics, events, logs, and traces, but we also contribute upstream to the OTel project. Our team works upstream to update receivers for Linux Host Metrics, MySQL, Apache, and Elasticsearch to collect metrics and metadata for improved telemetry processing. Additionally, our team created auto-instrumentation libraries for Go and PHP, allowing customers to create and observe traces for applications written in those languages.
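For a rough sense of what a log record looks like on the wire, the Python sketch below builds a payload shaped like the JSON encoding of OTLP logs as I understand it. Treat the field layout as an approximation to check against the OTLP specification; the helper names, the scope name, and the sample values are hypothetical, and nothing here is specific to Sumo Logic's ingestion.

```python
import json
import time


def key_value(key, value):
    """Build one attribute entry (string values only, to keep the sketch short)."""
    return {"key": key, "value": {"stringValue": str(value)}}


def otlp_log_payload(service_name, body, severity_text, attributes):
    """Wrap a single log record in a resource -> scope -> record nesting."""
    return {
        "resourceLogs": [
            {
                "resource": {"attributes": [key_value("service.name", service_name)]},
                "scopeLogs": [
                    {
                        "scope": {"name": "example.scope"},  # hypothetical scope name
                        "logRecords": [
                            {
                                # 64-bit integers are carried as strings in JSON.
                                "timeUnixNano": str(time.time_ns()),
                                "severityText": severity_text,
                                "body": {"stringValue": body},
                                "attributes": [
                                    key_value(k, v) for k, v in attributes.items()
                                ],
                            }
                        ],
                    }
                ],
            }
        ]
    }


payload = otlp_log_payload(
    "checkout", "user login failed", "WARN", {"cloud.region": "us-east-1"}
)
encoded = json.dumps(payload, indent=2)
print(encoded)
```

Because every sender uses the same envelope, a backend can read the service name and attributes without knowing which agent or library produced the record.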
The future of OTel is you!
Developers, engineers, and others around the world are benefiting from the OpenTelemetry project. Standardizing on common metric, event, log, and trace formats lets big data tools turn this data into analytics in a human-readable format. We are just one vendor that is well on our way in this standardization journey, but I hope that the community continues to grow and contribute to the project.
If you are interested in helping out, please check out the GitHub project. Each repository usually has specific instructions and guidance on how to contribute via a CONTRIBUTING.md document. Also, please join our community Slack channel if you have any questions about OTel or want to be a part of our community.
By Melissa Sussmann
Source: CNCF Blog