OpenTelemetry Instrumentation for Vendor-Neutral Distributed Observability

OpenTelemetry unifies traces, metrics and logs under a single open standard, eliminating the proprietary-agent lock-in that raises costs every time you switch backends. We instrument services written in Go, Java, Python, Node.js and .NET with OTel SDKs, configure the Collector pipeline for PII redaction and tail-based sampling, and export signals to any backend — Prometheus, Tempo, Jaeger or Datadog — for US and EU clients in regulated industries.

Challenges

Industry challenges we solve

PII leaking into spans and attributes

Developers instrument spans with request parameters, user IDs or payload fields without realising those values contain regulated personal data. Once in the backend, PII is hard to purge and may violate GDPR or HIPAA data-handling requirements.

Instrumentation overhead and performance impact

Naively instrumenting every function call or creating high-cardinality spans (per-user, per-request-ID labels) inflates memory, CPU and network usage. Poorly tuned sampling lets 100 % of traces through and saturates the Collector and backend storage.

Sampling strategy — head vs tail trade-offs

Head-based sampling decides at trace start whether to record, so it misses rare but important error paths. Tail-based sampling buffers the full trace before deciding, adding latency and memory pressure in the Collector. Choosing the wrong strategy leaves critical traces missing or storage costs excessive.
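A minimal sketch of why head-based sampling misses rare errors: the decision uses only the trace ID, before any span has had a chance to fail. The function name `head_sample` and the choice of the low 64 bits of the trace ID are assumptions for illustration, similar in spirit to ratio-based samplers rather than a copy of any SDK's implementation.

```python
def head_sample(trace_id: str, ratio: float) -> bool:
    """Decide at trace start, from the trace ID alone, whether to record.

    trace_id is a 32-character hex string; ratio is the fraction to keep.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Interpret the low 64 bits of the trace ID as a uniformly distributed number.
    low_bits = int(trace_id[-16:], 16)
    # Keep the trace when that number falls inside the requested fraction.
    return low_bits < ratio * (1 << 64)


# The decision is identical whether or not the request later fails,
# because no error information exists yet when it is made.
kept = head_sample("0" * 32, 0.10)
dropped = head_sample("f" * 32, 0.10)
```

Because the decision is a pure function of the trace ID, every service that sees the same ID reaches the same verdict, which is what keeps a head-sampled trace complete across service boundaries.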

Context propagation across heterogeneous services

A missing or corrupted W3C TraceContext header breaks the trace at the first service boundary, producing disconnected spans that cannot be correlated. Polyglot stacks — Go gateway, Java service, Python ML worker — each require language-specific propagation configuration.

Vendor lock-in from proprietary agents

Proprietary agents (Datadog agent, New Relic APM) embed vendor-specific APIs into application code. Switching backends requires code-level changes, and the agent binary itself can introduce licensing, security and dependency-management complexity.

OTel Collector pipeline complexity

The Collector supports receivers, processors and exporters in composable pipelines, but a misordered processor chain (for example, batching before filtering or sampling, or leaving out a memory limiter at the head of the pipeline) spends memory and CPU on data that is later dropped and can cause data loss under load. Multi-backend fan-out and environment-based routing add further configuration surface area.

Solutions

Solutions we build

Vendor-neutral SDK instrumentation

We instrument services using official OTel SDKs and auto-instrumentation agents, emitting OTLP-format traces, metrics and logs with no proprietary API in application code — backends are swappable without code changes.

Collector pipeline with PII redaction and routing

We design OTel Collector pipelines with attributes processor rules that redact, hash or drop regulated fields before export, so telemetry reaching every downstream backend is already clean for GDPR and other regulated-data requirements.

Tail-based sampling configuration

We configure tail-based sampling policies in the Collector that record 100 % of error and slow traces while sampling down routine traffic — capturing every anomaly without blowing storage budgets.
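The policy described above can be sketched in a few lines. This is an illustration of the decision logic only, not the Collector's implementation: the `Span` type, the `keep_trace` name and the 5 % baseline ratio are hypothetical, the 1 s latency threshold is borrowed from the FAQ example on this page, and trace latency is approximated as the longest span.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    trace_id: str          # 32-character hex string
    duration_ms: float
    is_error: bool = False


def keep_trace(spans: List[Span],
               latency_threshold_ms: float = 1000.0,
               baseline_ratio: float = 0.05) -> bool:
    """Decide whether to keep a fully buffered trace."""
    if not spans:
        return False
    # Policy 1: always keep traces that contain an error.
    if any(s.is_error for s in spans):
        return True
    # Policy 2: always keep slow traces (longest span as a stand-in for latency).
    if max(s.duration_ms for s in spans) >= latency_threshold_ms:
        return True
    # Policy 3: sample routine traffic down, deterministically by trace ID.
    low_bits = int(spans[0].trace_id[-16:], 16)
    return low_bits < baseline_ratio * (1 << 64)
```

The key difference from a head-based decision is the input: the function sees every span of the trace, so error and latency information is available when the choice is made.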

Auto-instrumentation plus targeted manual spans

Auto-instrumentation covers frameworks (HTTP, gRPC, database, messaging) out of the box; we add targeted manual spans for business-critical code paths — payment flows, ML inference calls, regulatory events — where framework-level tracing is insufficient.

Context propagation across polyglot services

We configure W3C TraceContext and Baggage propagation in every language runtime, test propagation end-to-end in the CI pipeline and validate trace continuity across service boundaries using synthetic distributed test scenarios.
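As an illustration of what a propagation test checks, the sketch below parses and formats a `traceparent` header in the version `00` layout (version, trace ID, parent span ID, flags). The helper names are hypothetical, and in real services the OTel SDK propagators do this work; the example header value is the one used in the W3C Trace Context specification.

```python
import re
from typing import NamedTuple, Optional

# Version 00 layout: 2 hex, 32 hex, 16 hex, 2 hex, all lowercase.
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is missing or malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid and must not be propagated.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def format_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Build the header a service sends downstream with its own span ID."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


incoming = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

A continuity check then reduces to asserting that the trace ID observed at the last service equals the one injected at the first, whatever languages sit in between.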

Backend-agnostic export to Prometheus, Tempo, Jaeger, Datadog

Exporters in the Collector, OTLP or backend-specific, fan signals out to one or more backends simultaneously. Clients can run Jaeger or Tempo on-premises for regulated data and mirror non-sensitive metrics to Datadog, with no instrumentation change required.

Stack

Technology stack

OpenTelemetry SDKs (Go, Java, Python, Node.js, .NET), OTel Collector, OTLP, auto-instrumentation agents, manual span instrumentation, context propagation (W3C TraceContext / Baggage), semantic conventions, exporters (Prometheus, Grafana Tempo, Jaeger, Datadog, OTLP/gRPC), tail-based sampling, Collector processors (PII redaction, attribute filtering, batch), sampling policies.

Compliance

Compliance & regulations

PII kept out of telemetry · vendor-neutral export · trace audit trail · NIS2 end-to-end observability

EU

  • GDPR — Collector processor pipeline redacts PII from span attributes and log bodies before export; no personal data leaves the instrumentation layer.
  • EU AI Act — LLM and model inference calls are traced with input/output metadata for lineage and governance audit without capturing raw personal prompts.
  • NIS2 — end-to-end distributed tracing provides continuous observability across the supply chain, supporting incident detection and reporting obligations.
  • DORA — trace-driven resilience metrics (error rate, latency percentiles, dependency health) feed the operational resilience dashboard required under DORA RTS.

US

  • SOC 2 — trace sampling governance and immutable export pipelines provide an auditable record of system behaviour for SOC 2 Type II evidence.
  • Vendor neutrality — no proprietary instrumentation agent in the binary; switching observability backends requires only Collector exporter reconfiguration, not code changes.
  • PII redaction pipeline — Collector attribute processor strips or hashes regulated fields (SSNs, emails, card numbers) before spans reach any backend, keeping telemetry clean for regulated clients.
  • RBAC at backend — trace and metric data is scoped per service and environment; RBAC policies at the backend (Grafana, Jaeger) restrict access to sensitive service traces to authorised roles.

Why YuSMP

Why engineering teams choose YuSMP for OpenTelemetry instrumentation

No proprietary API in your codebase

Every span and metric is emitted via the open OTel API. Switching from Datadog to Grafana Tempo or adding a second backend is a Collector configuration change — not a refactoring sprint across dozens of services.

PII-safe telemetry from day one

Our Collector pipeline design treats PII redaction as a first-class concern, not an afterthought. Regulated fields are stripped or hashed before any signal reaches an external backend, keeping your telemetry compliant with GDPR and similar frameworks.

Full signal coverage — traces, metrics and logs correlated

We instrument all three signal types and configure exemplars linking Prometheus metrics to underlying traces, so engineers jump from a latency spike on a dashboard directly to the offending trace without context switching.

FAQ

OpenTelemetry Instrumentation FAQ

What is OpenTelemetry and how does it differ from proprietary APM agents?

OpenTelemetry is a CNCF project that defines a vendor-neutral API, SDK and wire protocol (OTLP) for traces, metrics and logs. Unlike proprietary agents — Datadog APM, New Relic, Dynatrace — it embeds no vendor-specific code in your application. You instrument once using the open OTel API and route signals to any compliant backend via the Collector. Switching backends requires only Collector exporter configuration, not application code changes.

What is the difference between traces, metrics and logs in OTel?

Traces record the end-to-end journey of a single request across services — each operation is a span with timing, attributes and status. Metrics are numeric aggregations over time (request rate, error rate, latency percentiles) suited for dashboards and alerting. Logs are timestamped text or structured events from individual components. OTel unifies all three under one SDK and wire protocol, and exemplars link metric data points directly to the traces that produced them.

How does OpenTelemetry handle PII in span attributes and log bodies?

OTel itself does not redact PII; that is the responsibility of the pipeline. We configure the OTel Collector's attributes processor and transform processor to drop, hash or mask span attributes and log fields that may contain regulated data (emails, user IDs, card numbers, SSNs) before signals reach any backend. This keeps telemetry compliant with GDPR and similar frameworks without requiring application-level instrumentation changes.
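To make "drop, hash or mask" concrete, here is a small sketch of the same three actions applied to a span's attribute dictionary. It mirrors the logic only: in production these rules live in Collector configuration, not application code. The attribute keys, the `redact_attributes` name and the `salt` parameter are all hypothetical.

```python
import hashlib
import re
from typing import Any, Dict

# Hypothetical attribute keys, chosen for illustration only.
DROP_KEYS = {"card.number", "http.request.body"}
HASH_KEYS = {"user.id", "user.email"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_attributes(attrs: Dict[str, Any], salt: str = "") -> Dict[str, Any]:
    """Return a copy of attrs with regulated fields dropped, hashed or masked."""
    clean: Dict[str, Any] = {}
    for key, value in attrs.items():
        if key in DROP_KEYS:
            continue  # drop: the value never leaves the pipeline
        if key in HASH_KEYS:
            # hash: keeps the value correlatable without exposing it
            data = (salt + str(value)).encode("utf-8")
            clean[key] = hashlib.sha256(data).hexdigest()
        elif isinstance(value, str):
            # mask: scrub email addresses embedded in free text
            clean[key] = EMAIL_RE.sub("[REDACTED_EMAIL]", value)
        else:
            clean[key] = value
    return clean
```

Hashing a low-entropy identifier is pseudonymisation rather than anonymisation, so a secret salt (or dropping the field outright) is the safer default for regulated data.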

What is the difference between head-based and tail-based sampling?

Head-based sampling decides at the start of a trace whether to record it — fast and memory-efficient, but it discards rare error traces with the same probability as routine ones. Tail-based sampling buffers the complete trace in the Collector before deciding, allowing policies such as "always keep traces with errors or latency above 1 s". We configure tail-based sampling for production systems where missing error traces is more costly than the additional Collector memory and CPU.

What does the OTel Collector do and do I need it?

The OTel Collector is a vendor-agnostic service, deployable as a per-host agent or as a central gateway, that receives OTLP (or other formats), processes signals (batching, filtering, attribute transformation, PII redaction, tail-based sampling) and exports to one or more backends simultaneously. You can export directly from SDKs to a backend, but the Collector decouples instrumentation from backend choice, centralises sensitive data handling (PII redaction) and enables fan-out to multiple backends without application changes. We recommend it for any production deployment.

What is the performance overhead of OTel instrumentation?

Overhead depends on sampling rate and cardinality. With head-based sampling at 10 % and well-tuned span attributes (no high-cardinality labels like user ID per span), CPU overhead is typically under 2 % and memory impact is minimal. Auto-instrumentation agents add library-loading cost at startup. Tail-based sampling in the Collector adds memory proportional to the buffer window. We profile instrumented services before and after deployment and adjust sampling policies to keep overhead within agreed SLOs.
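The "memory proportional to the buffer window" relationship can be written as a back-of-the-envelope estimate. Every input value below is a hypothetical placeholder, not a measurement; real per-span sizes vary with attribute count and should be profiled.

```python
def tail_buffer_bytes(traces_per_second: float,
                      decision_wait_seconds: float,
                      spans_per_trace: float,
                      bytes_per_span: float) -> int:
    """Rough size of the in-memory buffer a tail sampler needs.

    Every trace that arrives during the decision wait stays in memory
    until its keep/drop decision is made.
    """
    buffered_traces = traces_per_second * decision_wait_seconds
    return int(buffered_traces * spans_per_trace * bytes_per_span)


# Hypothetical workload: 500 traces/s, 10 s wait, 20 spans/trace, 1000 bytes/span.
estimate = tail_buffer_bytes(500, 10, 20, 1000)
# estimate == 100_000_000 bytes, roughly 100 MB before any per-trace overhead
```

Doubling either the traffic rate or the decision wait doubles the estimate, which is why the buffer window is the first setting to revisit when Collector memory becomes the constraint.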

How do you migrate from a Datadog or New Relic agent to OpenTelemetry?

Migration is incremental. We start by running the OTel Collector alongside the existing agent, routing a duplicate OTLP stream to a trial OTel-compatible backend while the proprietary agent continues in production. Once signal parity is confirmed (trace coverage, metric cardinality, alert fidelity), we remove the proprietary agent's auto-instrumentation, leaving only OTel SDKs in the application. If Datadog is retained as a backend, the Collector can still export to it, for example through the Datadog exporter in the Collector contrib distribution, so there is no hard cutover.

Instrument your distributed system with senior OTel engineers — vendor-neutral from day one

Response within 1 business day. NDA on request.

Get a proposal

Share a few details and a senior consultant will reply within one business day.