OpenTelemetry for Web RUM: Standardizing Frontend Performance Telemetry
Modern web performance monitoring requires vendor-neutral, standardized instrumentation. OpenTelemetry for Web RUM provides a unified specification for capturing Real-User Monitoring (RUM) and Core Web Vitals Tracking data directly from the browser. By adopting the OpenTelemetry JavaScript SDK, engineering teams can replace fragmented vendor scripts with a single, extensible telemetry pipeline that aligns with broader observability strategies and reduces client-side payload overhead.
Pipeline Architecture & Data Flow
A production-grade implementation begins with understanding the broader RUM Architecture, Tooling & Self-Hosting landscape. OpenTelemetry decouples instrumentation from data transport, allowing teams to route browser spans and metrics to any backend. The pipeline consists of client-side SDK initialization, automatic span generation for navigation and resource timing, and an exporter that forwards structured telemetry to a collector or analytics warehouse. This modular design enables seamless integration with existing backend tracing systems.
Data Flow Workflow:
- Browser Instrumentation: The Web SDK intercepts `PerformanceObserver` APIs and DOM events.
- Span Generation: Automatic instrumentations create `Span` objects with standardized attributes (`http.url`, `navigation.type`, `web.vitals.*`).
- Batching & Export: The `BatchSpanProcessor` aggregates spans and flushes them via `OTLPTraceExporter` using HTTP/JSON or gRPC.
- Collector Ingestion: A centralized OpenTelemetry Collector receives payloads, applies processors (e.g., `attributes/insert`, `filter`), and routes to storage.
SDK Initialization & Core Web Vitals Capture
Deploying the Web SDK requires careful initialization to avoid main-thread blocking or layout shifts. Engineers must configure the @opentelemetry/instrumentation-document-load and @opentelemetry/instrumentation-user-interaction plugins to automatically capture LCP, INP, and CLS metrics. Proper attribute mapping ensures that business-critical context propagates alongside raw performance timings. For step-by-step initialization and plugin configuration, refer to Configuring OpenTelemetry for frontend performance.
Production Initialization Example:
import { WebTracerProvider } from '@opentelemetry/sdk-trace-web';
import { Resource } from '@opentelemetry/resources';
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { DocumentLoadInstrumentation } from '@opentelemetry/instrumentation-document-load';
import { UserInteractionInstrumentation } from '@opentelemetry/instrumentation-user-interaction';
import { WebVitalsInstrumentation } from '@opentelemetry/instrumentation-web-vitals';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
const provider = new WebTracerProvider({
  resource: new Resource({
    'service.name': 'frontend-web-app',
    'deployment.environment': 'production',
    'browser.name': navigator.userAgentData?.brands?.[0]?.brand || 'unknown',
  }),
});

provider.addSpanProcessor(
  new BatchSpanProcessor(new OTLPTraceExporter({ url: '/v1/traces' }))
);
provider.register();

registerInstrumentations({
  instrumentations: [
    new DocumentLoadInstrumentation(),
    new UserInteractionInstrumentation({
      eventNames: ['click', 'keydown', 'pointerdown'],
      // Returning true suppresses span creation; ignore non-interactive
      // elements to reduce noise
      shouldPreventSpanCreation: (eventName, element) => {
        return element.tagName === 'SCRIPT' || element.tagName === 'LINK';
      },
    }),
    // Note: web-vitals instrumentation is community-maintained; verify the
    // package name and option names against the version you install
    new WebVitalsInstrumentation({
      reportOnFirstChange: false,
      sendToCollector: true,
    }),
  ],
});
Beacon Routing & Payload Delivery
Once telemetry spans are generated, reliable delivery is critical. Browsers use the Beacon API to send payloads during page unload without blocking navigation. When operating a Self-Hosted Beacon Collection endpoint, teams must implement CORS handling, payload batching, and exponential backoff to prevent data loss under poor network conditions. OpenTelemetry’s OTLP exporter can be configured to serialize spans into JSON or Protobuf before transmission, ensuring compatibility with downstream processing layers.
Beacon-Optimized Exporter Configuration:
// Exporter tuned for unload-time delivery; sendBeacon is the fallback path
const beaconExporter = new OTLPTraceExporter({
  url: 'https://rum-collector.internal.example.com/v1/traces',
  headers: { 'Content-Type': 'application/json' },
  concurrencyLimit: 1,
  timeoutMillis: 10000,
});

// Custom processor that falls back to sendBeacon when the page is hidden
provider.addSpanProcessor({
  onStart: () => {},
  onEnd: (span) => {
    if (document.visibilityState === 'hidden') {
      // Note: a raw Span object is not valid OTLP JSON; serialize it into
      // your exporter's wire format before sending in production
      const payload = JSON.stringify([span]);
      // sendBeacon does not throw; it returns false when the browser
      // rejects the payload, e.g., above the ~64 KB beacon quota
      if (!navigator.sendBeacon(beaconExporter.url, payload)) {
        persistToQueue(span); // app-specific IndexedDB-backed queue
      }
    }
  },
  // The SpanProcessor interface also requires these two methods
  forceFlush: () => Promise.resolve(),
  shutdown: () => Promise.resolve(),
});
Debugging Workflows & Comparative Tooling
Effective RUM analysis requires correlating frontend spans with backend traces. When investigating Core Web Vitals regressions, engineers should filter by slow navigation spans, inspect resource waterfall breakdowns, and cross-reference INP event handlers. While commercial platforms offer out-of-the-box dashboards, a SpeedCurve vs Custom RUM evaluation often reveals that OpenTelemetry-based pipelines provide superior flexibility for custom alerting, statistical modeling, and integration with existing observability stacks.
Trace Correlation Workflow:
- Identify Outliers: Query spans where `web.vitals.lcp > 2500` or `web.vitals.inp > 200`.
- Propagate Context: Ensure `traceparent` headers are injected into subsequent XHR/fetch requests using `@opentelemetry/instrumentation-fetch`.
- Join Traces: In your backend (e.g., Jaeger, Tempo, or ClickHouse), join `frontend.span_id` with `backend.parent_id` to reconstruct the full request lifecycle.
- Analyze Waterfall: Extract `resource.timing` attributes to identify TTFB bottlenecks vs. render-blocking assets.
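The `traceparent` header that `@opentelemetry/instrumentation-fetch` injects follows the W3C Trace Context format. A minimal sketch of building and parsing that header (the helper names are hypothetical, for illustration only):

```javascript
// Sketch: the W3C Trace Context `traceparent` header format, as injected by
// @opentelemetry/instrumentation-fetch into outgoing requests.
// Layout: version "00" - 32-hex trace-id - 16-hex parent span-id - 2-hex flags.
function buildTraceparent(traceId, spanId, sampled = true) {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}

function parseTraceparent(header) {
  const m = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!m) return null; // malformed or unsupported version
  return { traceId: m[1], spanId: m[2], sampled: m[3] === '01' };
}
```

Backend services that parse this header can record `traceId` and the parent `spanId` on their own spans, which is what makes the frontend-to-backend join in the workflow above possible.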
Production Scaling & Advanced Analysis Patterns
At scale, raw telemetry volume requires strategic filtering. Implementing RUM Data Sampling Strategies ensures cost control while preserving statistical significance for outlier detection. Privacy-Compliant Tracking mandates must be enforced at the SDK level by stripping PII from custom attributes before export. For global applications, Geographic Performance Breakdowns and Device Tier Analysis should be computed via backend aggregation pipelines rather than client-side processing. Finally, Enterprise RUM Scaling relies on horizontal collector deployments, Kafka buffering, and columnar storage to maintain query performance across billions of spans.
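The SDK-level PII stripping described above can be sketched as an attribute-scrubbing step applied before export. The blocked-key list and email pattern below are illustrative assumptions; extend them to match your own data inventory:

```javascript
// Sketch: scrub PII-bearing attributes from a span attribute map before
// export. BLOCKED_KEYS and EMAIL_PATTERN are illustrative, not exhaustive.
const BLOCKED_KEYS = ['user.email', 'user.name', 'user.phone'];
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

function scrubAttributes(attributes) {
  const clean = {};
  for (const [key, value] of Object.entries(attributes)) {
    if (BLOCKED_KEYS.includes(key)) continue; // drop known PII keys outright
    clean[key] =
      typeof value === 'string'
        ? value.replace(EMAIL_PATTERN, '[redacted]') // mask embedded emails
        : value;
  }
  return clean;
}
```

In practice this would run inside a custom span processor on the client or, as shown in the collector config, server-side via the `attributes` processor; doing it client-side means the PII never leaves the browser.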
Backend Aggregation & Sampling Pattern:
# otel-collector-config.yaml (Sampling & Routing)
processors:
  probabilistic_sampler:
    sampling_percentage: 10
    hash_seed: 42
  attributes:
    actions:
      - key: "user.email"
        action: delete
      - key: "http.url"
        # The attributes processor's hash action applies SHA-1; the
        # algorithm is not configurable here
        action: hash
exporters:
  clickhouse:
    endpoint: "tcp://analytics-db:9000"
    database: "rum_metrics"
    ttl_days: 30
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [probabilistic_sampler, attributes]
      exporters: [clickhouse]
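Client-side head sampling can complement the collector's probabilistic sampler. A deterministic trace-ID ratio decision, the idea behind the SDK's `TraceIdRatioBasedSampler`, keeps or drops all spans of a trace together; a minimal sketch of that decision function (simplified, not the SDK's exact algorithm):

```javascript
// Sketch: deterministic trace-ID ratio sampling. The same trace ID always
// yields the same decision, so a trace is never half-sampled across spans.
function shouldSample(traceId, ratio) {
  // Interpret the trailing 8 hex chars as a uniform value in [0, 1)
  const bucket = parseInt(traceId.slice(-8), 16) / 0x100000000;
  return bucket < ratio;
}
```

Because the decision is a pure function of the trace ID, frontend and backend SDKs configured with the same ratio make consistent keep/drop choices for the same trace.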
Data Analysis Query Pattern (ClickHouse SQL):
-- P95 LCP by Device Tier & Region
SELECT
    device_tier,
    geo_region,
    quantile(0.95)(toFloat64OrZero(attributes['web.vitals.lcp'])) AS p95_lcp_ms,
    count() AS session_count
FROM rum_spans
WHERE timestamp >= now() - INTERVAL 7 DAY
  -- ClickHouse Map access returns the type's default value ('') for missing
  -- keys rather than NULL, so test key membership instead of IS NOT NULL
  AND mapContains(attributes, 'web.vitals.lcp')
GROUP BY device_tier, geo_region
ORDER BY p95_lcp_ms DESC;
This structured approach ensures that OpenTelemetry for Web RUM delivers actionable, privacy-safe, and highly scalable performance insights across modern web applications.