Move client-side stats computation off the span-finish thread by dougqh · Pull Request #11117 · DataDog/dd-trace-java

dougqh · 2026-04-15T00:38:48Z

Summary

Moves expensive MetricKey construction, ConcurrentHashMap operations, Batch management, and health metrics off the span-finish thread to the existing background Aggregator thread
Introduces lightweight SpanStatsData / TraceStatsData DTOs that flow through the MPSC inbox queue
Downgrades pending and keys from ConcurrentHashMap to plain HashMap (now single-threaded)
Includes SpanFinishWithStatsBenchmark JMH benchmark
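
The hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the class name, the fields on `SpanStatsData`, the string key standing in for `MetricKey`, and the `ArrayBlockingQueue` standing in for the MPSC inbox queue are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class StatsOffThreadSketch {
    /** Hypothetical lightweight DTO: only raw values copied from a finished span. */
    static final class SpanStatsData {
        final String service;
        final String operation;
        final long durationNanos;

        SpanStatsData(String service, String operation, long durationNanos) {
            this.service = service;
            this.operation = operation;
            this.durationNanos = durationNanos;
        }
    }

    /** Hypothetical per-key aggregate, mutated only while draining. */
    static final class Aggregate {
        long hits;
        long totalDurationNanos;
    }

    // Stand-in for the MPSC inbox: many producers, one consumer.
    private final BlockingQueue<SpanStatsData> inbox = new ArrayBlockingQueue<>(1024);

    // Plain HashMap: in this sketch it is only mutated by the draining thread.
    private final Map<String, Aggregate> aggregates = new HashMap<>();

    /** Foreground path: copy a few fields and enqueue without blocking. */
    boolean publish(String service, String operation, long durationNanos) {
        // offer() returns false when the queue is full, so the caller never waits.
        return inbox.offer(new SpanStatsData(service, operation, durationNanos));
    }

    /** Background path: key construction and map updates happen here. */
    int drain() {
        int processed = 0;
        SpanStatsData data;
        while ((data = inbox.poll()) != null) {
            Aggregate agg = aggregates.computeIfAbsent(
                key(data.service, data.operation), k -> new Aggregate());
            agg.hits++;
            agg.totalDurationNanos += data.durationNanos;
            processed++;
        }
        return processed;
    }

    /** Read a hit count; call only when no drain is running concurrently. */
    long hits(String service, String operation) {
        Aggregate agg = aggregates.get(key(service, operation));
        return agg == null ? 0L : agg.hits;
    }

    // Simplified stand-in for the more expensive MetricKey construction.
    private static String key(String service, String operation) {
        return service + "|" + operation;
    }

    /** Four producer threads publish 100 spans each, then one thread drains. */
    static long runDemo() {
        StatsOffThreadSketch sketch = new StatsOffThreadSketch();
        Thread[] producers = new Thread[4];
        for (int i = 0; i < producers.length; i++) {
            producers[i] = new Thread(() -> {
                for (int j = 0; j < 100; j++) {
                    sketch.publish("web", "GET /", 1_000L);
                }
            });
            producers[i].start();
        }
        try {
            for (Thread p : producers) {
                p.join();
            }
            Thread aggregator = new Thread(sketch::drain);
            aggregator.start();
            aggregator.join(); // join() makes the map contents visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sketch.hits("web", "GET /");
    }

    public static void main(String[] args) {
        System.out.println("aggregated hits: " + runDemo());
    }
}
```

The point of the split is that `publish` does nothing beyond an allocation and a queue offer, while everything that touches the map runs on a single consumer thread, which is why a plain `HashMap` is enough in this sketch.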

Motivation: ConflatingMetricsAggregator.publish() consumed ~17% of foreground CPU in a 16-thread span creation stress test — 12% from ConcurrentHashMap.get() for MetricKey lookups, 3% from TraceHealthMetrics.onClientStatTraceComputed() LongAdder increments, and 2% from additional LongAdder.add() calls. All of this ran synchronously on the thread that called span.finish().

Benchmark results

Benchmark	Score	Units
publishSmallTrace (4 spans)	0.159 ± 0.006	us/op
publishMediumTrace (16 spans)	0.544 ± 0.007	us/op
publishLargeTrace (64 spans)	2.040 ± 0.014	us/op
publishConcurrent (8 threads)	1.851 ± 0.069	ops/us
OLD baseline (64 spans)	2.860 ± 0.013	us/op

64-span foreground cost: 2.86us → 2.04us (~29% reduction)
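
The quoted reduction can be checked directly against the two 64-span rows of the table. The per-span figures below are derived from the table's scores and are not reported in the PR itself:

```latex
\[
\frac{2.860 - 2.040}{2.860} = \frac{0.820}{2.860} \approx 0.287 \quad (\text{about } 29\%)
\]
\[
\frac{0.159}{4} \approx 0.040, \qquad
\frac{0.544}{16} = 0.034, \qquad
\frac{2.040}{64} \approx 0.032, \qquad
\frac{2.860}{64} \approx 0.045 \quad \text{(us per span)}
\]
```

On these numbers the new per-span foreground cost falls slightly as traces get larger, and the old 64-span baseline sits at roughly 0.045 us per span.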

Test plan

All *ConflatingMetric* tests pass
All *Aggregator* tests pass
Run full CI suite
Verify with span creation stress test profiling

🤖 Generated with Claude Code

ConflatingMetricsAggregator.publish() was consuming ~17% of foreground CPU (ConcurrentHashMap 12%, TraceHealthMetrics 3%, LongAdder 2%) by running MetricKey construction, ConcurrentHashMap lookups, and Batch management synchronously on the span-finish thread. This change extracts lightweight SpanStatsData DTOs on the foreground thread and defers all expensive work (MetricKey construction, map lookups, health metrics) to the existing background Aggregator thread via the MPSC inbox queue. The pending/keys maps are downgraded from ConcurrentHashMap to plain HashMap since they are now single-threaded. Benchmark shows 64-span trace foreground cost reduced from 2.86us to 2.04us (~29% reduction).

tag: no release note
tag: ai generated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dougqh wants to merge 1 commit into master from dougqh/stats-off-foreground-thread