Move client-side stats computation off the span-finish thread #11117
Draft
`ConflatingMetricsAggregator.publish()` was consuming ~17% of foreground CPU (ConcurrentHashMap 12%, TraceHealthMetrics 3%, LongAdder 2%) because it ran MetricKey construction, ConcurrentHashMap lookups, and Batch management synchronously on the span-finish thread. This change extracts lightweight SpanStatsData DTOs on the foreground thread and defers all expensive work (MetricKey construction, map lookups, health metrics) to the existing background Aggregator thread via the MPSC inbox queue. The pending/keys maps are downgraded from ConcurrentHashMap to plain HashMap, since only the Aggregator thread accesses them now. The benchmark shows the foreground cost of a 64-span trace dropping from 2.86us to 2.04us (~29% reduction).

tag: no release note
tag: ai generated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
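The hand-off described above can be illustrated with a minimal, JDK-only sketch. It is not the code in this PR: `DeferredStatsSketch`, `SpanStats`, and the string key are hypothetical stand-ins for the aggregator, `SpanStatsData`, and `MetricKey`, and `LinkedBlockingQueue` stands in for the MPSC inbox queue. The point it shows is that the publishing thread only allocates a small DTO and enqueues it, while key construction and map updates happen on one background thread, which is why a plain `HashMap` is sufficient.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class DeferredStatsSketch {
    /** Hypothetical stand-in for SpanStatsData: raw fields only, no key, no lookups. */
    static final class SpanStats {
        final String service;
        final String operation;
        final long durationNanos;

        SpanStats(String service, String operation, long durationNanos) {
            this.service = service;
            this.operation = operation;
            this.durationNanos = durationNanos;
        }
    }

    /** Sentinel that tells the aggregator thread to stop draining. */
    private static final SpanStats STOP = new SpanStats("", "", 0L);

    // Assumption: LinkedBlockingQueue replaces the MPSC inbox queue for this sketch.
    private final BlockingQueue<SpanStats> inbox = new LinkedBlockingQueue<>();
    // Plain HashMap: only the aggregator thread reads or writes it while running.
    private final Map<String, long[]> pending = new HashMap<>();
    private final Thread aggregator = new Thread(this::drain, "stats-aggregator");

    public void start() {
        aggregator.start();
    }

    /** Foreground path: allocate a small DTO and enqueue it, nothing else. */
    public void publish(String service, String operation, long durationNanos) {
        inbox.offer(new SpanStats(service, operation, durationNanos));
    }

    /** Background path: build the key and update {hits, total duration}. */
    private void drain() {
        try {
            for (SpanStats s = inbox.take(); s != STOP; s = inbox.take()) {
                String key = s.service + "|" + s.operation; // stand-in for MetricKey
                long[] agg = pending.computeIfAbsent(key, k -> new long[2]);
                agg[0]++;
                agg[1] += s.durationNanos;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops the aggregator; join() makes its map writes visible to the caller. */
    public Map<String, long[]> stopAndGet() throws InterruptedException {
        inbox.put(STOP);
        aggregator.join();
        return pending;
    }

    public static void main(String[] args) throws InterruptedException {
        DeferredStatsSketch sketch = new DeferredStatsSketch();
        sketch.start();
        for (int i = 0; i < 64; i++) {
            sketch.publish("web", "http.request", 1_000L);
        }
        long[] agg = sketch.stopAndGet().get("web|http.request");
        System.out.println(agg[0] + " hits, " + agg[1] + " ns total"); // 64 hits, 64000 ns total
    }
}
```

The trade-off in this shape is one small allocation per span on the foreground thread in exchange for removing key hashing and concurrent map access from it. The sketch's queue is unbounded, so it does not model back-pressure or drops when the inbox is full.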
Summary
- Extract lightweight `SpanStatsData`/`TraceStatsData` DTOs that flow through the MPSC inbox queue
- Downgrade `pending` and `keys` from `ConcurrentHashMap` to plain `HashMap` (now single-threaded)
- Add `SpanFinishWithStatsBenchmark` JMH benchmark

Motivation:
`ConflatingMetricsAggregator.publish()` consumed ~17% of foreground CPU in a 16-thread span-creation stress test: 12% from `ConcurrentHashMap.get()` for MetricKey lookups, 3% from `TraceHealthMetrics.onClientStatTraceComputed()` LongAdder increments, and 2% from additional `LongAdder.add()` calls. All of this ran synchronously on the thread that called `span.finish()`.

Benchmark results
64-span foreground cost: 2.86us → 2.04us (~29% reduction)
Test plan
- `*ConflatingMetric*` tests pass
- `*Aggregator*` tests pass

🤖 Generated with Claude Code