Skip to content

Skip synchronized(unsafeTags) on owner-thread tag writes#11082

Draft
bm1549 wants to merge 3 commits intomasterfrom
brian.marks/thread-owned-tags
Draft

Skip synchronized(unsafeTags) on owner-thread tag writes#11082
bm1549 wants to merge 3 commits intomasterfrom
brian.marks/thread-owned-tags

Conversation

@bm1549
Copy link
Copy Markdown
Contributor

@bm1549 bm1549 commented Apr 10, 2026

What Does This Do

Optimizes DDSpanContext tag writes by skipping synchronized(unsafeTags) when the writing thread is the span's creating thread (the common case). A volatile tagWriteState field tracks whether the span is in owner-only mode or shared mode. Once any non-owner thread accesses tags or the span finishes, it transitions to shared mode permanently and all subsequent accesses take the lock.

Motivation

Every setTag() / setMetric() call acquires synchronized(unsafeTags) — ~27 lock sites total. Spans are almost always written by a single thread, so the lock is uncontended overhead (~20-30ns per acquire/release × 5-20 tags per span × millions of spans/second). This optimization eliminates that overhead on the fast path by replacing the lock with a volatile read + thread ID comparison.

Safety model:

  • volatile int tagWriteState tracks STATE_OWNER (0) vs STATE_SHARED (1). Once shared, never reverts.
  • Owner thread: volatile read + threadId check → skip lock
  • Non-owner thread: take lock + sticky-transition to STATE_SHARED
  • finish() calls transitionToShared() so post-finish writes (decorators/handlers) always take the lock
  • Long-running spans disable the optimization at construction since the writer thread may read tags on unfinished spans

Benchmark Results

JMH benchmarks, 2 forks × 5 warmup + 5 measurement iterations, back-to-back on same machine. Owner-thread benchmarks use @Threads(1), cross-thread uses @Threads(8).

JDK 21 (Zulu 21.0.1) — biased locking removed

Benchmark Baseline Optimized Unit Change
fullLifecycle_tenTags 2.288 ± 0.111 2.559 ± 0.091 ops/µs +11.8%
setStringTag_ownerThread 0.032 ± 0.002 0.037 ± 0.004 ops/ns +15.6%
setIntTag_ownerThread 0.030 ± 0.004 0.038 ± 0.017 ops/ns +26.7%
setTenTags_ownerThread 0.006 ± 0.001 0.008 ± 0.001 ops/ns +33.3%
setStringTag_crossThread (8T) 0.019 ± 0.003 0.017 ± 0.002 ops/ns ~neutral (within error)

JDK 8 (Zulu 8u372) — biased locking enabled

Benchmark Baseline Optimized Unit Change
fullLifecycle_tenTags 2.043 ± 0.071 2.132 ± 0.126 ops/µs +4.4%
setStringTag_ownerThread 0.026 ± 0.002 0.029 ± 0.002 ops/ns +11.5%
setIntTag_ownerThread 0.040 ± 0.016 0.035 ± 0.009 ops/ns ~neutral (within error)
setTenTags_ownerThread 0.007 ± 0.001 0.007 ± 0.001 ops/ns ~neutral
setStringTag_crossThread (8T) 0.022 ± 0.004 0.022 ± 0.005 ops/ns ~neutral (no regression)

Analysis

On JDK 21 (where biased locking was removed in JDK 15), uncontended synchronized is more expensive and the optimization shows clear gains: +12-33% across all owner-thread benchmarks. The full lifecycle benchmark (create + 10 tags + finish) shows +11.8% throughput improvement.

On JDK 8 (biased locking enabled), the JVM already optimizes uncontended locks, so gains are modest (+4-12%).

Cross-thread path: no regression on either JDK. The slow path (non-owner threads) takes the lock just like before, with one additional volatile read of tagWriteState.

Additional Notes

  • synchronized(unsafeTags) is kept on reader paths: processTagsAndBaggage, earlyProcessTags, getTags, toString
  • The TOCTOU race window for cross-thread writes is accepted — same best-effort as today, the existing codebase comments say "tags will rarely, if ever, be read and modified concurrently"
  • Added DDSpanContextConcurrencyTest (9 JUnit 5 tests including 3 targeted cross-thread stress tests) and SpanTagBenchmark (5 JMH benchmarks)

Contributor Checklist

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

Spans are almost always written by a single thread, so the lock on every
setTag/setMetric call is uncontended overhead. This adds a volatile
tagWriteState check: if the current thread is the span's creating thread
(STATE_OWNER), tag writes skip the lock entirely. Non-owner threads and
post-finish writes take the lock and sticky-transition to STATE_SHARED.

Long-running spans disable the optimization at construction since the
writer thread may read tags on unfinished spans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bm1549 bm1549 added type: enhancement Enhancements and improvements comp: core Tracer core tag: ai generated Largely based on code generated by an AI or LLM labels Apr 10, 2026
@bm1549 bm1549 changed the title Skip synchronized(unsafeTags) on owner-thread tag writes WIP - DO NOT REVIEW: Skip synchronized(unsafeTags) on owner-thread tag writes Apr 10, 2026
@bm1549 bm1549 marked this pull request as ready for review April 10, 2026 19:05
@bm1549 bm1549 requested a review from a team as a code owner April 10, 2026 19:05
@bm1549 bm1549 requested a review from mhlidd April 10, 2026 19:05
@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Apr 10, 2026

Benchmarks

⚠️ Warning: Baseline build not found for merge-base commit. Comparing against the latest commit on master instead.

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776110723 1776111607
git_commit_sha 9f89a0b d992fcd
release_version 1.62.0-SNAPSHOT~9f89a0b26c 1.62.0-SNAPSHOT~d992fcde3f
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776113382 1776113382
ci_job_id 1591478952 1591478952
ci_pipeline_id 107470487 107470487
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-6rieplgt 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-6rieplgt 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 59 metrics, 12 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1063626
Total [baseline] (11.15 s) : 0, 11149749
Agent [candidate] (1.058 s) : 0, 1058169
Total [candidate] (11.112 s) : 0, 11112380
section appsec
Agent [baseline] (1.249 s) : 0, 1248992
Total [baseline] (11.083 s) : 0, 11083030
Agent [candidate] (1.249 s) : 0, 1248808
Total [candidate] (11.108 s) : 0, 11107753
section iast
Agent [baseline] (1.223 s) : 0, 1222869
Total [baseline] (11.32 s) : 0, 11319740
Agent [candidate] (1.225 s) : 0, 1225152
Total [candidate] (11.233 s) : 0, 11233249
section profiling
Agent [baseline] (1.186 s) : 0, 1185593
Total [baseline] (11.094 s) : 0, 11094126
Agent [candidate] (1.194 s) : 0, 1193971
Total [candidate] (11.257 s) : 0, 11257032
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.064 s -
Agent appsec 1.249 s 185.367 ms (17.4%)
Agent iast 1.223 s 159.243 ms (15.0%)
Agent profiling 1.186 s 121.967 ms (11.5%)
Total tracing 11.15 s -
Total appsec 11.083 s -66.719 ms (-0.6%)
Total iast 11.32 s 169.991 ms (1.5%)
Total profiling 11.094 s -55.623 ms (-0.5%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.058 s -
Agent appsec 1.249 s 190.639 ms (18.0%)
Agent iast 1.225 s 166.983 ms (15.8%)
Agent profiling 1.194 s 135.802 ms (12.8%)
Total tracing 11.112 s -
Total appsec 11.108 s -4.626 ms (-0.0%)
Total iast 11.233 s 120.869 ms (1.1%)
Total profiling 11.257 s 144.652 ms (1.3%)
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.233 ms) : 0, 1233
crashtracking [candidate] (1.237 ms) : 0, 1237
BytebuddyAgent [baseline] (636.694 ms) : 0, 636694
BytebuddyAgent [candidate] (631.769 ms) : 0, 631769
AgentMeter [baseline] (29.688 ms) : 0, 29688
AgentMeter [candidate] (29.376 ms) : 0, 29376
GlobalTracer [baseline] (250.598 ms) : 0, 250598
GlobalTracer [candidate] (248.904 ms) : 0, 248904
AppSec [baseline] (32.14 ms) : 0, 32140
AppSec [candidate] (32.089 ms) : 0, 32089
Debugger [baseline] (60.709 ms) : 0, 60709
Debugger [candidate] (60.174 ms) : 0, 60174
Remote Config [baseline] (594.69 µs) : 0, 595
Remote Config [candidate] (612.52 µs) : 0, 613
Telemetry [baseline] (8.185 ms) : 0, 8185
Telemetry [candidate] (8.047 ms) : 0, 8047
Flare Poller [baseline] (7.339 ms) : 0, 7339
Flare Poller [candidate] (9.841 ms) : 0, 9841
section appsec
crashtracking [baseline] (1.235 ms) : 0, 1235
crashtracking [candidate] (1.226 ms) : 0, 1226
BytebuddyAgent [baseline] (662.661 ms) : 0, 662661
BytebuddyAgent [candidate] (661.537 ms) : 0, 661537
AgentMeter [baseline] (12.138 ms) : 0, 12138
AgentMeter [candidate] (12.087 ms) : 0, 12087
GlobalTracer [baseline] (249.781 ms) : 0, 249781
GlobalTracer [candidate] (249.2 ms) : 0, 249200
AppSec [baseline] (184.261 ms) : 0, 184261
AppSec [candidate] (184.903 ms) : 0, 184903
Debugger [baseline] (65.198 ms) : 0, 65198
Debugger [candidate] (66.196 ms) : 0, 66196
Remote Config [baseline] (601.372 µs) : 0, 601
Remote Config [candidate] (619.056 µs) : 0, 619
Telemetry [baseline] (8.629 ms) : 0, 8629
Telemetry [candidate] (8.497 ms) : 0, 8497
Flare Poller [baseline] (3.534 ms) : 0, 3534
Flare Poller [candidate] (3.553 ms) : 0, 3553
IAST [baseline] (24.596 ms) : 0, 24596
IAST [candidate] (24.604 ms) : 0, 24604
section iast
crashtracking [baseline] (1.216 ms) : 0, 1216
crashtracking [candidate] (1.26 ms) : 0, 1260
BytebuddyAgent [baseline] (799.954 ms) : 0, 799954
BytebuddyAgent [candidate] (801.193 ms) : 0, 801193
AgentMeter [baseline] (11.408 ms) : 0, 11408
AgentMeter [candidate] (11.485 ms) : 0, 11485
GlobalTracer [baseline] (239.19 ms) : 0, 239190
GlobalTracer [candidate] (239.464 ms) : 0, 239464
AppSec [baseline] (32.565 ms) : 0, 32565
AppSec [candidate] (32.11 ms) : 0, 32110
Debugger [baseline] (59.674 ms) : 0, 59674
Debugger [candidate] (62.489 ms) : 0, 62489
Remote Config [baseline] (536.868 µs) : 0, 537
Remote Config [candidate] (533.526 µs) : 0, 534
Telemetry [baseline] (12.698 ms) : 0, 12698
Telemetry [candidate] (11.153 ms) : 0, 11153
Flare Poller [baseline] (3.642 ms) : 0, 3642
Flare Poller [candidate] (3.402 ms) : 0, 3402
IAST [baseline] (25.743 ms) : 0, 25743
IAST [candidate] (25.866 ms) : 0, 25866
section profiling
ProfilingAgent [baseline] (94.002 ms) : 0, 94002
ProfilingAgent [candidate] (94.84 ms) : 0, 94840
crashtracking [baseline] (1.189 ms) : 0, 1189
crashtracking [candidate] (1.177 ms) : 0, 1177
BytebuddyAgent [baseline] (692.346 ms) : 0, 692346
BytebuddyAgent [candidate] (697.321 ms) : 0, 697321
AgentMeter [baseline] (9.083 ms) : 0, 9083
AgentMeter [candidate] (9.187 ms) : 0, 9187
GlobalTracer [baseline] (207.379 ms) : 0, 207379
GlobalTracer [candidate] (208.923 ms) : 0, 208923
AppSec [baseline] (32.455 ms) : 0, 32455
AppSec [candidate] (32.885 ms) : 0, 32885
Debugger [baseline] (65.678 ms) : 0, 65678
Debugger [candidate] (66.095 ms) : 0, 66095
Remote Config [baseline] (577.696 µs) : 0, 578
Remote Config [candidate] (586.359 µs) : 0, 586
Telemetry [baseline] (7.918 ms) : 0, 7918
Telemetry [candidate] (7.955 ms) : 0, 7955
Flare Poller [baseline] (3.548 ms) : 0, 3548
Flare Poller [candidate] (3.637 ms) : 0, 3637
Profiling [baseline] (94.566 ms) : 0, 94566
Profiling [candidate] (95.399 ms) : 0, 95399
Loading
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.064 s) : 0, 1063817
Total [baseline] (8.856 s) : 0, 8855987
Agent [candidate] (1.058 s) : 0, 1057690
Total [candidate] (8.828 s) : 0, 8828369
section iast
Agent [baseline] (1.228 s) : 0, 1228326
Total [baseline] (9.629 s) : 0, 9628876
Agent [candidate] (1.226 s) : 0, 1226260
Total [candidate] (9.533 s) : 0, 9533455
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.064 s -
Agent iast 1.228 s 164.508 ms (15.5%)
Total tracing 8.856 s -
Total iast 9.629 s 772.889 ms (8.7%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.058 s -
Agent iast 1.226 s 168.57 ms (15.9%)
Total tracing 8.828 s -
Total iast 9.533 s 705.085 ms (8.0%)
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.242 ms) : 0, 1242
crashtracking [candidate] (1.221 ms) : 0, 1221
BytebuddyAgent [baseline] (635.894 ms) : 0, 635894
BytebuddyAgent [candidate] (631.889 ms) : 0, 631889
AgentMeter [baseline] (29.653 ms) : 0, 29653
AgentMeter [candidate] (29.351 ms) : 0, 29351
GlobalTracer [baseline] (249.912 ms) : 0, 249912
GlobalTracer [candidate] (248.617 ms) : 0, 248617
AppSec [baseline] (32.278 ms) : 0, 32278
AppSec [candidate] (32.065 ms) : 0, 32065
Debugger [baseline] (59.924 ms) : 0, 59924
Debugger [candidate] (59.253 ms) : 0, 59253
Remote Config [baseline] (599.887 µs) : 0, 600
Remote Config [candidate] (597.658 µs) : 0, 598
Telemetry [baseline] (8.186 ms) : 0, 8186
Telemetry [candidate] (8.112 ms) : 0, 8112
Flare Poller [baseline] (9.84 ms) : 0, 9840
Flare Poller [candidate] (10.413 ms) : 0, 10413
section iast
crashtracking [baseline] (1.221 ms) : 0, 1221
crashtracking [candidate] (1.225 ms) : 0, 1225
BytebuddyAgent [baseline] (803.161 ms) : 0, 803161
BytebuddyAgent [candidate] (802.11 ms) : 0, 802110
AgentMeter [baseline] (11.455 ms) : 0, 11455
AgentMeter [candidate] (11.36 ms) : 0, 11360
GlobalTracer [baseline] (240.165 ms) : 0, 240165
GlobalTracer [candidate] (240.293 ms) : 0, 240293
AppSec [baseline] (29.038 ms) : 0, 29038
AppSec [candidate] (31.078 ms) : 0, 31078
Debugger [baseline] (63.956 ms) : 0, 63956
Debugger [candidate] (59.715 ms) : 0, 59715
Remote Config [baseline] (543.287 µs) : 0, 543
Remote Config [candidate] (524.282 µs) : 0, 524
Telemetry [baseline] (12.585 ms) : 0, 12585
Telemetry [candidate] (12.952 ms) : 0, 12952
Flare Poller [baseline] (3.651 ms) : 0, 3651
Flare Poller [candidate] (3.62 ms) : 0, 3620
IAST [baseline] (26.02 ms) : 0, 26020
IAST [candidate] (26.695 ms) : 0, 26695
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776110723 1776111607
git_commit_sha 9f89a0b d992fcd
release_version 1.62.0-SNAPSHOT~9f89a0b26c 1.62.0-SNAPSHOT~d992fcde3f
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1776113855 1776113855
ci_job_id 1591478953 1591478953
ci_pipeline_id 107470487 107470487
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-4-ihuncgo1 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-4-ihuncgo1 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 1 performance improvements and 1 performance regressions! Performance is the same for 18 metrics, 16 unstable metrics.

scenario Δ mean agg_http_req_duration_p50 Δ mean agg_http_req_duration_p95 Δ mean throughput candidate mean agg_http_req_duration_p50 candidate mean agg_http_req_duration_p95 candidate mean throughput baseline mean agg_http_req_duration_p50 baseline mean agg_http_req_duration_p95 baseline mean throughput
scenario:load:petclinic:iast:high_load worse
[+0.796ms; +1.621ms] or [+4.526%; +9.218%]
same
[-13.037µs; +1595.382µs] or [-0.045%; +5.478%]
unstable
[-39.012op/s; +11.074op/s] or [-15.010%; +4.261%]
18.793ms 29.915ms 245.938op/s 17.585ms 29.124ms 259.906op/s
scenario:load:petclinic:appsec:high_load better
[-799.929µs; -387.507µs] or [-4.333%; -2.099%]
same
[-716.972µs; +417.648µs] or [-2.401%; +1.399%]
unstable
[-18.949op/s; +30.136op/s] or [-7.622%; +12.123%]
17.869ms 29.713ms 254.188op/s 18.463ms 29.863ms 248.594op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.852 ms) : 18658, 19046
.   : milestone, 18852,
appsec (18.777 ms) : 18591, 18963
.   : milestone, 18777,
code_origins (17.84 ms) : 17670, 18010
.   : milestone, 17840,
iast (17.957 ms) : 17776, 18138
.   : milestone, 17957,
profiling (18.569 ms) : 18388, 18750
.   : milestone, 18569,
tracing (17.793 ms) : 17617, 17969
.   : milestone, 17793,
section candidate
no_agent (19.119 ms) : 18925, 19314
.   : milestone, 19119,
appsec (18.358 ms) : 18169, 18546
.   : milestone, 18358,
code_origins (17.802 ms) : 17625, 17978
.   : milestone, 17802,
iast (18.972 ms) : 18783, 19162
.   : milestone, 18972,
profiling (18.1 ms) : 17922, 18279
.   : milestone, 18100,
tracing (17.921 ms) : 17750, 18093
.   : milestone, 17921,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.852 ms [18.658 ms, 19.046 ms] -
appsec 18.777 ms [18.591 ms, 18.963 ms] -74.33 µs (-0.4%)
code_origins 17.84 ms [17.67 ms, 18.01 ms] -1.012 ms (-5.4%)
iast 17.957 ms [17.776 ms, 18.138 ms] -894.782 µs (-4.7%)
profiling 18.569 ms [18.388 ms, 18.75 ms] -282.566 µs (-1.5%)
tracing 17.793 ms [17.617 ms, 17.969 ms] -1.059 ms (-5.6%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 19.119 ms [18.925 ms, 19.314 ms] -
appsec 18.358 ms [18.169 ms, 18.546 ms] -761.714 µs (-4.0%)
code_origins 17.802 ms [17.625 ms, 17.978 ms] -1.318 ms (-6.9%)
iast 18.972 ms [18.783 ms, 19.162 ms] -146.99 µs (-0.8%)
profiling 18.1 ms [17.922 ms, 18.279 ms] -1.019 ms (-5.3%)
tracing 17.921 ms [17.75 ms, 18.093 ms] -1.198 ms (-6.3%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.233 ms) : 1221, 1244
.   : milestone, 1233,
iast (3.208 ms) : 3168, 3249
.   : milestone, 3208,
iast_FULL (5.962 ms) : 5902, 6022
.   : milestone, 5962,
iast_GLOBAL (3.767 ms) : 3707, 3828
.   : milestone, 3767,
profiling (2.348 ms) : 2324, 2371
.   : milestone, 2348,
tracing (1.876 ms) : 1860, 1893
.   : milestone, 1876,
section candidate
no_agent (1.236 ms) : 1224, 1247
.   : milestone, 1236,
iast (3.294 ms) : 3244, 3344
.   : milestone, 3294,
iast_FULL (5.798 ms) : 5740, 5855
.   : milestone, 5798,
iast_GLOBAL (3.605 ms) : 3552, 3659
.   : milestone, 3605,
profiling (2.424 ms) : 2397, 2450
.   : milestone, 2424,
tracing (1.887 ms) : 1871, 1904
.   : milestone, 1887,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.233 ms [1.221 ms, 1.244 ms] -
iast 3.208 ms [3.168 ms, 3.249 ms] 1.976 ms (160.3%)
iast_FULL 5.962 ms [5.902 ms, 6.022 ms] 4.729 ms (383.6%)
iast_GLOBAL 3.767 ms [3.707 ms, 3.828 ms] 2.535 ms (205.6%)
profiling 2.348 ms [2.324 ms, 2.371 ms] 1.115 ms (90.4%)
tracing 1.876 ms [1.86 ms, 1.893 ms] 643.586 µs (52.2%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.236 ms [1.224 ms, 1.247 ms] -
iast 3.294 ms [3.244 ms, 3.344 ms] 2.058 ms (166.6%)
iast_FULL 5.798 ms [5.74 ms, 5.855 ms] 4.562 ms (369.2%)
iast_GLOBAL 3.605 ms [3.552 ms, 3.659 ms] 2.37 ms (191.8%)
profiling 2.424 ms [2.397 ms, 2.45 ms] 1.188 ms (96.2%)
tracing 1.887 ms [1.871 ms, 1.904 ms] 651.616 µs (52.7%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master brian.marks/thread-owned-tags
git_commit_date 1776110723 1776111607
git_commit_sha 9f89a0b d992fcd
release_version 1.62.0-SNAPSHOT~9f89a0b26c 1.62.0-SNAPSHOT~d992fcde3f
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1776113572 1776113572
ci_job_id 1591478954 1591478954
ci_pipeline_id 107470487 107470487
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-5-x1ij87wc 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-5-x1ij87wc 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.449 s) : 15449000, 15449000
.   : milestone, 15449000,
appsec (14.95 s) : 14950000, 14950000
.   : milestone, 14950000,
iast (18.497 s) : 18497000, 18497000
.   : milestone, 18497000,
iast_GLOBAL (18.185 s) : 18185000, 18185000
.   : milestone, 18185000,
profiling (14.914 s) : 14914000, 14914000
.   : milestone, 14914000,
tracing (14.939 s) : 14939000, 14939000
.   : milestone, 14939000,
section candidate
no_agent (15.551 s) : 15551000, 15551000
.   : milestone, 15551000,
appsec (14.914 s) : 14914000, 14914000
.   : milestone, 14914000,
iast (18.49 s) : 18490000, 18490000
.   : milestone, 18490000,
iast_GLOBAL (18.343 s) : 18343000, 18343000
.   : milestone, 18343000,
profiling (14.94 s) : 14940000, 14940000
.   : milestone, 14940000,
tracing (14.886 s) : 14886000, 14886000
.   : milestone, 14886000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.449 s [15.449 s, 15.449 s] -
appsec 14.95 s [14.95 s, 14.95 s] -499.0 ms (-3.2%)
iast 18.497 s [18.497 s, 18.497 s] 3.048 s (19.7%)
iast_GLOBAL 18.185 s [18.185 s, 18.185 s] 2.736 s (17.7%)
profiling 14.914 s [14.914 s, 14.914 s] -535.0 ms (-3.5%)
tracing 14.939 s [14.939 s, 14.939 s] -510.0 ms (-3.3%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.551 s [15.551 s, 15.551 s] -
appsec 14.914 s [14.914 s, 14.914 s] -637.0 ms (-4.1%)
iast 18.49 s [18.49 s, 18.49 s] 2.939 s (18.9%)
iast_GLOBAL 18.343 s [18.343 s, 18.343 s] 2.792 s (18.0%)
profiling 14.94 s [14.94 s, 14.94 s] -611.0 ms (-3.9%)
tracing 14.886 s [14.886 s, 14.886 s] -665.0 ms (-4.3%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~d992fcde3f, baseline=1.62.0-SNAPSHOT~9f89a0b26c
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.497 ms) : 1485, 1509
.   : milestone, 1497,
appsec (3.879 ms) : 3656, 4102
.   : milestone, 3879,
iast (2.291 ms) : 2222, 2361
.   : milestone, 2291,
iast_GLOBAL (2.33 ms) : 2260, 2399
.   : milestone, 2330,
profiling (2.1 ms) : 2046, 2155
.   : milestone, 2100,
tracing (2.101 ms) : 2047, 2155
.   : milestone, 2101,
section candidate
no_agent (1.495 ms) : 1483, 1506
.   : milestone, 1495,
appsec (3.794 ms) : 3574, 4013
.   : milestone, 3794,
iast (2.285 ms) : 2216, 2354
.   : milestone, 2285,
iast_GLOBAL (2.325 ms) : 2255, 2395
.   : milestone, 2325,
profiling (2.115 ms) : 2060, 2170
.   : milestone, 2115,
tracing (2.095 ms) : 2042, 2149
.   : milestone, 2095,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.497 ms [1.485 ms, 1.509 ms] -
appsec 3.879 ms [3.656 ms, 4.102 ms] 2.382 ms (159.1%)
iast 2.291 ms [2.222 ms, 2.361 ms] 794.201 µs (53.0%)
iast_GLOBAL 2.33 ms [2.26 ms, 2.399 ms] 832.383 µs (55.6%)
profiling 2.1 ms [2.046 ms, 2.155 ms] 603.312 µs (40.3%)
tracing 2.101 ms [2.047 ms, 2.155 ms] 603.798 µs (40.3%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.495 ms [1.483 ms, 1.506 ms] -
appsec 3.794 ms [3.574 ms, 4.013 ms] 2.299 ms (153.8%)
iast 2.285 ms [2.216 ms, 2.354 ms] 790.26 µs (52.9%)
iast_GLOBAL 2.325 ms [2.255 ms, 2.395 ms] 830.077 µs (55.5%)
profiling 2.115 ms [2.06 ms, 2.17 ms] 620.397 µs (41.5%)
tracing 2.095 ms [2.042 ms, 2.149 ms] 600.427 µs (40.2%)

@bm1549 bm1549 marked this pull request as draft April 10, 2026 19:23
@bm1549 bm1549 changed the title WIP - DO NOT REVIEW: Skip synchronized(unsafeTags) on owner-thread tag writes Skip synchronized(unsafeTags) on owner-thread tag writes Apr 10, 2026
Add three targeted concurrency tests that exercise the exact cross-thread
tag write pattern the JMH crossThread benchmark was measuring:
- crossThreadSustainedNoCrash: 8 threads × 10k setTag on same span
- ownerToSharedTransition: owner writes first, then 8 threads join
- manySpansCrossThread: 10k short-lived spans tagged from 8 threads

All pass, proving the production code handles cross-thread writes without
NPE or structural corruption.

Fix the crossThread benchmark: change SharedSpan @setup from
Level.Invocation to Level.Iteration. With Level.Invocation, 8 threads
raced to call setup() concurrently, causing NPE when state.span was
transiently null between invocations — a benchmark harness bug, not a
production code bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@dougqh dougqh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at throughput numbers, there's a modest gain that might make this change worth the complexity if we can simplify it a bit.

The two main changes that I think we should try are...

  • replacing the two volatiles with a single volatile for the owningThread
  • introducing high-order helper functions that hide the locking complexity

Then all the duplicate code can go away, and the accessing code becomes something like...
accessTags(unsafeTags -> {
...
});
We just need to make sure not to introduce a capturing lambda, so we don't incur unneeded allocation.

@dougqh
Copy link
Copy Markdown
Contributor

dougqh commented Apr 13, 2026

Looking at throughput numbers, there's a modest gain that might make this change worth the complexity if we can simplify it a bit.

The two main changes that I think we should try are...

  • replacing the two volatiles with a single volatile for the owningThread
  • introducing high-order helper functions that hide the locking complexity

Then all the duplicate code can go away, and the accessing code becomes something like... accessTags(unsafeTags -> { ... }); We just need to make sure not to introduce a capturing lambda, so we don't incur unneeded allocation.

Alas, creating a higher-order helper function in DDSpanContext proved more annoying than anticipated.
Or at least, it is hard to do so while also avoiding variable closure than I expected.

I'm back to thinking if we're going to do this change, we should do it in OptimizedTagMap. At least in OptimizedTagMap, most methods are sugar around a few methods that just work with a single TagMap.Entry.

Per review feedback, move the lock-skipping optimization from
DDSpanContext into OptimizedTagMap itself. This keeps the optimization
invisible to callers — DDSpanContext no longer needs synchronized blocks
around tag operations, and developers adding new tag operations don't
need to think about locking.

OptimizedTagMap now has a volatile Thread ownerThread field. Core methods
(getAndSet, getAndRemove, getEntry, putAll, forEach, copy, etc.) check
ownership: owner thread skips the lock, non-owner threads synchronize
and permanently transition to shared mode.

DDSpanContext changes: removed all 27 synchronized(unsafeTags) blocks,
added setOwnerThread(current) in constructor, transitionToShared()
delegates to TagMap.

Also adds @threads(8) JMH benchmark variants and 5 new concurrency
tests (mixed read/write, fuzz, value consistency, finish race,
concurrent metrics).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: core Tracer core tag: ai generated Largely based on code generated by an AI or LLM type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants