fix: auto-weight grouped rubrics shorthand by criteria count #1099

Draft
christso wants to merge 4 commits into main from fix/1098-rubrics-shorthand-auto-weight

@christso
Collaborator

Closes #1098

Problem

When string shorthand assertions are mixed with explicit graders, the internal grouping creates a hidden weight asymmetry. A user who writes 4 assertions expects equal weight per line:

assertions:
  - Identifies the undefined access   # user thinks: 1/4 weight
  - Suggests a null-safe fix           # user thinks: 1/4 weight
  - Explains the root cause            # user thinks: 1/4 weight
  - type: contains
    value: "null"                      # user thinks: 1/4 weight

But the framework creates 2 graders, rubrics (weight 1) and contains (weight 1), so contains gets 50% of the score instead of 25%.
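The arithmetic behind that asymmetry can be sketched as follows. This is a minimal illustration that assumes the overall score is a weighted average, so each grader's share is its weight divided by the total weight. The `Grader` interface and `scoreShares` helper are hypothetical names for this example, not part of the framework.

```typescript
// Hypothetical shape for illustration; the project's real grader type may differ.
interface Grader {
  name: string;
  weight: number;
}

// Assumption: the overall score is a weighted average, so a grader's share
// of the score is its weight divided by the sum of all weights.
function scoreShares(graders: Grader[]): Record<string, number> {
  const total = graders.reduce((sum, g) => sum + g.weight, 0);
  const shares: Record<string, number> = {};
  for (const g of graders) {
    shares[g.name] = total === 0 ? 0 : g.weight / total;
  }
  return shares;
}

// Three string criteria collapse into a single rubrics grader with weight 1,
// so the lone contains grader ends up with half of the score.
const before = scoreShares([
  { name: "rubrics", weight: 1 },
  { name: "contains", weight: 1 },
]);
console.log(before); // { rubrics: 0.5, contains: 0.5 }
```

Under this assumption, each of the three string criteria is effectively worth about 1/6 of the score, while the single contains line is worth 1/2.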

Fix

One-line change in the string shorthand grouping logic (evaluator-parser.ts): when string criteria are grouped into a rubrics grader, set its weight to criteria.length. Each user-visible assertion then contributes equal weight, regardless of how many strings are grouped together.

Before: rubrics(w=1) + contains(w=1) → 50/50
After: rubrics(w=3) + contains(w=1) → 75/25 (each of 4 lines = 25%)
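A rough sketch of the grouping rule follows. It is not the actual code in evaluator-parser.ts: the `Assertion` and `ParsedGrader` types, the `groupAssertions` function, the grader ordering, and the default weight of 1 for explicit graders are all assumptions made for illustration.

```typescript
// Hypothetical types for illustration; not the project's actual definitions.
type Assertion =
  | string
  | { type: string; weight?: number; [key: string]: unknown };

interface ParsedGrader {
  type: string;
  weight: number;
  criteria?: string[];
}

function groupAssertions(assertions: Assertion[]): ParsedGrader[] {
  // Collect every string shorthand assertion into one list of criteria.
  const criteria = assertions.filter(
    (a): a is string => typeof a === "string",
  );
  const graders: ParsedGrader[] = [];

  if (criteria.length > 0) {
    // The fix: weight the grouped rubrics grader by its number of criteria
    // instead of leaving it at the default weight of 1.
    graders.push({ type: "rubrics", weight: criteria.length, criteria });
  }

  // Explicit graders keep their own weight (assumed to default to 1).
  for (const a of assertions) {
    if (typeof a !== "string") {
      graders.push({ type: a.type, weight: a.weight ?? 1 });
    }
  }
  return graders;
}

const graders = groupAssertions([
  "Identifies the undefined access",
  "Suggests a null-safe fix",
  "Explains the root cause",
  { type: "contains", value: "null" },
]);
const totalWeight = graders.reduce((sum, g) => sum + g.weight, 0);
console.log(graders.map((g) => `${g.type}: ${g.weight}/${totalWeight}`));
// [ 'rubrics: 3/4', 'contains: 1/4' ]
```

With the four assertions from the example, the rubrics grader carries 3 of the 4 weight units, so each written line accounts for a quarter of the score.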

Behaviour

  • Mixed assertions: rubrics weight scales with criteria count ✓
  • All-string assertions: weight is set but has no effect (sole grader) ✓
  • Explicit type: rubrics with weight: set: unaffected (different code path) ✓
  • Explicit weight: on string shorthand: not possible by design — use type: rubrics for that

🤖 Generated with Claude Code

christso and others added 2 commits April 14, 2026 07:40
When string shorthand assertions are mixed with other explicit graders,
the rubrics grader created from the strings now gets weight = number of
criteria, making each user-visible assertion contribute equal weight to
the overall score.

Before: [contains, "A", "B", "C"] → contains(w=1) + rubrics(w=1) → 50/50
After:  [contains, "A", "B", "C"] → contains(w=1) + rubrics(w=3) → 25/75

The shorthand abstraction is now transparent — users who write N string
criteria alongside M explicit graders get equal weight per visible line,
without needing to know about internal grader grouping.

Closes #1098

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Apr 14, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 20efd94
Status: ✅  Deploy successful!
Preview URL: https://16a9af3e.agentv.pages.dev
Branch Preview URL: https://fix-1098-rubrics-shorthand-a.agentv.pages.dev
