Skip to content

[no-ci] CI: Add restricted-paths-guard.yml#1878

Open
rwgk wants to merge 19 commits intoNVIDIA:mainfrom
rwgk:pr-author-org-check
Open

[no-ci] CI: Add restricted-paths-guard.yml#1878
rwgk wants to merge 19 commits intoNVIDIA:mainfrom
rwgk:pr-author-org-check

Conversation

@rwgk
Copy link
Copy Markdown
Collaborator

@rwgk rwgk commented Apr 7, 2026

Supersedes (and reuses work from) #1871

Closes #1139

Summary

This PR adds a separate workflow, restricted-paths-guard.yml, for pull requests that touch restricted paths under cuda_bindings/ or cuda_python/.

The new restricted-paths-guard.yml workflow does not implement strict enforcement. Instead, if a PR touches one of the restricted paths and we do not have a trusted signal for the PR author, the workflow assigns the Needs-Restricted-Paths-Review label for manual follow-up. The job summary shows the author login, the author association, the matched restricted paths, the trusted signal, if any, that was used, and whether the label was needed.

Trusted Signals

The current implementation treats the following as trusted signals (these are tightly controlled at the org level):

  • github.event.pull_request.author_association is MEMBER, OWNER, or COLLABORATOR

Note: This workflow currently trusts the PR author; it does not check the full provenance of every commit on the PR branch. In practice, that means a trusted signal also carries the expectation that the cuda-python maintainer who merges the PR is careful not to merge external-user commits touching cuda_bindings/ or cuda_python/.

Intended Workflow Beyond This PR

The intended direction, in a follow-on PR, is for Needs-Restricted-Paths-Review to become a merge-blocking mechanism for inconclusive cases.

The flow would then be:

  • the workflow detects that a PR touches cuda_bindings/ or cuda_python/
  • if no trusted signal is available, the workflow applies Needs-Restricted-Paths-Review
  • that label blocks merge, together with a clear message explaining that restricted-path review by the maintainer is required before merging
  • the maintainer who merges the PR reviews the restricted-path changes specifically for the policy
  • if the PR is acceptable, they remove the Needs-Restricted-Paths-Review label manually and merge

In other words, the workflow should automatically clear the easy, common cases and route the remaining ones into an explicit manual review step.

Testing

See commit 02edff4: the pull_request trigger and a dummy change under cuda_bindings/ were used for testing. This was undone in the commit.

See below: Example Summary view of a workflow run (screenshot)

Report restricted-path author gating as its own workflow so contributors see the organization requirement as soon as a PR touches cuda_bindings/ or cuda_python/.

Made-with: Cursor
@rwgk rwgk added this to the cuda.bindings next milestone Apr 7, 2026
@rwgk rwgk self-assigned this Apr 7, 2026
@rwgk rwgk added P1 Medium priority - Should do CI/CD CI/CD infrastructure github_actions Pull requests that update GitHub Actions code labels Apr 7, 2026
@copy-pr-bot
Copy link
Copy Markdown
Contributor

copy-pr-bot bot commented Apr 7, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

rwgk added 3 commits April 7, 2026 15:27
Keep the lightweight PR check runnable on GitHub-hosted runners by avoiding the unsupported gh api --slurp --jq combination when scanning changed files.

Made-with: Cursor
Include the exact restricted-path matches in the job summary so contributors and reviewers can see what triggered the author organization check.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 7, 2026

I dug into whether we can determine conclusively from GitHub Actions if a PR author is a member of the NVIDIA GitHub organization.

  • github.event.pull_request.author_association is the built-in signal available directly in the PR payload. In this context:
    • MEMBER means the author is a member of the organization that owns the repository.
    • COLLABORATOR is a different bucket: someone with repository-level collaboration access, potentially an outside collaborator.
    • OWNER is the repository/organization owner bucket.
  • GitHub does expose organization-membership APIs that are more conclusive:
    • GET /orgs/NVIDIA/members/{username} returns 204 if the user is an org member and 404 if not, but only when the requester is itself an org member; otherwise it can return 302.
    • GET /orgs/NVIDIA/memberships/{username} returns membership details such as state and role, but also requires the requester to be an organization member.
  • The practical issue is credentials. The default workflow GITHUB_TOKEN is repository-scoped; it is not something I would rely on for conclusive organization-membership reads. Without introducing a separate PAT or GitHub App token with org-readable permissions, author_association is the only built-in signal readily available in Actions.
  • So the tradeoff looks like this:
    • no extra credential: use author_association, where MEMBER is the closest built-in proxy for "NVIDIA org member"
    • conclusive org-membership check: add a dedicated org-aware token and query the organization-membership API from a safe workflow context, most likely pull_request_target, without checking out or executing PR code

So COLLABORATOR is a policy decision, not a technical synonym for MEMBER. If we want to allow repo collaborators, that should be an explicit allowlist choice rather than something implied by GitHub's membership model.

rwgk added 2 commits April 7, 2026 16:40
Query NVIDIA org membership directly for restricted-path PRs so the workflow can distinguish members, non-members, and inconclusive cases, label PRs that need manual review, and show the matched files. Keep the trigger on pull_request temporarily so the new check can be exercised before switching back to pull_request_target.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

Reverting 42ba3f8 because the direct GET /orgs/NVIDIA/members/{username} experiment did not give us meaningful org-membership information with the default workflow token.

What we learned from the experiments:

  • The restricted-path detection itself works fine. The workflow can reliably detect changes under cuda_bindings/ / cuda_python/, including rename cases, and the summary output is useful.
  • The problem is specifically the org-membership query. Even for @rwgk, who is definitely in the NVIDIA GitHub org, the workflow did not get a conclusive "member" result from GET /orgs/NVIDIA/members/{username} when using the workflow GITHUB_TOKEN.
  • In other words, with the credentials we currently have in Actions, this query is not a trustworthy source of truth for "is the PR author in the NVIDIA org?".
  • I also did a systematic pass over .github/ and ci/ to see if we already had any more privileged PAT / GitHub App credentials we could reuse. We do not: everything there is using the standard github.token / secrets.GITHUB_TOKEN.
  • The label-based "inconclusive" fallback was secondary. It also ran into permission limits in the temporary pull_request test context, but the more important issue is that the membership query itself is not giving actionable answers with the default token.

Given the current goal of finding low-hanging fruit without introducing a dedicated PAT / GitHub App token, I think the direct org-membership query is not worth keeping in this PR.

If we revisit this later, the realistic options seem to be:

  • add a dedicated org-readable credential and do a true org-membership check, or
  • stay with a simpler policy/heuristic that does not depend on conclusive org-membership reads from Actions.

rwgk added 5 commits April 7, 2026 17:07
Treat repository collaborators as trusted for restricted-path changes so the workflow blocks only authors with no collaborator, member, or owner association.

Made-with: Cursor
Use OWNER and MEMBER associations plus public NVIDIA membership as true-positive signals, and add a review label instead of failing when the workflow cannot confirm the author automatically.

Made-with: Cursor
Replace the public-membership probe with a checked-in allowlist so restricted-path PRs can still get a known true positive during rollout while labeling authors that need manual review.

Made-with: Cursor
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

We explored a few different ways to get a trustworthy automatic "yes" for PR authors who touch cuda_bindings/ / cuda_python/, and eventually settled on the current checked-in INTERNAL_AUTHOR_ALLOWLIST approach (commit 539461f) as the least bad rollout option for now.

  • github.event.pull_request.author_association was not sufficient by itself. In practice, COLLABORATOR is a repository-level relationship, not a reliable synonym for "external", and it does not cleanly tell us whether the PR author is in the NVIDIA GitHub org.
  • We also revisited the org-membership APIs (GET /orgs/{org}/members/{username} and the related membership endpoints). Those are attractive in theory, but with the credentials available in this workflow they did not give us a dependable organization-membership signal that we were comfortable enforcing on.
  • We checked whether public GitHub profile data could help, for example by looking at public organizations on https://github.com/{username} or via the public-org APIs. That only exposes public memberships, so it is not a reliable source of truth for private org membership.
  • We then tried the narrower endpoint GET /orgs/NVIDIA/public_members/{username} as a possible true-positive-only signal. Technically it behaves well (204 means yes, 404 means not a public member), but for @rwgk it was not a useful positive signal, so it would still leave many legitimate internal contributors in the "manual review needed" bucket.

Given those constraints, the current PR state is a pragmatic compromise:

  • keep MEMBER and OWNER as built-in true-positive signals
  • add a checked-in INTERNAL_AUTHOR_ALLOWLIST for known internal users during rollout (currently just rwgk)
  • if a PR touches cuda_bindings/ / cuda_python/ and we do not have a true-positive signal, assign the Check-PR-author-ORG label

So the reason we ended up here is that the other approaches were either too lossy (author_association), too dependent on private/public visibility (public_members / profile data), or not dependable enough with the credentials available in the workflow.

@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 8, 2026

Archiving OUTDATED PR description posted originally 2026-04-07 03:04 PM PDT:


Main open policy question

  • Is it acceptable for a COLLABORATOR to modify files under cuda_bindings/ or cuda_python/?
  • The current implementation says yes.
  • If the answer should instead be no, the workflow can be changed back to allow only MEMBER|OWNER.

Current state

  • This PR adds a separate workflow, pr-author-org-check.yml, so the restricted-path author check shows up as its own CI line item.
  • The workflow currently watches changes under cuda_bindings/ and cuda_python/.
  • It uses github.event.pull_request.author_association as the only author signal.
  • The current allowlist is:
    • COLLABORATOR
    • MEMBER
    • OWNER
  • If a PR touches one of the restricted paths and the author association is not in that allowlist, the workflow fails.
  • The job summary shows:
    • the author login
    • the author association
    • whether restricted paths were touched
    • the exact matched file(s) that triggered the check

What was tried and backed out

  • I also explored a more ambitious version that queried GET /orgs/NVIDIA/members/{username} directly.
  • In practice, with the default workflow token, that did not give trustworthy org-membership information even for known NVIDIA org members, so that approach was reverted.
  • I also checked systematically through .github/ and ci/ for an existing stronger PAT / GitHub App credential we could reuse, and did not find one.

What is already working

  • Pass when the PR does not touch cuda_bindings/ or cuda_python/.
  • Fail when a restricted path is touched by a non-allowed association.
  • Pass when a restricted path is touched by an allowed association.
  • The summary output is useful and shows the matched restricted file(s).

Notes before merge

  • The trigger is intentionally still pull_request right now so the workflow can be exercised from this PR branch. Before marking the PR ready for review / merging, change the trigger back to pull_request_target.

  • .github/workflows/pr-author-org-check.yml:76 now allows COLLABORATOR, but the failure text at .github/workflows/pr-author-org-check.yml:92 still says “Only NVIDIA organization members may modify...”. That message is no longer accurate and will misstate the actual policy to contributors and reviewers.

  • .github/workflows/pr-author-org-check.yml:39 matches files via both filename and previous_filename, but the emitted summary list only prints .filename at .github/workflows/pr-author-org-check.yml:45. A rename out of cuda_bindings/ or cuda_python/ will still trigger correctly, but the summary can show only the destination path and make the match look surprising.

rwgk added 2 commits April 8, 2026 13:14
Replace the remaining "true positive" terminology with "trusted signal" so the workflow summary and implementation language match the current manual-review design.

Made-with: Cursor
Show both the previous and current path in the PR author check summary so rename-triggered matches are understandable during manual review.

Made-with: Cursor
rwgk added 5 commits April 9, 2026 14:46
Remove the temporary checked-in allowlist and treat COLLABORATOR as a trusted author signal again so the workflow matches the team policy.

Made-with: Cursor
Use shorter workflow and job names so the GitHub checks list is easier to scan while keeping the manual-review intent clear.

Made-with: Cursor
Use a maintainer-facing label name that better describes the follow-up action required before merging inconclusive restricted-path changes.

Made-with: Cursor
Use workflow, job, and summary names that match the current restricted-paths guard behavior and read more clearly in the GitHub checks UI.

Made-with: Cursor
@rwgk rwgk changed the title [no-ci] CI: Add pr-author-org-check.yml [no-ci] CI: Add restricted-paths-guard.yml Apr 9, 2026
- undo cuda_bindings/pyproject.toml # XXX DUMMY CHANGE XXX
- change trigger from pull_request (useful only for testing this PR) to pull_request_target
@rwgk
Copy link
Copy Markdown
Collaborator Author

rwgk commented Apr 10, 2026

Example Summary view of a workflow run:

Screenshot 2026-04-09 at 21 42 00

@rwgk rwgk force-pushed the pr-author-org-check branch from 7fb4107 to 02edff4 Compare April 10, 2026 04:45
@rwgk rwgk added P0 High priority - Must do! and removed P1 Medium priority - Should do labels Apr 10, 2026
@leofang leofang removed the github_actions Pull requests that update GitHub Actions code label Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI/CD CI/CD infrastructure P0 High priority - Must do!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CI: Autogenerate a PR comment for contributors making changes to the cuda_bindings/ directory

3 participants