feat(webapp): consolidate auth path + add comprehensive auth tests#3499
feat(webapp): consolidate auth path + add comprehensive auth tests#3499matt-aitken wants to merge 80 commits intomainfrom
Conversation
🦋 Changeset detectedLatest commit: 5063f84 The changes in this PR will be included in the next version bump. This PR includes changesets to release 31 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new RBAC system and plugin surface: a Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
|
…Client from public interface - Replace buildBearerAbility/buildSessionAbility with authenticateBearer/authenticateSession - Add RbacEnvironment, RbacUser, BearerAuthResult, SessionAuthResult types - Remove PrismaClient from @trigger.dev/plugins interface (no Prisma crossing repo boundary) - Remove @trigger.dev/database dependency and api-extractor from plugins package - Switch plugins build to tsup --dts, delete api-extractor.json and tsconfig.dts.json - OSS fallback imports PrismaClient from @trigger.dev/database directly - OSS loader passes helpers-only to enterprise plugin, (prisma, helpers) to fallback - Add rbac.server.ts singleton to webapp - PoC: migrate admin.concurrency route to rbac.authenticateSession + canSuper()
…wJWT option, buildJwtAbility
Adds a `forceFallback` option to the RBAC loader so tests (and any other consumer that sets RBAC_FORCE_FALLBACK=1) pin authentication to the OSS fallback regardless of whether the enterprise plugin is installed. - internal-packages/rbac: LazyController and RoleBaseAccess.create now accept RbacCreateOptions.forceFallback. When true, load() skips the dynamic import of @triggerdotdev/plugins/rbac and constructs RoleBaseAccessFallback directly. - apps/webapp env: new RBAC_FORCE_FALLBACK BoolEnv, threaded into app/services/rbac.server.ts. - testcontainers webapp: set RBAC_FORCE_FALLBACK=1 so e2e tests exercise the fallback deterministically. - api-auth.e2e.test.ts: covers the fallback wiring end-to-end via the existing /admin/concurrency route, which already uses rbac.authenticateSession + ability.canSuper().
Close the coverage gap identified in the TRI-8716 audit before TRI-8719 swaps apiBuilder.server.ts to rbac.authenticateBearer. All new tests run against the legacy authenticateApiRequestWithFailure / authenticateApiRequestWithPersonalAccessToken path and must stay green after the migration. - Action requests (createActionApiRoute) via POST /api/v1/idempotencyKeys/:key/reset: * Valid private API key → passes auth (400 on zod body validation, not 401/403). * Missing Authorization → 401. * Invalid API key → 401. - JWT on the same action route (allowJWT: true, superScopes write:runs, admin): * JWT with matching scope → passes auth. * JWT with read-only scope → 403. - Personal access tokens (createLoaderPATApiRoute) via GET /api/v1/projects/:ref/runs: * Missing Authorization → 401. * API key (tr_dev_*) on PAT-only route → 401. * Wrong-prefix or malformed PAT → 401. * Well-formed but unknown PAT → 401. * Revoked PAT → 401. * Valid PAT on unknown project → 404 (auth passes). - Edge case: valid API key whose project.deletedAt is set → 401. Also fix the TRI-8715 redirect assertion: the webapp sends clients to /login?redirectTo=... so compare by pathname rather than exact string. New helper test/helpers/seedTestPAT.ts seeds a User + PersonalAccessToken row using the same hashing/encryption scheme the webapp uses (shared test ENCRYPTION_KEY), so the webapp subprocess can authenticate against the seeded token. otu and realtime.skipColumns propagation are deferred: they're only observable via real trigger / realtime-stream flows, which need workers/streams enabled and are out of scope for a coverage PR. The migration preserves those fields via BearerAuthResult.jwt; dedicated coverage can ride with TRI-8719 if needed.
Close the resource-scoped JWT coverage gap before TRI-8719 swaps
apiBuilder to rbac.authenticateBearer. Target:
POST /api/v1/waitpoints/tokens/:waitpointFriendlyId/complete — allowJWT,
resource: { waitpoints: params.waitpointFriendlyId }, superScopes:
[write:waitpoints, admin].
New helper test/helpers/seedTestWaitpoint.ts seeds a Waitpoint in
COMPLETED status so the handler short-circuits once auth passes, keeping
the 200 assertion independent of run-engine workers.
7 new tests exercise the legacy checkAuthorization scope algebra that
the migration must preserve:
- scope matches exact resource id → 200
- scope targets a different id of the same type → 403
- type-level scope (no id) grants all resources of that type → 200
- read-only scope on a write route → 403
- scope targets a different resource type → 403
- admin super-scope → 200 (legacy super-scope listing)
- unrelated type scope with no super-scope match → 403
Without these, the only JWT coverage was coarse type-level allow/deny
against routes whose resource callbacks returned () => 1 or () => ({}),
leaving resource-id matching entirely untested end-to-end.
Lock in the legacy checkAuthorization behaviours that TRI-8719 must
preserve once it swaps in rbac.authenticateBearer + ability.can.
Three tests in a new describe block 'JWT bearer auth — behaviours to
preserve through TRI-8719':
- Custom action route: type-level write:tasks JWT on
POST /api/v1/tasks/:taskId/trigger (action: trigger) → auth passes
today via exact superScope match. Must keep passing after TRI-8719
via the ACTION_ALIASES map (trigger ← write).
- Multi-key resource: read:tags:<tag> JWT on /api/v1/runs/:runId/trace
where the seeded run has that tag → auth passes today because
legacy checks each resource key. Must keep passing after TRI-8719
via ability.can's array-resource form.
- Multi-key resource: read:batch:<friendlyId> JWT on
/api/v1/runs/:runId/trace where the seeded run is in that batch →
same rationale as the tags case.
Dropped the planned empty-resource test: researching it surfaced that
legacy checkAuthorization denies empty-resource requests BEFORE the
super-scope check runs, so api.v1.batches.ts and idempotencyKeys reset
currently reject all JWTs despite allowJWT: true. TRI-8719's plan
(adding explicit { type: 'runs' }) is an intentional improvement, not
a preservation — documented in the TRI-8719 description comment.
New helper test/helpers/seedTestRun.ts seeds a minimal TaskRun (and,
optionally, an associated BatchTaskRun) that ApiRetrieveRunPresenter's
findRun can resolve for multi-key resource tests. The tests only
assert 'auth passes' (!== 401, !== 403) — the handler's downstream
behaviour (which may fail in a worker-less test env) isn't relevant
to the auth-layer contract.
Foundational changes before swapping apiBuilder to rbac.authenticateBearer. No behaviour change yet — apiBuilder is still on the legacy path. Array resources: - @trigger.dev/plugins RbacAbility.can now accepts RbacResource | RbacResource[]. Array form means 'grant access if any element passes', preserving the legacy checkAuthorization multi-key semantic once TRI-8719 completes. - internal-packages/rbac ability.ts: permissive/super/deny pass through unchanged; buildJwtAbility iterates the array and short-circuits on first match. Action alias wrapper (internal-packages/rbac/src/index.ts): - ACTION_ALIASES map + withActionAliases function. Wraps an underlying RbacAbility so that can(action, resource) retries with alias actions when the direct check fails. Currently: trigger, batchTrigger, update are all satisfied by a scope whose action is write — matching legacy superScope behaviour for route.action values that don't align with scope prefixes. - LazyController wraps the ability it gets from authenticateBearer / authenticateSession. authenticateAuthorize* stop delegating to the underlying's own Authorize methods (that would bypass the wrapper) and instead do the inline ability.can check against the wrapped ability. The enterprise plugin (TRI-8720) does not need to know about aliases — the wrapper applies uniformly regardless of which ability came back. Tests: - ability.test.ts: +4 tests for array resource form (31 total in file). - loader.test.ts: +11 tests for withActionAliases (direct match, alias retry for trigger/batchTrigger/update, id-scoped retry, admin passes, array form retry, canSuper delegation). - Unit suite: 31 tests, all passing. - Webapp typecheck: clean.
…I-8719 Phase B)
Swap all three apiBuilder call sites (loader, action, multi-method) from
authenticateApiRequestWithFailure + checkAuthorization to a single RBAC
plugin bridge. 30 route files migrated in lockstep — drop the
authorization.superScopes option, convert resource callbacks to return
RbacResource or RbacResource[] in the new shape.
Infrastructure:
- apiBuilder: new authenticateRequestForApiBuilder helper funnels all
three builders through rbac.authenticateBearer and follow-up
findEnvironmentById to rebuild the legacy ApiAuthenticationResultSuccess
shape handlers still expect (no handler-facing changes).
- @internal/rbac: re-export RbacAbility, RbacResource from
@trigger.dev/plugins so webapp only depends on @trigger.dev/rbac.
Route-file changes (Risk mitigations from the ticket):
- Custom actions (trigger, batchTrigger, update) unchanged at the route
level — the ACTION_ALIASES wrapper from Phase A handles them
transparently.
- Multi-key runs routes (api.v3.runs.$runId, realtime.v1.runs.$runId,
realtime.v1.streams.$runId.$streamId, api.v1.runs.$runId.events /
.spans.$spanId / .trace, realtime.v1.streams.$runId.input.$streamId
second block, plus the batch-array routes) now return
RbacResource[] — any resource match grants access. Undefined batch
ids are filtered out to avoid accidentally matching a type-level
read:batch scope with no id.
- Empty-resource routes (api.v1.batches, api.v1.idempotencyKeys.$key.reset)
now return { type: 'runs' } — JWTs with read:runs / write:runs start
working where they were previously denied by the legacy
empty-resource short-circuit. Intentional improvement, strict
superset of today's accept set.
- Search-params routes (realtime.v1.runs, api.v1.runs) return an array
with a collection-level { type: 'runs' } plus any filtered tag/task
entries so JWTs that work today continue to work.
Verification:
- pnpm run typecheck --filter webapp: clean.
- pnpm run test --filter @internal/rbac: 31 unit tests pass (wrapper +
array-resource semantics).
- E2E suite (test/api-auth.e2e.test.ts): all 31 tests pass on the new
code path — the three pre-migration 'behaviours to preserve' tests
(type-level write:tasks triggers a task, read:tags:<tag> reaches a
run with that tag, read:batch:<id> reaches a run in that batch) are
still green post-swap.
Packaging:
- .changeset/rbac-plugin-array-resources.md: minor bump for
@trigger.dev/plugins (array-resource overload on RbacAbility.can).
- .server-changes/rbac-apibuilder-migration.md: webapp-only note.
Add a session-auth route builder analogous to apiBuilder.server.ts that
routes dashboard auth through rbac.authenticateSession and runs the
ability check (canSuper or can) before the handler runs. Routes that
only need a logged-in user (no authorisation) keep using requireUser /
requireUserId — the builder is opt-in for routes with explicit auth.
Builder shape:
dashboardLoader({ authorization: { requireSuper: true } }, async ({ user, ability }) => ...)
dashboardLoader({ authorization: { action, resource } }, ...)
dashboardAction(...)
Auth failure throws a redirect Response so the success-path return type
stays narrow (useTypedLoaderData<typeof loader>() picks up the handler's
TypedJsonResponse). Optional context callback feeds organizationId /
projectId to authenticateSession when needed (enterprise-only — fallback
ignores context today).
Migrated 14 platform admin routes from
`requireUser` + `if (!user.admin)` to dashboardLoader / dashboardAction
with requireSuper: true:
admin.tsx
admin._index.tsx
admin.concurrency.tsx
admin.feature-flags.tsx
admin.notifications.tsx
admin.orgs.tsx
admin.data-stores.tsx
admin.back-office.tsx
admin.back-office._index.tsx
admin.back-office.orgs.$orgId.tsx
admin.llm-models._index.tsx
admin.llm-models.$modelId.tsx
admin.llm-models.new.tsx
admin.llm-models.missing._index.tsx
admin.llm-models.missing.$model.tsx
Routes that have admin-only sub-features (e.g. show-extra-fields-if-admin
on otherwise public routes) stay on requireUser. Migration of those is a
separate concern — they don't gate access on admin, they just branch
display.
Behavioural change: action handlers that previously threw
`new Response('Unauthorized', { status: 403 })` on non-admins now redirect
to / along with the loader. Uniform behaviour, but XHR fetchers that
expected a 403 status would now follow the redirect instead. The admin
pages migrated here don't appear to have XHR fetchers that depend on the
403, but worth flagging.
Verification:
- pnpm run typecheck --filter webapp: clean.
- pnpm run test --filter @internal/rbac: 31 unit tests pass.
- E2E suite: all 31 tests pass — including the
/admin/concurrency redirect test (now exercising the new builder).
Widen check.resource on the convenience methods to RbacResource | RbacResource[] so they match RbacAbility.can. Previously the interface declared only RbacResource on these methods, which left an inconsistency — anyone wanting to pass an array of resources had to call authenticateBearer + ability.can manually instead of using the convenience method. Surfaced when reviewing the cloud enterprise controller (TRI-8720), which had unilaterally widened its implementation to RbacResource[] and would have failed type-check if any caller routed an array through the typed interface. Updated: - packages/plugins/src/rbac.ts — RoleBaseAccessController interface. - internal-packages/rbac/src/fallback.ts — RoleBaseAccessFallback matches. - LazyController already uses Parameters<...> and tracks the interface, so it picks up the change automatically. @trigger.dev/plugins gets a minor bump (changeset added). Verification: - pnpm run typecheck across @trigger.dev/plugins, @trigger.dev/rbac, webapp — clean. - pnpm run test --filter @internal/rbac — 31 unit tests pass. - e2e suite unaffected (no signature change at runtime — pure type widening).
…suite (TRI-8732)
Foundation for TRI-8731. The smoke api-auth.e2e.test.ts spins up its own
webapp + Postgres container per test file (~30s startup each). The
comprehensive matrix would have 12+ files, so per-file startup would
dominate runtime. Instead this harness boots one container for the whole
suite and rapid-fires tests across multiple files.
Layout:
- vitest.e2e.full.config.ts — globalSetup + pool: forks. Picks up
test/**/*.e2e.full.test.ts.
- test/setup/global-e2e-full-setup.ts — calls startTestServer() once,
provides baseUrl + databaseUrl to test workers via vitest's
provide()/inject() API. Tears down on suite end.
- test/helpers/sharedTestServer.ts — getTestServer() pulls the provided
values, constructs a per-worker PrismaClient, exposes
{ webapp, prisma } matching the existing TestServer shape.
- test/helpers/seedTestSession.ts — produces a Cookie header value
compatible with the webapp's createCookieSessionStorage config so
dashboard tests (TRI-8742) can authenticate as a seeded user.
- test/auth-api.e2e.full.test.ts, test/auth-dashboard.e2e.full.test.ts,
test/auth-cross-cutting.e2e.full.test.ts — three file shells with
top-level describe blocks. Family subtasks (TRI-8733+) add nested
describes inside.
- .github/workflows/e2e-webapp-auth-full.yml — workflow_dispatch +
nightly schedule + pull_request paths-filtered (only triggers on PRs
touching auth-relevant files).
- test/README.md — documents the unit / smoke-e2e / full-e2e split.
Touching @internal/testcontainers:
- TestServer interface gains databaseUrl so per-worker PrismaClient
reconstruction has the connection string without going through the
serialised prisma instance (which can't cross worker boundaries).
- utils.ts — assertNonNullable's vitest import was previously eager at
module load. globalSetup runs outside any vitest worker, so that
eager init crashed (createExpect needs worker state). Switched to a
lazy require('vitest') inside the function body. The function still
runs in test workers where worker state exists.
- logs.ts — TaskContext changed to type-only import for the same
module-load-time concern (transitively imported by webapp.ts).
Verification:
- pnpm run typecheck across @internal/testcontainers + webapp — clean.
- pnpm exec vitest run --config vitest.e2e.full.config.ts —
3/3 tests pass in 19.37s with one observed container startup.
Subsequent family subtasks add describes with no per-file container
cost.
The placeholder it() in each file (just hits /healthcheck or counts
users) gets removed by the family subtasks as they add real coverage.
Mutation methods on RoleBaseAccessController now return discriminated
Result unions instead of throwing on expected error paths:
- RoleMutationResult — { ok: true; role: Role } | { ok: false; error }
for createRole, updateRole.
- RoleAssignmentResult — { ok: true } | { ok: false; error: string }
for deleteRole, setUserRole, removeUserRole, setTokenRole,
removeTokenRole.
The cloud webapp surfaces the 'error' strings directly to users
(system role edits, plan-tier gating, validation conflicts), so a
thrown exception now signals only an unexpected failure (DB outage,
bug). Read methods (getUserRole, getTokenRole, allRoles,
allPermissions) are unchanged.
OSS fallback returns { ok: false, error: 'RBAC plugin not installed' }
for every mutation — matches the prior behaviour (createRole/updateRole
already threw with this message; the others were silent no-ops, which
made misuse hard to detect). The LazyController in @internal/rbac
forwards the new return types verbatim. Changeset: patch.
Customer-facing surface: only public type widening of mutation method
return types — no runtime behaviour change for OSS callers (they get
a Result error instead of a thrown error or silent no-op; both indicate
'do not call these without the enterprise plugin').
The dev build was crashing with 'dashboardLoader is not a function'
on first navigation to any /admin route, then the browser would
hard-reload back to the previous page. Symptom: clicking 'Admin
dashboard' (or anywhere /@ → /admin chain) flashed admin then bounced
back, with no obvious cause server-side (every loader returned 200).
Root cause: routes export their loader at module top-level via the
wrapper:
export const loader = dashboardLoader(...);
The factory call evaluates at module load. dashboardBuilder lived in
a .server.ts file, which Remix strips from the client bundle. In the
prod build the loader export + its RHS are both tree-shaken, so the
import is unreferenced and removed — fine. In the dev build the call
is preserved (HMR/source-map friendliness) and resolves
dashboardLoader to undefined on the client, throwing on module load.
Remix's recovery is to reload the page, which lands on the previous
URL because that's the last known-good navigation entry.
Fix: split the wrapper so the import target exists on both server
and client.
- dashboardBuilder.ts (no .server) — exports types + dashboardLoader /
dashboardAction wrappers. Wrappers return closures whose bodies
dynamic-import the server impl. The closure body never runs on the
client, so the dynamic import only resolves at server runtime.
Client just sees a function that returns another function — the
top-level call now works there.
- dashboardBuilder.server.ts — slimmed down to authenticateAndAuthorize
+ the redirect/authorization helpers. Imported via dynamic import
from the wrapper. Stays out of the client bundle.
Routes drop the .server suffix on the import path. No change to the
route's loader/action surface. Verified end-to-end via Chrome
DevTools: /@ → /admin chain renders the admin dashboard cleanly,
no console errors, no extra document fetch back to the org URL.
…ting (TRI-8748) Wire RBAC into the existing org Teams page (settings/team). OSS plugin - Adds RoleBaseAccessController.getAssignableRoleIds(orgId) — the subset of allRoles(orgId) that can be assigned right now. Returns [] in the OSS fallback (consistent with allRoles also returning [] there). Pure UI affordance: server-side enforcement remains setUserRole's lookupAssignableRole. Public package change with patch-level changeset. Enterprise plugin - Implements getAssignableRoleIds against PlansClient: system roles pass through isSystemRoleAssignable (Owner/Admin always; Member / Viewer require Pro+); custom roles require canCreateCustomRoles (Enterprise tier). Mirrors the gates in setUserRole so UI and server agree. Webapp - TeamPresenter now also returns rbac.allRoles(orgId), getAssignableRoleIds(orgId), and per-member role assignments. Per-member is N+1 today (low-traffic settings page); a batched lookup is filed as a future optimisation. - Route migrated from requireUserId to dashboardLoader / dashboardAction via the split builder (commit a2cdbfb). Loader gates on read:members; action stays permissive at the wrapper level so the existing remove/leave + purchase-seats flows keep working with their per-intent checks. New set-role intent gates on manage:members and calls rbac.setUserRole — surfaces the Result error inline next to the dropdown when the server rejects (system role rename, plan-gated assignment, foreign-org role). - UI: native select next to each member, defaults to that member's current role. Options not in assignableRoleIds render disabled with an (upgrade) suffix. Auto-submits on change via fetcher. Invite + Remove buttons hide/disable when canManageMembers is false (server-side ability check pre-computed in the loader). Self-leave is always allowed regardless of manage:members. Verification - Typecheck clean across @internal/rbac, webapp, enterprise/plugins, enterprise/db, packages/plans. - Browser smoke test deferred until webapp dev server is running.
Adds a Role `<Select>` to the invite form. Dropdown options are
filtered by:
1. The inviter's own role — strictly below their level (Owner can
pick any of the 4; Admin can pick Developer or Member; Developer/
Member don't see the picker because they can't invite anyway).
2. The org's plan tier — `rbac.getAssignableRoleIds(orgId)` already
reflects this (Free/Hobby = Owner+Admin only, Pro+ unlocks
Developer+Member).
The picker is hidden entirely when `rbac.allRoles(orgId)` returns []
(OSS deployments with no plugin installed) — legacy invite path is
unchanged for self-hosters.
Schema: nullable `OrgMemberInvite.rbacRoleId text` column. On accept,
if it's set, `acceptInvite` calls the plugin's `setUserRole` after
the OrgMember insert (outside the Prisma transaction since the plugin
uses a separate Drizzle / postgres-js connection — same compensating
pattern as PAT-role assignment). If it's null, the runtime fallback
derives a role from the legacy `OrgMember.role` write at first
auth — no behaviour change.
Server-side validation in the action layer rejects:
- rbacRoleId not in `getAssignableRoleIds(orgId)` (plan-tier check).
- rbacRoleId at or above the inviter's own level (the
canInviteAtRole ladder).
Legacy `OrgMemberInvite.role` enum (ADMIN/MEMBER) is still written
based on the chosen RBAC role — Owner/Admin → "ADMIN", Developer/
Member → "MEMBER" — so OSS auth keeps working.
Verified:
- typecheck clean
- 162/162 OSS e2e.full
- 7/7 cloud enterprise e2e.full
Adds a new `systemRoleIds(): Promise<SystemRoleIds | null>` method on
the `RoleBaseAccessController` interface. Returns
`{ owner, admin, developer, member }` from any installed plugin and
`null` from the default fallback (matches the `allRoles → []`
semantics — there are no seeded roles to refer to in OSS).
Drops the `SYSTEM_ROLE_IDS` constant from `~/services/rbac.server` so
consumers can't reach for hardcoded role-id strings. Updates the four
sites that used it:
- `models/member.server.ts` (invite flow's legacy-role mapping)
- `routes/account.tokens` (PAT default)
- `routes/_app.orgs.$organizationSlug.settings.roles` (Roles page
comparison grid column ordering + plan-tier badges)
- `routes/_app.orgs.$organizationSlug.invite` (role picker)
The Roles page and invite route both pass the IDs through their
loaders rather than referencing them at module top level — which was
the root cause of the "Invite a team member button hard-refreshes the
dashboard" bug: importing a `.server.ts` symbol from client-rendered
code left a dangling client-bundle reference.
Verified: typecheck clean, 162/162 OSS e2e.full, 7/7 cloud
enterprise e2e.full.
SelectLinkItem passes render={<Link>} so the row navigates on click. In
non-Combobox Selects (most use cases — the role picker on the Teams
page, etc.) SelectItem was overriding render to undefined, silently
dropping the Link wrapper. Pass props.render through verbatim when
there's no Combobox; wrap in ComboboxItem only when one's present.
The OSS no longer needs to know individual role names. systemRoles(orgId) returns a plugin-owned, ordered SystemRole[] (id, name, description, available) — the cloud plugin owns the canonical order, the descriptions, and the per-org plan-tier 'available' flag. Hidden roles (Member in v1) are filtered out entirely. OSS callers iterate the array and use array index for the level ladder; no role-name strings except for the legacy OrgMember.role enum mapping shim, which is now isolated to one filter in member.server.ts.
…ker tightening #1 Batch trigger AND semantics (security): `api.v[12].tasks.batch` now uses `everyResource(...)` so a JWT scoped to taskA can no longer submit a batch that also includes taskB / taskC. Added an `everyResource` helper to `apiBuilder` (Symbol-marked wrapper that flips `ability.can` to `every`). Multi-key OR semantics still apply for single-resource arrays (a run carries multiple identifiers). Updated the e2e test to assert AND behaviour. #3 Realtime stream resource (correctness): `findResource` for `realtime.v1.streams.$runId.$streamId` now selects `taskIdentifier`, `runTags`, and `realtimeStreamsVersion` — fields the auth resource builder + handler read but findResource was returning undefined for. #4 projectCreated optional chaining (crash bug): added the missing `?.` between v3Subscription and plan so a missing subscription no longer throws and aborts project creation. #5 RBAC plugin loader logging: distinguish "plugin itself missing" from "plugin found but a transitive dep failed to resolve" by inspecting the ERR_MODULE_NOT_FOUND error message for the plugin's own module specifier. The transitive-dep case now logs at error level (matches the comment's stated behaviour). Removed the orphan log line that contradicted it. #6 account.tokens picker source mismatch: the picker now sources roles from the same plan-tier-filtered list (`systemRoles().filter(available)`) as the default-role calculation. Added server-side roleId revalidation in the create action so a hand-crafted POST can't bind a PAT to an unavailable role.
Six per-feature RBAC changesets were accumulating across the branch. The cumulative effect at release is purely additive — new methods on the controller contract, new accepted shapes on existing methods — so a single patch changeset captures it cleanly without the per-step narrative bleeding into the release notes.
Mirror the changeset consolidation: collapse five per-feature RBAC server-change files into one OSS-friendly entry covering the plugin system + auth/authz consolidation work.
UserRole is the source of truth for cloud; OSS doesn't need the legacy enum to carry the level. Drops the role-name lookup from member.server, which was the last place OSS had to know about specific role names.
Previously the picker filtered to roles strictly below the inviter, which meant Owners couldn't invite other Owners. Switch to at-or-below so peer-level invites (Owner inviting Owner, Admin inviting Admin) work — matches the existing user-facing copy and matches how most team-management UIs behave.
Adds a batch variant of getUserRole that returns a Map<userId, Role | null> in one round-trip. Used by TeamPresenter to drop the N+1 it had been documenting as a future optimisation. Org-scoped only — project-scoped reads still go through getUserRole (less common, not worth complicating the API for).
Adds an optional `group` field to Permission so the plugin labels and orders permission sections rather than the OSS hardcoding a name → group map. Section order on the Roles page now follows the order permissions appear in `allPermissions()` — first group seen, first rendered. Drops the previous PERMISSION_GROUP_BY_NAME / GROUP_ORDER constants from the webapp.
…mantics The 6 session routes merged via PR #3417 were authored against the pre-RBAC apiBuilder API: `authorization.resource` returned shapes like `{ sessions: 'abc' }`, with a parallel `superScopes: [...]` whitelist for broad-scope bypass. Post-TRI-8719, that shape doesn't typecheck and `superScopes` is dead code. Convert each resource callback to the canonical `{ type, id? }` shape. For the two routes whose resource type is `tasks` but whose old superScopes included `<action>:sessions` (list and create), use a multi-key array `[{ type: 'tasks', id }, { type: 'sessions' }]` so a JWT scoped `<action>:sessions` (no id) still passes — preserving the exact allow-set the old superScopes mechanism granted. `*:all` and `admin*` were already handled by the JWT ability's wildcard branches. Drop the now-dead `superScopes` field from all 9 entries. Adds e2e coverage in `auth-api.e2e.full.test.ts` (34 new tests, ~9 sub-describes) that locks in: per-task narrowing, `<action>:sessions` type-only equivalence to the old superScope, `*:all` and `admin*` bypass, wrong-action / wrong-id rejection. Plus a new `seedTestApiSession` helper for inserting Session rows via Prisma — distinct from the existing `seedTestSession` (cookie-session helper for dashboard tests).
Adds RoleBaseAccessController.authenticatePat — PATs identify the user;
the effective ability is min(user's role in target org, max-role cap).
The user's actual membership is the floor (auto-narrows on demotion or
removal); the cap is set at PAT creation as a deliberate ceiling.
OSS-side:
- @trigger.dev/plugins gains the PatAuthResult type + authenticatePat
on the controller interface.
- Fallback validates the PAT (prefix, hashed lookup, revoked check) and
returns a permissive ability — preserves the pre-RBAC behaviour where
PATs were pure user-identity tokens. Self-hosters see no change.
- LazyController delegates with the existing withActionAliases wrapper.
apiBuilder:
- createLoaderPATApiRoute accepts an optional context callback to
derive { organizationId?, projectId? } and an optional authorization
block. When either is declared, rbac.authenticatePat runs and the
ability flows into the handler. Routes that don't opt in stay on the
legacy permissive path.
- api.v1.projects.$projectRef.runs.ts opts in: context resolves
projectRef -> organizationId, authorization is read on type runs.
UI:
- account.tokens picker reframed as 'Maximum role' with a hint
explaining the cap-vs-floor model. Underlying TokenRole storage
unchanged; semantic flip from 'bound role' to 'max role cap'.
The role chosen at PAT creation now actually constrains the token
(previously TokenRole was written but never read at request time).
The workflow pinned pnpm 10.23.0; root package.json declares 10.33.2. pnpm/action-setup v4+ now rejects the conflict (root + workflow are both 'specified versions' as far as the action is concerned), failing the job before it can run any tests. Bump to v5.0.0 of the action and match the 10.33.2 pin already used by unit-tests-webapp.yml and release.yml.
The webapp unit-test config excluded *.e2e.test.ts (smoke matrix) but not *.e2e.full.test.ts. The full auth suite needs a globalSetup that spawns a webapp + Postgres container, which only the dedicated vitest.e2e.full.config.ts provides. CI's unit-test shards were picking up the e2e-full files via the include glob and failing immediately with 'globalSetup didn't provide baseUrl/databaseUrl'.
Five real-bug fixes from CodeRabbit + Devin review of #3499: #1 personalAccessToken.server.ts: FALLBACK_NOT_INSTALLED_ERROR string was 'RBAC fallback not installed' but the OSS fallback actually returns 'RBAC plugin not installed'. The mismatch made every PAT creation with a roleId hit the compensating-delete branch on self-hosters with no plugin installed — including the dashboard PAT-creation flow. Aligns the constant with the canonical string. #2 internal-packages/rbac/src/fallback.ts: authenticateBearer skipped the revoked-API-key grace window (RevokedApiKey table), so a freshly-rotated env API key would 401 immediately on the new auth path. Mirrors findEnvironmentByApiKey's fallback-to-RevokedApiKey logic so the auth-cross-cutting e2e tests pass. #3 api.v1.query.ts: multi-table queries built a plain RbacResource array, which checkAuth treats as 'any element passes'. A JWT scoped to one detected table could submit a query that also reads another table it isn't scoped to. Wrap with everyResource — same AND-semantics fix as the batch trigger routes. #4 account.tokens/route.tsx: defaultRoleId could land on a custom or plan-blocked role when userRoleId wasn't in the picker's assignable set. The action's submit-revalidation would then 400 until the user manually changed the dropdown. Clamp the default to roles the picker actually renders. #5 settings.team/route.tsx: the role Select used defaultValue, so a failed set-role submit left the attempted role visible while the server kept the old one. Switch to a controlled value bound to currentRoleId.
Locks in the everyResource(...) wrap added in 5547e66. Two new tests: JWT scoped to one of multiple detected tables → 403; JWT scoped to all detected tables → auth passes. Mirrors the every-task lock-in for the batch trigger routes.
…-resource sites
Bare RbacResource[] in `authorization.resource` is now a type error.
Multi-resource auth must wrap with one of:
- anyResource(...): succeed if any element passes (the existing default;
used when one record carries multiple identifiers — runs by friendlyId
/ batch / tags / task — so a JWT scoped to any one grants access)
- everyResource(...): succeed only if every element passes (existing
helper; used by batch operations and the multi-table query route)
The OR-loophole class of bug CodeRabbit caught on api.v1.query — a JWT
scoped to one of N detected tables was authorized for the whole multi-
table query — was patchable per-route with everyResource. The Symbol
marker stayed invisible: future authors would still default to bare
arrays. Tightening AuthResource flushed out 11 routes that were silently
on the OR path; each is wrapped explicitly now.
Also inline the unrelated private `anyResource` helper in
internal-packages/rbac/src/ability.ts so the public name is unambiguous.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| | { ok: false; status: 401; error: string } | ||
| | { ok: true; authentication: ApiAuthenticationResultSuccess; ability: RbacAbility } | ||
| > { | ||
| const result = await rbac.authenticateBearer(request, { allowJWT }); | ||
| if (!result.ok) { | ||
| return { ok: false, status: 401, error: result.error }; |
There was a problem hiding this comment.
🟡 authenticateRequestForApiBuilder hardcodes HTTP 401, discarding a potential 403 from the RBAC layer
The authenticateRequestForApiBuilder bridge function at apiBuilder.server.ts:61 always returns status: 401 when the RBAC result is ok: false, even though BearerAuthResult (defined in packages/plugins/src/rbac.ts:96) allows status: 401 | 403. The return type is explicitly narrowed to status: 401 at line 56. A plugin's authenticateBearer could legitimately return 403 for cases like a suspended account or IP-blocked request — the wrapper silently converts that to 401, which is semantically wrong (401 = "who are you?" vs 403 = "you're not allowed") and would confuse debugging and client retry logic.
| | { ok: false; status: 401; error: string } | |
| | { ok: true; authentication: ApiAuthenticationResultSuccess; ability: RbacAbility } | |
| > { | |
| const result = await rbac.authenticateBearer(request, { allowJWT }); | |
| if (!result.ok) { | |
| return { ok: false, status: 401, error: result.error }; | |
| | { ok: false; status: 401 | 403; error: string } | |
| | { ok: true; authentication: ApiAuthenticationResultSuccess; ability: RbacAbility } | |
| > { | |
| const result = await rbac.authenticateBearer(request, { allowJWT }); | |
| if (!result.ok) { | |
| return { ok: false, status: result.status, error: result.error }; |
Was this helpful? React with 👍 or 👎 to provide feedback.
| const [scopeAction, scopeType, scopeId] = scope.split(":"); | ||
| if (scopeAction === "admin") return true; | ||
| if (scopeAction !== action && scopeAction !== "*") return false; |
There was a problem hiding this comment.
🔴 buildJwtAbility treats any admin:TYPE scope as a universal wildcard, broadening access beyond old behavior
In ability.ts:30, the JWT ability builder checks scopeAction === "admin" and immediately returns true for any action/resource. Because the scope is split on :, a scope like admin:sessions produces scopeAction = "admin" — making it equivalent to the bare admin scope (universal access). In the old checkAuthorization system (authorization.server.ts:74-78), admin:sessions was treated as an opaque superScope string that only matched routes explicitly listing it, so a JWT with scope admin:sessions could NOT access routes with superScopes: ["read:runs", "admin"]. Post-migration, that same JWT passes auth on every route. While the practical likelihood of a customer minting JWTs with admin:TYPE scopes (rather than plain admin) is low, this is a silent privilege escalation for any such tokens.
Example of the behavioral difference
Old: JWT {scopes: ["admin:sessions"]} on GET /api/v1/runs → superScopes ["read:runs","read:all","admin"] → "admin:sessions" not in superScopes → denied (403)
New: same JWT → buildJwtAbility(["admin:sessions"]) → scopeAction === "admin" → returns true → authorized
Prompt for agents
The `buildJwtAbility` function in internal-packages/rbac/src/ability.ts splits scopes on `:` and checks whether `scopeAction === "admin"` at line 30, returning true unconditionally. This means any scope starting with `admin:` (e.g. `admin:sessions`, `admin:tasks`) is treated as a universal wildcard, equivalent to plain `admin`. This is broader than the old superScopes behavior where `admin:sessions` only matched routes that explicitly listed it.
To fix: the `admin` wildcard check should only fire when the scope is exactly the string `admin` (single segment, no type/id parts). When the scope has additional segments like `admin:sessions`, it should be treated like a normal scope where `admin` is the action and `sessions` is the type — i.e., it grants `admin` action on `sessions`-type resources only. One approach: check `if (scopeAction === "admin" && !scopeType) return true;` to limit the wildcard to plain `admin`. Then let the normal type/id matching logic handle `admin:sessions` as action=admin, type=sessions.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if (isEveryResource(resource)) { | ||
| return resource.resources.every((r) => ability.can(action, r)); | ||
| } |
There was a problem hiding this comment.
🚩 everyResource([]) is a vacuous-truth authorization bypass
In apiBuilder.server.ts:160-161, checkAuth delegates to resource.resources.every(...). JavaScript's Array.prototype.every returns true for empty arrays (vacuous truth). This means everyResource([]) passes auth for ANY JWT, even one with zero scopes. The batch trigger routes (api.v1.tasks.batch.ts, api.v2.tasks.batch.ts) build the everyResource array from body.items, so an empty items array produces everyResource([]). In practice this isn't exploitable because: (1) body parsing happens before auth in createActionApiRoute, and the Zod schema may reject empty arrays; (2) even if auth passes, the handler explicitly returns 400 for empty batches. However, the auth layer "leaking" a 400 instead of 403 for empty batches is a minor information disclosure (confirms the route exists and accepts batches). Consider adding a guard: if (resource.resources.length === 0) return false;.
Was this helpful? React with 👍 or 👎 to provide feedback.
| let env = await this.prisma.runtimeEnvironment.findFirst({ | ||
| where: { apiKey: rawToken }, | ||
| include, | ||
| }); |
There was a problem hiding this comment.
🚩 Fallback controller does not check branch-name resolution for API keys
The original findEnvironmentByApiKey in models/runtimeEnvironment.server.ts accepts a branchName parameter and resolves branch environments via childEnvironments includes. The new fallback's authenticateBearer (internal-packages/rbac/src/fallback.ts:88-91) looks up the environment by apiKey alone without branch resolution. However, this is mitigated because authenticateRequestForApiBuilder at apiBuilder.server.ts:67 calls findEnvironmentById(result.environment.id) which loads the full environment shape but also doesn't resolve branches. Branch resolution was handled by the old authenticateApiKey function via the x-trigger-branch header, and this PR doesn't change the header extraction. Worth verifying that branch-based routing still works end-to-end if it's used.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Consolidates the webapp's authentication and authorization into a small set of route helpers, replacing the ad-hoc
requireUser/requireUserId/authenticatedEnvironmentForAuthenticationcalls scattered across routes. Same security model, but the per-request flow (authenticate → authorize → load) now lives in one place per route family.Adds a comprehensive end-to-end auth test suite that didn't exist before — 162 tests covering API key, PAT and JWT auth across the public API surface, plus dashboard session auth for admin pages.
Changes
Dashboard auth (started, partial rollout)
Admin and settings pages migrated to a unified loader/action helper that authenticates the session, runs an authorization check, and exposes the result to the route. Other dashboard routes still on the old pattern; remaining migration tracked separately.
Migrated routes:
admin.*(14 admin / back-office / feature-flags / LLM-models / notifications / orgs / concurrency pages)_app.orgs.$organizationSlug.settings.team_app.orgs.$organizationSlug.settings.rolesAPI / realtime / engine auth (complete for the migrated families)
71 routes migrated to a unified
apiBuilderthat centralizes Bearer / PAT / Public-JWT authentication and applies the per-route authorization check before the handler runs. Includes:api.v1.*andapi.v2.*andapi.v3.*— tasks, runs, batches, queues, prompts, deployments, query, sessions, waitpoints, packets, workers, idempotency keysrealtime.v1.*— runs, batches, sessions, streamsengine.v1.*— dev / worker-action protocolsSide effect: action aliases preserved historic JWT scope semantics where the new model is stricter (e.g. a
write:tasksJWT now also satisfiestrigger/batchTrigger/updateactions on the same resource — matched at the auth boundary, not in the route handler).Auth test suite (new —
*.e2e.full.test.ts)162 e2e tests run against a real spawned webapp + Postgres (no mocks). Coverage matrix:
Test plan
pnpm run typecheck --filter webappcleanpnpm exec vitest run --config apps/webapp/vitest.e2e.full.config.ts— 162/162 pass