RAXE-2026-054: Paperclip Agent Runtime and Tenant Boundary Collapse: Multi-Advisory Disclosure Burst

At a glance

The issue: Eleven vulnerabilities have been disclosed against Paperclip AI, a platform for running AI agents. Four are critical, and they chain together to let an unauthenticated attacker take over another tenant's agents from a fresh account.
Who's affected: Anyone running Paperclip AI (paperclipai CLI or @paperclipai/server) before version 2026.416.0, especially multi-tenant deployments reachable from the network.
What to do now: Upgrade all four @paperclipai/* npm packages to 2026.416.0 in a single coordinated step.

Executive Summary

Eleven GitHub-reviewed advisory IDs covering eight distinct failure primitives have been published against the Paperclip AI agent-orchestration platform in a single disclosure window (GHSA-47wq-cj9q-wpmp and GHSA-3xx2-mqjm-hg9x describe the same /agents/:id/keys primitive, see Primitive B, and may be overlapping reports). Four of the eleven advisory IDs carry critical ratings (CVSS 9.8-10.0), and the cluster collapses into a coherent story: the authorisation checks on Paperclip's agent-management APIs stop at the role (assertBoard) without extending to the tenant boundary (assertCompanyAccess), while several agent-supplied configuration fields are executed directly by the server host as shell commands. A short path from "freshly signed-up board user" to "remote code execution inside any victim tenant's agent context" is composable with standard reconnaissance from the advisory-disclosed mechanics, though no single advisory describes that composition end-to-end.

What to do: Upgrade every Paperclip package to 2026.416.0. Early advisory/cache data for GHSA-68qg cited 2026.410.0, but no stable 2026.410.0 npm release was published; current GHSA/NVD records and the npm registry converge on 2026.416.0 as the stable fixed version.

Risk Rating

Dimension	Rating	Detail
Severity	CRITICAL	Four of eleven advisory IDs at CVSS 9.8-10.0; cross-tenant RCE reachable from a fresh account on a publicly exposed instance
Urgency	HIGH	All critical fixes shipped at `2026.416.0`; patched versions available now
Scope	Product-wide	Four npm packages affected (`paperclipai`, `@paperclipai/server`, `@paperclipai/shared`, `@paperclipai/ui`)
Confidence	High	All eleven advisories GitHub-reviewed; RAXE assessment corroborated via direct repository security-advisory API query
Business Impact	High	Cross-tenant compromise in multi-tenant deployments; host RCE in default desktop mode

Affected Products

Package	Affected versions	Fixed version	Advisories
`@paperclipai/server`	`< 2026.416.0`	`2026.416.0`	GHSA-68qg, -47wq, -3xx2, -vr7g, -265w, -xfqj, -w8hx, -p7mm
`@paperclipai/shared`	`< 2026.416.0`	`2026.416.0`	GHSA-3pw3
`@paperclipai/ui`	`< 2026.416.0`	`2026.416.0`	GHSA-fpw4
`paperclipai`	`< 2026.416.0`	`2026.416.0` (see §Version Discrepancy)	GHSA-68qg, -gqqj

Am I Affected?

Check if you run Paperclip AI: any deployment of the paperclipai CLI or the @paperclipai/server control plane.
Check your version: npm ls -g paperclipai and npm ls -g @paperclipai/server (or inspect your container image tag).
Check your deployment mode: local_trusted (desktop default, zero authentication on the HTTP API) or authenticated (multi-user, multi-tenant). Some advisories reach exploitation in both modes; one is exploitable without credentials in local_trusted.
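The version check above can be automated. The following is a minimal sketch, assuming version strings follow the `YYYY.build.patch[-prerelease]` shape seen in this report; `parseVersion` and `isAffected` are hypothetical helper names, not part of any Paperclip package.

```typescript
// Assumption: versions look like "2026.416.0" or "2026.410.0-canary.1".
const FIXED_VERSION = "2026.416.0";

interface ParsedVersion {
  core: number[];      // numeric components, e.g. [2026, 416, 0]
  prerelease: boolean; // true when a "-suffix" such as "-canary.1" is present
}

function parseVersion(version: string): ParsedVersion {
  const trimmed = version.trim().replace(/^v/, "");
  const dash = trimmed.indexOf("-");
  const corePart = dash === -1 ? trimmed : trimmed.slice(0, dash);
  const core = corePart.split(".").map((part) => Number.parseInt(part, 10));
  if (core.length !== 3 || core.some((n) => Number.isNaN(n))) {
    throw new Error("Unrecognised version string: " + version);
  }
  return { core, prerelease: dash !== -1 };
}

// Returns true when `installed` sorts before the fixed version.
function isAffected(installed: string, fixed: string = FIXED_VERSION): boolean {
  const a = parseVersion(installed);
  const b = parseVersion(fixed);
  for (let i = 0; i < 3; i++) {
    const x = a.core[i] ?? 0;
    const y = b.core[i] ?? 0;
    if (x !== y) return x < y;
  }
  // Same numeric version: a pre-release build sorts before the stable release.
  return a.prerelease && !b.prerelease;
}
```

Feed it the version string reported by `npm ls -g` or taken from the container image tag. Canary builds of the fixed version are treated as affected because they precede the stable release.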

Abstract

This report consolidates eleven GitHub-reviewed advisory IDs against Paperclip AI published in a single disclosure window and covering eight distinct failure primitives (two advisories, GHSA-47wq-cj9q-wpmp and GHSA-3xx2-mqjm-hg9x, describe the same primitive and may be overlapping reports; see Primitive B). Four carry critical ratings (CVSS 9.8-10.0). The cluster describes two recurring architectural failures: authorisation checks that assert the caller's role but not the caller's tenant, and agent-supplied configuration fields that are executed directly by the server host as shell commands. The disclosure set additionally documents open deployment defaults, unauthenticated administrative endpoints, an unfixed cross-runtime connector inheritance issue involving a host's ChatGPT/OpenAI Gmail connector, a stored cross-site scripting sink in the shared Markdown renderer, and an approval-attribution integrity failure. Remediation is a single coordinated upgrade to 2026.416.0 across four npm packages; operators should target 2026.416.0 even where older downstream mirrors still mention the unreleased 2026.410.0 stable version.

Key Findings

1. Cross-tenant RCE is composable in six HTTP calls

A fresh account on any network-reachable Paperclip instance running in authenticated mode with default configuration can reach remote code execution inside another tenant's agent context by composing four advisories (GHSA-68qg + GHSA-47wq/GHSA-3xx2 + GHSA-265w) with no user interaction and no pre-existing credentials.

2. Agent-supplied config becomes host shell commands

Three distinct advisories (GHSA-vr7g, GHSA-265w, GHSA-w8hx) describe the same architectural failure mode: a configuration field reachable from an agent or workspace API ends up in spawn(shell, ["-c", input]) on the Paperclip server host. The consistent root cause across all three sinks is the absence of allowlisting or escaping between agent-reachable configuration and the shell invocation.

3. The 2026.410.0 patch did not finish the job

GHSA-47wq explicitly notes that the earlier 2026.410.0 patch for the unauthenticated-RCE chain did not cover the /agents/:id/keys handlers, which have the same class of missing assertCompanyAccess check on a different handler family. The complete fix required a second patch round at 2026.416.0.

4. The Gmail-connector inheritance advisory is unfixed and unreproduced

GHSA-gqqj-85qm-8qhf reports that a Paperclip-managed codex_local runtime was able to access a Gmail connector the reporter had configured in the ChatGPT/OpenAI apps user interface, without the reporter having connected Gmail inside Paperclip. The GitHub Advisory Database lists no patched version. RAXE has not independently reproduced this behaviour.

Attack Flow

The chain below is composable from the advisory-disclosed mechanics of four advisory IDs (GHSA-68qg, GHSA-47wq/GHSA-3xx2, GHSA-265w). No single advisory describes it end-to-end. Step 4 presupposes knowledge of a victim agent.id (a UUID); GHSA-xfqj enumerates one unauthenticated reconnaissance endpoint in authenticated mode (/api/heartbeat-runs/:runId/issues), but the advisory text does not guarantee that this endpoint discloses agent UUIDs specifically rather than issue UUIDs. Legitimate cross-tenant discovery surfaces (if exposed) and post-sign-up enumeration are the most likely bridges; the advisories do not specify. Network reachability to the control-plane API is also a prerequisite. Whether Paperclip's default port 3100 is publicly exposed is a deployment choice, and RAXE has no Shodan/Censys data on prevalence.

  Starting state -- Unauth attacker, network-reachable
           │
           │  (1) POST /api/auth/sign-up/email  → account
           │      Default: open sign-up, email verification hardcoded off
           │                                              (GHSA-68qg flaw 1)
           ▼
  Stage 1 -- Authenticated board user, 0 company memberships
           │
           │  (2) POST /api/cli-auth/challenges            → token
           │  (3) POST /api/cli-auth/challenges/<id>/approve
           │      Self-approval is not rejected            (GHSA-68qg flaw 2)
           ▼
  Stage 2 -- Board user with boardApiToken
           │
           │  (4) POST /api/agents/<victim-id>/keys
           │      assertBoard yes; assertCompanyAccess NO  (GHSA-47wq/3xx2)
           │      Returns plaintext pcp_* token bound to VICTIM companyId
           ▼
  Stage 3 -- Effective agent actor in victim tenant
           │
           │  (5) PATCH /api/agents/<victim-id>
           │      Body: {adapterConfig:{workspaceStrategy:
           │             {provisionCommand: "<attacker shell>"}}}
           │      Schema accepts unconstrained adapterConfig  (GHSA-265w)
           │
           │  (6) POST /api/agents/<victim-id>/wakeup
           │      Server: spawn("/bin/sh",["-c", attacker shell])
           ▼
  Final state -- RCE on Paperclip server host in victim tenant's agent context

In local_trusted mode the chain is shorter: the cleanupCommand injection (GHSA-vr7g) is reachable unauthenticated in five HTTP calls, entirely skipping the sign-up and key-mint stages.

Technical Details

The eleven advisories collapse into eight primitives. This section is organised by primitive; each primitive lists the advisory IDs that comprise it and the disclosed source-level root cause.

Primitive A, Unauthenticated RCE chain (GHSA-68qg, CVSS 10.0)

A four-flaw composition reachable in six API calls against an instance running authenticated mode with default configuration. In order:

Open sign-up. server/src/config.ts:169-173 defaults authDisableSignUp to false. The PAPERCLIP_AUTH_DISABLE_SIGN_UP environment variable exists but is not documented in the deployment guide.
Hardcoded-off email verification. server/src/auth/better-auth.ts:89-93 sets requireEmailVerification: false at source. Accounts are usable immediately.
CLI-auth challenge self-approval. server/src/routes/access.ts:1638-1659 accepts challenge-creation requests with no actor check. The matching approval handler at lines 1687-1704 requires a board session but does not reject the case of approver equalling creator. The attacker approves their own challenge and obtains a persistent boardApiToken.
Import-path authorisation bypass. The direct POST /api/companies route requires instance_admin; the import endpoint does not. A newly signed-up board user deploys an agent inside a company they do not own.

CWEs: CWE-287 (Improper Authentication), CWE-862 (Missing Authorization), CWE-1188 (Insecure Default Initialization).
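The missing check in the third flaw can be illustrated with a minimal sketch. It assumes the challenge records the user who created it; the `CliAuthChallenge` and `BoardActor` shapes and the `canApproveChallenge` name are hypothetical and are not taken from the Paperclip source.

```typescript
// Hypothetical shapes; the real handler's data model is not described in the advisory.
interface CliAuthChallenge {
  id: string;
  createdByUserId: string;
}

interface BoardActor {
  type: "board";
  userId: string;
}

type ApprovalCheck =
  | { allowed: true }
  | { allowed: false; reason: string };

// Reject the case the advisory says was accepted: approver equals creator.
function canApproveChallenge(
  challenge: CliAuthChallenge,
  approver: BoardActor,
): ApprovalCheck {
  if (challenge.createdByUserId === approver.userId) {
    return { allowed: false, reason: "approver must differ from challenge creator" };
  }
  return { allowed: true };
}
```

Whether the shipped fix takes this form is not stated in the advisory text; the sketch only shows the class of check whose absence enables self-approval.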

Primitive B, Cross-tenant `/agents/:id/keys` tenancy-boundary collapse (GHSA-47wq + GHSA-3xx2, CVSS 10.0)

These two advisories cite the same three handlers at the same source location (server/src/routes/agents.ts:2050-2087) with the same missing-check root cause. They may be overlapping reports (both are authored under the Paperclip CNA's GHSA repository advisory path) rather than distinct root causes. The three handlers (GET, POST, and DELETE /agents/:id/keys) call only assertBoard(req), which checks req.actor.type === "board". They never call assertCompanyAccess(req, agent.companyId). The handler 12 lines below (POST /agents/:id/wakeup) shows the correct pattern: fetch the agent first, then scope-check. The three /keys handlers do not even fetch the agent.

The service layer (server/src/services/agents.ts:580-629) binds the minted token to the victim agent's companyId. After mint, every assertCompanyAccess inside the victim tenant succeeds for the attacker's bearer token.

GHSA-47wq explicitly notes that the earlier 2026.410.0 patch for GHSA-68qg did not cover this handler family: the same class of missing check, on a different handler.

CWEs: CWE-285, CWE-639, CWE-862, CWE-1220.
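The "fetch the agent first, then scope-check" pattern described for this primitive can be sketched as follows. This is an illustration under stated assumptions: the advisory names assertBoard and assertCompanyAccess but does not give their bodies, so the versions below are simplified stand-ins that take an actor rather than a request, and `Actor`, `Agent`, `HttpError`, and `authorizeKeyAccess` are hypothetical.

```typescript
// Simplified, hypothetical stand-ins for the real request and data types.
interface Actor {
  type: "board" | "agent";
  companyIds: string[]; // companies the caller is a member of
}

interface Agent {
  id: string;
  companyId: string;
}

class HttpError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Role check only: this is all the vulnerable /keys handlers performed.
function assertBoard(actor: Actor): void {
  if (actor.type !== "board") throw new HttpError(403, "board actor required");
}

// Tenant check: the caller must belong to the company that owns the resource.
function assertCompanyAccess(actor: Actor, companyId: string): void {
  if (!actor.companyIds.some((id) => id === companyId)) {
    throw new HttpError(403, "no access to this company");
  }
}

// Role check, then fetch, then tenant check, before any key is read or minted.
function authorizeKeyAccess(
  actor: Actor,
  agentId: string,
  agents: Map<string, Agent>,
): Agent {
  assertBoard(actor);
  const agent = agents.get(agentId);
  if (agent === undefined) throw new HttpError(404, "agent not found");
  assertCompanyAccess(actor, agent.companyId);
  return agent;
}
```

With this ordering, a board user with zero company memberships receives a 403 for any agent ID, so the token-minting step is never reached.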

Primitive C, Agent-to-host OS command execution (three advisories)

Three advisories share one root cause: spawn(shell, ["-c", input]) over agent- or workspace-controlled strings.

cleanupCommand injection (GHSA-vr7g, CVSS 9.8). PATCH /api/execution-workspaces/:id accepts an unvalidated config.cleanupCommand field. On workspace archive, server/src/services/workspace-runtime.ts (~line 738) executes each command via spawn(shell, ["-c", command]). Shell resolution uses process.env.SHELL or falls back to "sh". Per advisory text, on Windows Paperclip's Git prerequisite supplies sh.exe, so the injection is cross-platform (RAXE has not independently tested Windows reproduction). Unauthenticated in local_trusted mode.
workspaceStrategy.provisionCommand (GHSA-265w, CVSS 8.8). PATCH /api/agents/:id accepts adapterConfig: z.record(z.unknown()), an unconstrained object schema. Any agent-API-key holder writes adapterConfig.workspaceStrategy.provisionCommand; during provisioning, the server runs spawn("/bin/sh", ["-c", command]). Agent-runtime boundary collapses into host execution.
Malicious-skill workspace-runtime execution (GHSA-w8hx, CVSS 7.3). A malicious skill loaded into an agent invokes the workspace runtime service feature, exposing the server process's environment variables including API keys, JWT secrets, and database credentials.

CWEs: CWE-78 (OS Command Injection), CWE-77 (Command Injection).
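One way to close the gap between agent-reachable configuration and execution is to stop accepting free-form shell strings and accept only a structured command that is validated against an allowlist. The sketch below illustrates that idea; the allowlist entries, the `CommandSpec` shape, and `validateCommandSpec` are hypothetical and do not describe how Paperclip implemented its fix.

```typescript
// Hypothetical allowlist; a real deployment would choose its own entries.
const ALLOWED_EXECUTABLES: readonly string[] = ["git", "npm", "pnpm"];

// Characters that carry meaning for a POSIX shell. Rejecting them is defence
// in depth: the validated command is meant to run WITHOUT a shell.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\\n\r]/;

interface CommandSpec {
  executable: string; // bare program name, compared against the allowlist
  args: string[];     // passed as an argv array, never concatenated
}

// Returns a list of problems; an empty list means the spec passed validation.
function validateCommandSpec(spec: CommandSpec): string[] {
  const problems: string[] = [];
  if (!ALLOWED_EXECUTABLES.some((name) => name === spec.executable)) {
    problems.push("executable not on allowlist: " + spec.executable);
  }
  for (const arg of spec.args) {
    if (SHELL_METACHARACTERS.test(arg)) {
      problems.push("argument contains shell metacharacter: " + arg);
    }
  }
  return problems;
}
```

A spec that passes validation would then be handed to a shell-less process launch, for example Node's `child_process.spawn(executable, args)` with its default `shell: false`, rather than to `sh -c`.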

Primitive D, `codex_local` cross-connector credential inheritance (GHSA-gqqj, CVSS 8.7)

This section describes an unreproduced advisory report, not a RAXE-verified finding. The GitHub-reviewed advisory, based on private reporter evidence that RAXE has not independently reproduced, reports that a Paperclip-managed codex_local runtime was able, in the reporter's environment, to access and use a Gmail connector the reporter had configured in the ChatGPT/OpenAI apps UI, without the reporter having connected Gmail inside Paperclip or separately inside Codex. Per the reporter's account, in that specific environment this enabled mailbox access and a real outbound email.

The advisory is a behavioural report in one reporter's environment, not a demonstration of an OpenAI-side API vulnerability, a Codex-side credential store issue, or a cross-tenant leak in a multi-tenant Paperclip instance. The reported mechanism, a Paperclip-managed codex_local runtime accessing host-level OAuth state configured by another desktop application, is consistent with several root causes including host-process session reuse, environment-variable inheritance, or shared-filesystem credential caches. None of those root causes can be attributed without reproduction.

No fixed version is listed in the GitHub Advisory Database and there is no package upgrade that closes this issue. Operational mitigation is limited: do not colocate codex_local-launching Paperclip deployments with host OS user accounts that hold OAuth sessions for other AI desktop applications.

CWE: CWE-284 (Improper Access Control).

Primitive E, Unauthenticated endpoints in `authenticated` mode (GHSA-xfqj, CVSS 8.3)

Several API endpoints in authenticated mode accept requests with no account, no session, and no API key, and either return sensitive data or perform state-changing operations. The advisory enumerates GET /api/heartbeat-runs/:runId/issues as one unauthenticated data-read example. Practical impact: reconnaissance support for the other primitives. The advisory set shows unauthenticated metadata exposure and API-shape discovery, but does not guarantee direct disclosure of a victim agent.id; that bridge remains a prerequisite for the composed cross-tenant chain above.

CWE: CWE-306 (Missing Authentication for Critical Function).

Primitive F, Agent-controlled arbitrary file read (GHSA-3pw3, CVSS 6.5)

An agent-API-key holder writes adapterConfig.instructionsFilePath to any path on the Paperclip host filesystem. The server runtime reads this path and returns its contents through the agent's instructions pipeline. Fixed at @paperclipai/shared 2026.416.0.

CWE: CWE-73 (External Control of File Name or Path).
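A common defence against this class of issue is to resolve the requested path against a fixed base directory and refuse anything that escapes it. The sketch below is a lexical check only, assuming POSIX-style paths; `resolveWithinBase` and the base directory in the usage note are hypothetical, and the approach does not account for symbolic links, which would need a separate real-path check.

```typescript
// Returns the confined absolute path, or null when the request must be refused.
function resolveWithinBase(baseDir: string, requested: string): string | null {
  if (requested.length === 0) return null;
  // Refuse NUL bytes and backslashes outright (the latter to avoid
  // Windows-style traversal such as "..\\..\\secret").
  if (requested.indexOf("\0") !== -1 || requested.indexOf("\\") !== -1) return null;
  // Absolute paths are never interpreted relative to the base, so refuse them.
  if (requested.charAt(0) === "/") return null;

  const stack: string[] = [];
  for (const segment of requested.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (stack.length === 0) return null; // would climb above baseDir
      stack.pop();
    } else {
      stack.push(segment);
    }
  }
  if (stack.length === 0) return null; // resolves to baseDir itself, not a file
  return baseDir.replace(/\/+$/, "") + "/" + stack.join("/");
}
```

For example, with a base of `/srv/paperclip/instructions`, the request `agent/prompt.md` resolves inside the base, while `../../etc/passwd` and `/etc/passwd` are both refused.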

Primitive G, Stored XSS via `urlTransform` override (GHSA-fpw4, CVSS 5.4)

MarkdownBody, the shared Markdown renderer used across Paperclip's UI, including issue documents, comments, chat threads, approvals, agent details, and export previews, passes urlTransform={(url) => url} to react-markdown. This identity function replaces react-markdown's defaultUrlTransform, which is the library's only built-in defence against javascript:, vbscript:, and data: URLs. A payload of the form [Click me](javascript:alert(document.domain)) embedded in any Markdown surface fires on click in the viewing user's browser context. Because approvals and chat threads are cross-user surfaces by design, this is a stored XSS, not reflected.

CWE: CWE-79 (Cross-site Scripting).
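The simplest remediation for an identity urlTransform is to remove the override so the library default applies. Where a custom transform is genuinely needed, it should keep a scheme allowlist. The sketch below shows one such transform; `safeUrlTransform` and the `SAFE_PROTOCOLS` list are hypothetical, and this is not a copy of react-markdown's defaultUrlTransform.

```typescript
// Hypothetical allowlist of URL schemes considered safe to render as links.
const SAFE_PROTOCOLS: readonly string[] = ["http", "https", "mailto"];

// Returns the URL unchanged when it is relative or uses an allowed scheme,
// and an empty string otherwise.
function safeUrlTransform(url: string): string {
  // Browsers ignore some control characters and whitespace inside a scheme
  // (e.g. "java\tscript:"), so strip them before inspecting it.
  const probe = url.replace(/[\u0000-\u0020\u007f]/g, "");
  const colon = probe.indexOf(":");
  if (colon === -1) return url; // no scheme at all: relative URL

  // A colon that appears after "/", "?" or "#" belongs to the path, query,
  // or fragment, not to a scheme.
  const firstDelimiter = probe.search(/[\/?#]/);
  if (firstDelimiter !== -1 && firstDelimiter < colon) return url;

  const scheme = probe.slice(0, colon).toLowerCase();
  return SAFE_PROTOCOLS.some((allowed) => allowed === scheme) ? url : "";
}
```

Passed as the `urlTransform` prop in place of the identity function, a transform of this kind would blank the `javascript:` link from the example payload while leaving ordinary `https:` and relative links intact.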

Primitive H, Approval attribution spoofing (GHSA-p7mm, CVSS 4.3)

The approval-resolution endpoints (POST /approvals/:id/approve|reject|request-revision) accept a client-supplied decidedByUserId field in the request body and write it verbatim into the authoritative approvals.decidedByUserId column with no cross-check against the authenticated actor. Any board user in the approval's company can record the decision as having been made by any other user. This is a provenance-integrity failure rather than an access-boundary failure. Downstream automations that rely on decidedByUserId (illustrative examples include deployment gates, spend authorisations, and PII-disclosure approvals) may, if present in a customer's deployment, have a falsifiable audit trail; the advisory does not enumerate which downstream consumers actually exist in Paperclip deployments.

The fix drops the client-supplied field and substitutes req.actor.userId server-side.

CWE: CWE-345 (Insufficient Verification of Data Authenticity).
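The shape of that fix can be sketched as follows: the decision record takes its attribution from the authenticated actor and ignores whatever the client sent. The type names and `buildDecisionRecord` are hypothetical illustrations rather than the actual Paperclip code.

```typescript
type Decision = "approve" | "reject" | "request-revision";

// Hypothetical request body: the client may still send decidedByUserId,
// but it is never trusted.
interface ApprovalRequestBody {
  note?: unknown;
  decidedByUserId?: unknown;
}

interface AuthenticatedActor {
  userId: string;
}

interface DecisionRecord {
  approvalId: string;
  decision: Decision;
  decidedByUserId: string;
  note: string | null;
}

function buildDecisionRecord(
  approvalId: string,
  decision: Decision,
  body: ApprovalRequestBody,
  actor: AuthenticatedActor,
): DecisionRecord {
  return {
    approvalId,
    decision,
    // Attribution comes from the session, never from the request body.
    decidedByUserId: actor.userId,
    note: typeof body.note === "string" ? body.note : null,
  };
}
```

Because the record is built from an explicit list of fields rather than by spreading the request body, a spoofed `decidedByUserId` has no path into the stored row.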

Confidence & Validation

Assessment Confidence: High.

Aspect	Status	Detail
Vendor Advisory	Published (×11)	All eleven are GitHub-reviewed via the Paperclip repository security-advisory API
CVE Assigned	Partial	Two CVEs have been assigned as of 2026-04-27: `CVE-2026-41679` for GHSA-68qg and `CVE-2026-41208` for GHSA-265w. The original 2026-04-19 validation snapshot had 0/11 CVEs.
PoC Available	Partially public	GHSA-68qg's reporter cites a fully automated PoC and video; GHSA-vr7g includes three independent reproductions; the other advisories include source-level root-cause detail
Patch Available	Yes (10/11)	Ten advisories fixed at `2026.416.0`; GHSA-gqqj has no listed fix
Exploited in Wild	Not observed	No public reports of exploitation in the wild as of 2026-04-19

Detection Signatures

Six detection rules in Sigma format are published alongside this report (detection/paperclip-cluster.yml). Summary of coverage:

Rule ID	Primitive	Level
`5dbf08a1-…-054`	CLI-auth challenge self-approval (GHSA-68qg)	high
`5dbf08a1-…-055`	Cross-tenant `/agents/:id/keys` (GHSA-47wq/3xx2)	critical
`5dbf08a1-…-056`	`cleanupCommand` injection (GHSA-vr7g)	critical
`5dbf08a1-…-057`	`provisionCommand` injection (GHSA-265w)	high
`5dbf08a1-…-058`	`instructionsFilePath` file read (GHSA-3pw3)	high
`5dbf08a1-…-059`	Approval attribution spoofing (GHSA-p7mm)	medium

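Four advisories have no rule in this set. GHSA-xfqj does not appear in the coverage table above. The other three are not covered by web-log Sigma for the following reasons: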

GHSA-gqqj has no network-reachable payload to detect.
GHSA-w8hx (malicious skill) is indistinguishable from normal skill invocation at the HTTP layer; detection requires host-process or skill-audit coverage.
GHSA-fpw4 is a client-side render issue; detection at the server log layer would require Markdown-content inspection on every write, which is better addressed by the fix.

Detection & Mitigation

Primary mitigation

Upgrade to 2026.416.0 across all four packages:

npm update -g paperclipai @paperclipai/server @paperclipai/shared @paperclipai/ui
# Pin at 2026.416.0 in Docker tags or lockfiles until confirmed clean

A single coordinated upgrade closes every advisory with a published fix.

Hardening for deployments still on a vulnerable version

Do not expose the HTTP API (default port 3100) to untrusted networks.
In authenticated mode, set PAPERCLIP_AUTH_DISABLE_SIGN_UP=true in the process environment. This closes the account-creation step of the GHSA-68qg chain. Per the advisory text, the setting is functional but not documented in the deployment guide.
Do not colocate codex_local-launching Paperclip instances with host OS user accounts that hold OAuth sessions for other AI desktop applications.

Operational detections

Alert on any adapterConfig mutation originating from an agent-type actor; legitimate adapter-configuration changes are platform-issued.
Enable audit logging on /agents/:id/keys and require instance-admin review of cross-company key minting until upgrade is confirmed.
Review approvals.decidedByUserId entries for inconsistency with authenticating-actor logs until the @paperclipai/server upgrade lands.
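The first detection above can be expressed as a small predicate over audit events. This is a sketch only: the `AuditEvent` shape is an assumption about what a deployment's audit log might expose, and `isAgentAdapterConfigMutation` is a hypothetical name.

```typescript
// Hypothetical audit-event shape; field names are assumptions, not a Paperclip schema.
interface AuditEvent {
  actorType: "board" | "agent" | "system";
  method: string;
  path: string;
  body: unknown; // parsed JSON request body, if any
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True when an agent-type actor sends PATCH /api/agents/<id> with an
// adapterConfig field in the body.
function isAgentAdapterConfigMutation(event: AuditEvent): boolean {
  if (event.actorType !== "agent") return false;
  if (event.method.toUpperCase() !== "PATCH") return false;
  if (!/^\/api\/agents\/[^/]+$/.test(event.path)) return false;
  const body = event.body;
  return isRecord(body) && "adapterConfig" in body;
}
```

Events that match would be routed to an alert queue; board-initiated changes and requests to other paths are deliberately ignored by this predicate.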

Version Discrepancy

Early GHSA-68qg cache data fetched on 2026-04-19 declared patched_versions = ["2026.410.0"] for both the paperclipai meta-package and the @paperclipai/server package. A query against the npm registry (https://registry.npmjs.org/paperclipai) on 2026-04-19 showed 191 published versions; the range 2026.410.x contained only canary builds (2026.410.0-canary.0, 2026.410.0-canary.1). No stable 2026.410.0 was released for either package. Current GHSA/NVD records now list 2026.416.0 as the fixed version, aligning the advisory record with the first stable release that includes the full patch set.

Operators should target 2026.416.0, not 2026.410.0. Downstream content that cites 2026.410.0 as an installable version will fail on npm install and should be corrected.

Indicators of Compromise

Type	Indicator	Context
HTTP request	`POST /api/cli-auth/challenges` followed by `POST /api/cli-auth/challenges/<id>/approve` within 60 s, same source	GHSA-68qg self-approval chain
HTTP request	`POST /api/agents/<uuid>/keys` where caller's company set does not include the target agent's company	GHSA-47wq/GHSA-3xx2 cross-tenant mint
HTTP body	`"cleanupCommand"` containing shell metacharacters (`&&`, `;`, `|`, `$(`, backtick) or references to `/tmp/`, `curl`, `wget`, `powershell`, `calc.exe` in `PATCH /api/execution-workspaces/<uuid>`	GHSA-vr7g
HTTP body	`"provisionCommand"` or `"workspaceStrategy"` inside `adapterConfig` on `PATCH /api/agents/<uuid>` from an `agent` actor	GHSA-265w
HTTP body	`"instructionsFilePath"` containing `/etc/`, `/root/`, `/proc/`, or `..` traversal on `PATCH /api/agents/<uuid>`	GHSA-3pw3
HTTP body	`"decidedByUserId"` on `POST /api/approvals/<uuid>/{approve,reject,request-revision}` where the value differs from the authenticated actor	GHSA-p7mm
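Two of the indicators above lend themselves to simple log post-processing. The sketch below correlates challenge creation and approval from the same source within 60 seconds, and flags cleanupCommand values that contain the listed tokens. The `AccessLogEntry` shape and both function names are hypothetical; substring matching of this kind will produce false positives and is intended for triage, not blocking.

```typescript
// Hypothetical access-log shape.
interface AccessLogEntry {
  timestampMs: number;
  sourceIp: string;
  method: string;
  path: string;
}

const APPROVE_PATH = /^\/api\/cli-auth\/challenges\/[^/]+\/approve$/;

// Returns approve requests that follow a challenge-creation request from the
// same source within the window (default 60 s), per the GHSA-68qg indicator.
function findSelfApprovalCandidates(
  entries: AccessLogEntry[],
  windowMs: number = 60000,
): AccessLogEntry[] {
  const lastCreateBySource = new Map<string, number>();
  const hits: AccessLogEntry[] = [];
  const sorted = entries.slice().sort((a, b) => a.timestampMs - b.timestampMs);
  for (const entry of sorted) {
    if (entry.method !== "POST") continue;
    if (entry.path === "/api/cli-auth/challenges") {
      lastCreateBySource.set(entry.sourceIp, entry.timestampMs);
    } else if (APPROVE_PATH.test(entry.path)) {
      const createdAt = lastCreateBySource.get(entry.sourceIp);
      if (createdAt !== undefined && entry.timestampMs - createdAt <= windowMs) {
        hits.push(entry);
      }
    }
  }
  return hits;
}

// Tokens taken from the GHSA-vr7g indicator row.
const SUSPICIOUS_TOKENS: readonly string[] = [
  "&&", ";", "|", "$(", "`", "/tmp/", "curl", "wget", "powershell", "calc.exe",
];

function cleanupCommandLooksSuspicious(command: string): boolean {
  const lowered = command.toLowerCase();
  return SUSPICIOUS_TOKENS.some((token) => lowered.indexOf(token) !== -1);
}
```

Source IP is used as the correlation key because the indicator says "same source"; deployments behind a proxy would need to substitute whichever client identifier their logs preserve.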

Strategic Context

Three structural observations distinguish this cluster from the higher-volume stream of isolated supply-chain CVEs in AI-agent tooling:

Two distinct boundaries, tenant isolation and agent-runtime isolation, fail through the same class of missing check. Authorisation gates on /agents/:id/keys verify that the caller is a board user but not that the target agent belongs to the caller's company; the agent-configuration schema accepts adapterConfig.workspaceStrategy.provisionCommand without constraining what the agent is allowed to instruct the server host to do. In combination, a tenant-crossing identity (Primitive B) and an agent-controlled host command (Primitive C, GHSA-265w) form a single attack path. This is not the typical supply-chain story (an unsanitised --mcp argument, an unsafe model deserialiser, a YAML loader with code execution) where fixing the dependency graph closes the issue. The fix here is at the handler layer of the multi-tenant product's own control-plane API, and it must extend to every route that touches cross-company resources.
The patch set required two iterations. GHSA-47wq explicitly notes that the 2026.410.0 patch for the unauthenticated-RCE chain did not cover the /agents/:id/keys handler family. A reviewer enumerating every endpoint that touches cross-company resources would have caught both classes in the same pass. The practical implication for defenders is that a post-patch regression scan of authorisation routes is warranted after critical advisories at the platform layer, not just a version check.
RAXE assessment (low-confidence): the disclosure density is consistent with coordinated discovery. Eleven advisory IDs of varying severity against one product over a compressed window, with an explicit reproduction build identifier (commit b649bd4 = 2026.411.0-canary.8) cited in one of them, is consistent with an external audit engagement but also with an internal security-team audit cycle or a bug-bounty relaunch. RAXE has not confirmed which. If the reporter fields across the eleven advisories show disparate individuals rather than a common auditor or firm, the coordinated-audit reading weakens to "a product that previously had no active disclosure programme catching up". RAXE has not checked the reporter-attribution fields. Teams using similar agent-orchestration products should still plan for a multi-advisory patch cycle rather than a single CVE response; that guidance follows from the partial-patch observation in point 2 regardless of the origin of the disclosures.