MCP Gateway — CyberOS

MCP Gateway is the tool federation layer that turns CyberOS's 22 modules into a single, coherent MCP server. Each module publishes its own MCP server (named cyberos.brain, cyberos.skill, cyberos.crm, …) that exposes its verbs as tools (brain.put_memory, skill.invoke_skill, crm.update_account, …); the gateway aggregates these into a federated surface that Claude, Codex, Cursor, Cline, or any client of the 2025-11-25 spec sees as one server. Four controls apply to every tool call: OAuth 2.1 with PKCE (RFC 7636) gates it; the audience claim pins it to a specific module; the RBAC predicate from AUTH enforces who can do what; and the persona-version stamp records which agent authored it. Tool annotations (destructive · readOnly · idempotent · openWorld) drive human-confirm gating. The Tasks primitive handles long-running work; Elicitation reverses control mid-execution to ask the user a question.

Status: Planned (P0 · design phase · M+2)
Spec compliance: MCP 2025-11-25 (latest spec, full coverage)
Transports: Streamable HTTP · SSE (RFC 9112 chunked + EventSource)
Auth: OAuth 2.1 PKCE (RFC 7636 · audience-bound)
Tools at P0 (est.): ~80 (across BRAIN, Skill, AUTH, AI)
Tools at P3 (est.): ~300+ (22 modules × ~15 verbs each)
Depends on: AUTH · BRAIN · OBS (authz + audit + traces)
Naming convention: SEP-986 (verbNoun.dotted)

Why MCP Gateway exists

Per-module MCP servers create N integration points where every AI agent has to negotiate auth, discover tools, and handle errors. As N grows to 22, the agent's job becomes intractable. The gateway pattern collapses N into 1: agents see a single MCP server with one OAuth flow, one tool catalogue, one discovery endpoint. The 22 modules continue to publish their per-server tools; the gateway federates them. Naming (cyberos.brain.put_memory) preserves the module origin so audit + revocation remain per-module.

🌐

One discovery, all tools

Agents hit /.well-known/mcp once and discover every module's tools. Per-module servers stay separate; federation is at the edge.

🛡

OAuth-protected, RBAC-evaluated

PKCE-only OAuth 2.1; every tool call audience-bound; every call evaluated by AUTH RBAC; destructive calls human-gated.

⏳

Long-running work, first-class

The Tasks primitive supports operations that exceed the request timeout: the gateway polls the underlying server and reports progress to the caller.

The bet: pay the cost of one good federation once. Without MCP Gateway, every agent client (Claude Desktop, Cursor, Cline, custom) has to maintain its own list of N module URLs, N OAuth registrations, and N rate-limit policies. With MCP Gateway, that list collapses to one URL, and replacing a module is a federation-registry update, not a client release.

What it does — 5W1H2C5M

PRD §8.4 + §9.8 give the full spec. This table is the working summary.

Axis	Question	Answer
5W · What	What is MCP Gateway?	A Rust service (axum) that implements the MCP 2025-11-25 spec at the edge, federates tool calls to per-module backends, enforces OAuth 2.1 with Protected Resource Metadata (PRM, RFC 9728), applies tool annotations, manages Tasks for long-running work, proxies Elicitation prompts, and emits one audit row per call.
5W · Who	Who calls it?	External agents (Claude Desktop / Cursor / Cline / Codex) and internal agents (CUO when invoking Skill, scheduled tasks). Owner: CTO seat (interim CEO).
5W · When	When is it hit?	(a) at agent session start (discovery + auth); (b) on every tool call. P0 expected RPS: ~20/s peak; P3+: ~200/s peak.
5W · Where	Where does it run?	Fargate task in SG-1 (P0); multi-region active-active at P3+. TLS terminated at ALB; mTLS to per-module backends.
5W · Why	Why a gateway?	Because M agent clients × N modules would otherwise mean M × N OAuth registrations. Federation collapses that to one registration per client, against a single gateway.
1H · How	How does it work?	Agent does OAuth 2.1 PKCE → gateway issues audience-bound token → agent calls `tools/list` → gateway aggregates from federated servers → agent calls a tool → gateway validates audience + scope + annotation → forwards via mTLS gRPC to module → streams response back → audit row written.
2C · Cost	Cost?	Negligible — Fargate task ~$30/month at P0. The gateway adds ~3 ms per tool call; the audit write is the main cost.
2C · Constraints	Constraints?	(a) MCP 2025-11-25 spec compliance. (b) PKCE-only (no implicit grant). (c) Destructive tools MUST require human-confirm. (d) Persona-version MUST be stamped in audit (FR pending). (e) Tool names MUST follow SEP-986 verbNoun.dotted.
5M · Materials	Stack?	Rust 1.81 · axum 0.7 · rmcp (Rust MCP SDK) · tonic (gRPC to per-module servers) · OAuth via AUTH service · OpenTelemetry · serde_json for spec serialisation.
5M · Methods	Method choices?	Streamable HTTP transport (chunked transfer). SSE for server→client events. Tasks primitive over polling endpoint. Elicitation as inline request/response interrupt. Tool registry as DB-backed catalogue + in-memory cache.
5M · Machines	Deployment?	Fargate (2 CPU · 4 GB); behind ALB with WAF. Per-module backends are also Fargate, addressed by Cloud Map service-discovery.
5M · Manpower	Who maintains?	0.3 FTE CTO + 0.2 FTE CSO at P0. Each module owner extends the tool catalogue for their module.
5M · Measurement	How measured?	Two latency NFRs (IDs pending): read tool p95 ≤ 500 ms and write tool p95 ≤ 1 s. Spec compliance via the MCP conformance test suite. Audit completeness at 100%.

Architecture

Three layers: an edge that speaks the MCP spec to agents, a federation router that fans tool calls out to per-module gRPC backends, and an audit + observability bridge. Each of the 22 modules runs its own MCP server (e.g. cyberos.brain) that registers itself with the gateway when it starts.

graph TB
  subgraph CLIENTS ["Agents (MCP 2025-11-25 clients)"]
    CL_D["Claude Desktop"]
    CL_C["Claude Code CLI"]
    CUR["Cursor / Cline / Codex"]
    CUO["🎯 CUO (internal)"]
  end
  subgraph GATEWAY ["MCP Gateway (Rust axum + rmcp)"]
    WK["well_known.rs<br/>/.well-known/mcp<br/>/.well-known/oauth-protected-resource (PRM)"]
    AUTH_E["oauth_edge.rs<br/>OAuth 2.1 PKCE handshake<br/>(delegates to AUTH service)"]
    DISC["discovery.rs<br/>tools/list aggregation"]
    ROUT["federation.rs<br/>tool name → backend server"]
    ANN["annotations.rs<br/>destructive · readOnly · idempotent"]
    GATE["gate.rs<br/>RBAC + human-confirm + audience check"]
    TASKS["tasks.rs<br/>long-running primitive"]
    ELI["elicitation.rs<br/>mid-execution prompt proxy"]
    AUD["audit.rs<br/>per-call BRAIN write"]
  end
  subgraph BACKENDS ["Per-module MCP servers (mTLS gRPC)"]
    BR["cyberos.brain<br/>put_memory · view · search · …"]
    SK["cyberos.skill<br/>invoke_skill · list_skills · …"]
    AU["cyberos.auth<br/>whoami · check_permission · …"]
    AI["cyberos.ai<br/>complete · embed · usage_mtd"]
    CRM["cyberos.crm<br/>(planned · 16 more modules)"]
  end
  subgraph SINKS
    AUTHSVC["🔐 AUTH service<br/>token issuance + RBAC.Check"]
    BRAIN["🧠 BRAIN<br/>mcp.invocation rows"]
    OBS["👁 OBS<br/>traces + metrics"]
  end
  CL_D --> WK
  CL_C --> WK
  CUR --> WK
  CUO --> WK
  WK --> AUTH_E
  AUTH_E --> AUTHSVC
  WK --> DISC
  DISC --> ROUT
  CL_D -->|tools/call| ROUT
  CUR -->|tools/call| ROUT
  ROUT --> ANN
  ANN --> GATE
  GATE --> AUTHSVC
  GATE --> BR
  GATE --> SK
  GATE --> AU
  GATE --> AI
  GATE --> CRM
  ROUT --> TASKS
  GATE --> ELI
  GATE --> AUD
  AUD --> BRAIN
  GATEWAY --> OBS
  classDef planned fill:#e0f2fe,stroke:#0369a1
  classDef backend fill:#cffafe,stroke:#0891b2
  classDef sink fill:#f5ede6,stroke:#45210e
  class WK,AUTH_E,DISC,ROUT,ANN,GATE,TASKS,ELI,AUD,CL_D,CL_C,CUR,CUO planned
  class BR,SK,AU,AI,CRM backend
  class AUTHSVC,BRAIN,OBS sink
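The routing step performed by federation.rs can be pictured as a prefix lookup: everything before the last dot of a tool name is the server id, and the server id maps to a backend endpoint. The sketch below is illustrative only; `Registry`, `RouteError`, `register`, and `resolve` are hypothetical names, and the real service would sit on top of the Postgres-backed registry and an mTLS gRPC client rather than a plain map.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory view of the federation registry:
/// server id ("cyberos.brain") -> backend endpoint.
#[derive(Default)]
pub struct Registry {
    endpoints: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum RouteError {
    /// The tool name has no dot separating the server id from the verb_noun part.
    Malformed,
    /// No backend is registered under the derived server id.
    UnknownServer(String),
}

impl Registry {
    /// Record (or replace) the endpoint for a backend server.
    pub fn register(&mut self, server_id: &str, endpoint: &str) {
        self.endpoints
            .insert(server_id.to_string(), endpoint.to_string());
    }

    /// Resolve "cyberos.brain.put_memory" to the endpoint of "cyberos.brain".
    pub fn resolve(&self, tool_name: &str) -> Result<&str, RouteError> {
        let (server_id, verb_noun) = tool_name
            .rsplit_once('.')
            .ok_or(RouteError::Malformed)?;
        if server_id.is_empty() || verb_noun.is_empty() {
            return Err(RouteError::Malformed);
        }
        self.endpoints
            .get(server_id)
            .map(String::as_str)
            .ok_or_else(|| RouteError::UnknownServer(server_id.to_string()))
    }
}
```

Because the module origin is encoded in the name, routing needs no per-tool table; only the set of registered servers has to be kept current.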

Internal components

Component	Path (planned)	Responsibility
`well_known.rs`	services/mcp-gateway/src/well_known.rs	Serves `/.well-known/mcp` discovery document + `/.well-known/oauth-protected-resource` (PRM, RFC 9728).
`oauth_edge.rs`	services/mcp-gateway/src/oauth_edge.rs	Handles OAuth 2.1 + PKCE handshake with the agent client. Delegates issuance to AUTH service over gRPC.
`discovery.rs`	services/mcp-gateway/src/discovery.rs	Aggregates `tools/list` across federated backends. Caches catalogue with 60 s TTL; invalidated on backend register.
`federation.rs`	services/mcp-gateway/src/federation.rs	Routes a tool call (`cyberos.brain.put_memory`) to the right backend server via mTLS gRPC. Caches resolved endpoints in-memory.
`annotations.rs`	services/mcp-gateway/src/annotations.rs	Parses tool annotations from backend manifest. Enforces destructive → human-confirm; readOnly → fast-path RBAC; idempotent → safe-retry; openWorld → strict scope check.
`gate.rs`	services/mcp-gateway/src/gate.rs	Composite gate — verifies audience claim, calls AUTH RBAC.Check, validates idempotency-key, applies rate-limit token bucket.
`tasks.rs`	services/mcp-gateway/src/tasks.rs	Tasks primitive (FR pending) — long-running tool invocations get a task_id; clients poll for status / result; results streamed via SSE.
`elicitation.rs`	services/mcp-gateway/src/elicitation.rs	Elicitation (FR pending) — mid-execution, a backend can request user input; gateway proxies that back to the agent client with content-safety filter.
`tool_registry.rs`	services/mcp-gateway/src/tool_registry.rs	Postgres-backed registry. Each backend registers its tools with annotations; collisions rejected at register time (FR pending).
`rate_limit.rs`	services/mcp-gateway/src/rate_limit.rs	Per-tool + per-tenant token-bucket rate limit; circuit-breaker on backend errors.
`idempotency.rs`	services/mcp-gateway/src/idempotency.rs	Replay safety via Idempotency-Key header (FR pending).
`audit.rs`	services/mcp-gateway/src/audit.rs	Emits one `mcp.invocation` row per call: agent · tool · args_hash · RBAC-decision · latency · outcome · persona-version.
`streamable_http.rs`	services/mcp-gateway/src/streamable_http.rs	RFC 9112 chunked-transfer HTTP transport per MCP 2025-11-25.
`conformance.rs`	services/mcp-gateway/tests/conformance.rs	MCP conformance test suite runner. Gating CI test (FR pending).
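The per-tool + per-tenant token bucket listed for rate_limit.rs could look roughly like the following. This is a minimal sketch, not the planned implementation: the type names, the capacity and refill parameters, and the explicit millisecond clock (used so the behaviour is deterministic) are all assumptions, and the circuit-breaker half of the component is left out.

```rust
use std::collections::HashMap;

/// Token bucket driven by an explicit millisecond clock.
/// Tokens are tracked in thousandths so the maths stays in integers.
pub struct TokenBucket {
    capacity_milli: u64,
    tokens_milli: u64,
    refill_per_sec: u64, // whole tokens per second == milli-tokens per ms
    last_ms: u64,
}

impl TokenBucket {
    pub fn new(capacity: u32, refill_per_sec: u32, now_ms: u64) -> Self {
        let capacity_milli = u64::from(capacity) * 1000;
        TokenBucket {
            capacity_milli,
            tokens_milli: capacity_milli, // a new bucket starts full
            refill_per_sec: u64::from(refill_per_sec),
            last_ms: now_ms,
        }
    }

    /// Refill for the elapsed time, then spend one token if available.
    pub fn try_acquire(&mut self, now_ms: u64) -> bool {
        let elapsed_ms = now_ms.saturating_sub(self.last_ms);
        let refill = elapsed_ms.saturating_mul(self.refill_per_sec);
        self.tokens_milli = self
            .tokens_milli
            .saturating_add(refill)
            .min(self.capacity_milli);
        self.last_ms = self.last_ms.max(now_ms);
        if self.tokens_milli >= 1000 {
            self.tokens_milli -= 1000;
            true
        } else {
            false
        }
    }
}

/// One bucket per (tenant, tool) pair, created lazily on first use.
pub struct RateLimiter {
    capacity: u32,
    refill_per_sec: u32,
    buckets: HashMap<(String, String), TokenBucket>,
}

impl RateLimiter {
    pub fn new(capacity: u32, refill_per_sec: u32) -> Self {
        RateLimiter { capacity, refill_per_sec, buckets: HashMap::new() }
    }

    pub fn allow(&mut self, tenant: &str, tool: &str, now_ms: u64) -> bool {
        let (capacity, rate) = (self.capacity, self.refill_per_sec);
        self.buckets
            .entry((tenant.to_string(), tool.to_string()))
            .or_insert_with(|| TokenBucket::new(capacity, rate, now_ms))
            .try_acquire(now_ms)
    }
}
```

Keying the map on the (tenant, tool) pair means one noisy tenant cannot drain another tenant's budget for the same tool.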

Data model

The gateway is mostly stateless. Postgres holds the tool registry, Redis caches tool catalogues and Tasks status, and BRAIN absorbs the audit rows.

erDiagram
  SERVER ||--o{ TOOL_DEFINITION : "publishes"
  TOOL_DEFINITION ||--o{ ANNOTATION : "has"
  TOOL_DEFINITION ||--o{ TOOL_INVOCATION : "fulfils"
  TOOL_DEFINITION ||--o{ ELICITATION : "may trigger"
  TOOL_INVOCATION ||--o{ TASK : "may spawn"
  TASK ||--o{ TASK_STATUS_UPDATE : "emits"
  AGENT_CLIENT ||--o{ TOOL_INVOCATION : "calls"
  AGENT_CLIENT ||--o| OAUTH_REGISTRATION : "registered as"
  SERVER {
    string id PK "cyberos.brain"
    string display_name
    string endpoint "grpc://brain.internal:8081"
    string version
    string status "active | draining | offline"
    timestamp registered_at
  }
  TOOL_DEFINITION {
    string name PK "cyberos.brain.put_memory"
    string server_id FK
    string description
    obj input_schema "JSON Schema"
    obj output_schema
    string scope_required "brain.put"
  }
  ANNOTATION {
    string tool_name FK
    string key "destructive | readOnly | idempotent | openWorld | titleHuman | requiresConfirm"
    string value
  }
  AGENT_CLIENT {
    uuid id PK
    string display_name "Claude Desktop · Cursor · …"
    string redirect_uri
    string client_type "public (PKCE) | confidential"
  }
  OAUTH_REGISTRATION {
    uuid client_id FK
    string scopes
    timestamp registered_at
  }
  TOOL_INVOCATION {
    uuid id PK
    string tool_name FK
    uuid agent_id FK
    uuid subject_id "on_behalf_of"
    string persona_version
    string args_hash "SHA-256(canonical)"
    string idempotency_key
    string outcome "ok | denied | error | task_created"
    int latency_ms
    string brain_chain
    timestamp ts
  }
  TASK {
    uuid id PK
    uuid invocation_id FK
    string status "queued | running | completed | failed | cancelled"
    obj progress "fraction · stage · message"
    obj result
    timestamp created_at
    timestamp completed_at
  }
  TASK_STATUS_UPDATE {
    uuid id PK
    uuid task_id FK
    string status
    obj progress
    timestamp ts
  }
  ELICITATION {
    uuid id PK
    uuid invocation_id FK
    string prompt
    obj response_schema
    obj user_response
    timestamp asked_at
    timestamp answered_at
  }

Tool naming convention (SEP-986)

All tool names follow cyberos.{module}.{verb}_{noun}. For memory-shaped resources, verbs come from the canonical set of six (which includes put · view · move · delete); other modules pick verbs from a shared catalogue.

Example tool name	Verb class	Annotations
`cyberos.brain.put_memory`	write	destructive=false · idempotent=true · scope=brain.put
`cyberos.brain.view_memory`	read	readOnly=true · idempotent=true · scope=brain.read
`cyberos.brain.delete_memory`	delete	destructive=true · requiresConfirm=true · scope=brain.delete
`cyberos.brain.search_memory`	query	readOnly=true · scope=brain.read
`cyberos.skill.invoke_skill`	execute	destructive=true · requiresConfirm=conditional · scope=skill.invoke
`cyberos.skill.list_skills`	read	readOnly=true · scope=skill.read
`cyberos.auth.check_permission`	query	readOnly=true · scope=auth.read
`cyberos.auth.revoke_session`	destructive	destructive=true · requiresConfirm=true · scope=auth.session_revoke
`cyberos.ai.complete_chat`	execute	destructive=false · openWorld=true · scope=ai.invoke
`cyberos.crm.create_account`	write	destructive=false · idempotent=false · scope=crm.write
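A registry validator for the cyberos.{module}.{verb}_{noun} shape shown in the table might be sketched as below. The accepted character set (lowercase ASCII letters, digits, underscore) and the names `ToolName` and `parse_tool_name` are assumptions made for illustration; the authoritative rules are whatever SEP-986 and the registry validator finally specify.

```rust
/// The three meaningful parts of a tool name, borrowed from the input.
#[derive(Debug, PartialEq)]
pub struct ToolName<'a> {
    pub module: &'a str,
    pub verb: &'a str,
    pub noun: &'a str,
}

/// Assumed identifier rule: non-empty, lowercase ASCII letters, digits, underscore.
fn is_ident(s: &str) -> bool {
    !s.is_empty()
        && s.chars()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

/// Parse "cyberos.brain.put_memory" into module "brain", verb "put", noun "memory".
/// Returns None for anything that does not match the convention.
pub fn parse_tool_name(name: &str) -> Option<ToolName<'_>> {
    let mut parts = name.split('.');
    let (ns, module, last) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || ns != "cyberos" || !is_ident(module) {
        return None;
    }
    // The verb is everything before the first underscore; the rest is the noun.
    let (verb, noun) = last.split_once('_')?;
    if is_ident(verb) && is_ident(noun) {
        Some(ToolName { module, verb, noun })
    } else {
        None
    }
}
```

Rejecting malformed names at registration time keeps the "tool name violations = 0" target checkable in one place.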

API surface

The gateway speaks MCP 2025-11-25 to agents and gRPC to backends. A small admin REST surface lets operators inspect the registry and replay invocations.

MCP surface (canonical)

Via	Endpoint or JSON-RPC method	Purpose
GET	`/.well-known/mcp`	MCP server discovery doc per spec 2025-11-25.
GET	`/.well-known/oauth-protected-resource`	PRM (RFC 9728) — auth-server URL, scopes, audience.
POST	`/mcp`	Single streamable-HTTP endpoint for all MCP JSON-RPC calls.
JSON-RPC	`initialize`	Client capability negotiation.
JSON-RPC	`tools/list`	Discovery — returns federated tool catalogue.
JSON-RPC	`tools/call`	Invoke a tool with arguments.
JSON-RPC	`resources/list`	List MCP resources (files, URIs).
JSON-RPC	`resources/read`	Read a resource.
JSON-RPC	`prompts/list`	List prompt templates.
JSON-RPC	`prompts/get`	Materialise a prompt template.
JSON-RPC	`completion/complete`	Tool-arg autocomplete.
JSON-RPC	`tasks/get`	Poll long-running task status.
JSON-RPC	`tasks/cancel`	Cancel running task.
JSON-RPC	`elicitation/respond`	Reply to a server-initiated elicitation.
JSON-RPC	`sampling/createMessage`	Server-initiated LLM sampling (rate-limited, (FR pending)).
JSON-RPC	`logging/setLevel`	Server log-level config.
JSON-RPC	`roots/list`	List filesystem roots.
JSON-RPC	`notifications/initialized`	Capability acknowledgement.
JSON-RPC	`notifications/progress`	Streaming progress events.

Backend gRPC surface (per-module servers)

syntax = "proto3";
package cyberos.mcp.backend.v1;

service ModuleMCPServer {
  // Register with the gateway when the module server starts. Sends manifest of all tools.
  rpc Register(RegisterRequest) returns (RegisterResponse);

  // Tool invocation forwarded from gateway. Streamable response.
  rpc InvokeTool(stream ToolCall) returns (stream ToolResult);

  // For long-running tools the backend returns a task handle.
  rpc TaskStatus(TaskRef) returns (TaskState);

  // Backend may initiate elicitation back through gateway.
  rpc Elicit(ElicitationRequest) returns (ElicitationResponse);

  // Health + drain.
  rpc Health(Empty) returns (HealthResponse);
}

message RegisterRequest {
  string server_id = 1;
  string version = 2;
  repeated ToolDefinition tools = 3;
}

message ToolDefinition {
  string name = 1;                  // "cyberos.brain.put_memory"
  string description = 2;
  string input_schema = 3;          // JSON Schema
  string output_schema = 4;
  string scope_required = 5;
  map<string, string> annotations = 6;
}

message ToolCall {
  string tool_name = 1;
  string args_json = 2;
  string idempotency_key = 3;
  string subject_jwt = 4;
  string persona_version = 5;
  string trace_id = 6;
}
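The idempotency_key carried on ToolCall is what makes retries safe. One way to picture the replay check in idempotency.rs is sketched below: a repeated key with the same argument hash returns the stored result, while the same key with different arguments is flagged as a conflict instead of being silently replayed. The `IdempotencyStore` type, the `Replay` enum, and the in-memory map are hypothetical; a real store would also need expiry and tenant scoping.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Replay {
    /// Key not seen before: forward the call to the backend.
    Fresh,
    /// Same key and same args hash: return the stored result, do not re-run.
    Replayed(String),
    /// Same key but different args hash: reject rather than guess.
    Conflict,
}

/// Hypothetical store: idempotency key -> (args hash, stored result).
#[derive(Default)]
pub struct IdempotencyStore {
    seen: HashMap<String, (String, String)>,
}

impl IdempotencyStore {
    pub fn check(&self, key: &str, args_hash: &str) -> Replay {
        match self.seen.get(key) {
            None => Replay::Fresh,
            Some((hash, result)) if hash.as_str() == args_hash => {
                Replay::Replayed(result.clone())
            }
            Some(_) => Replay::Conflict,
        }
    }

    /// Record the first result for a key; later writes for the same key are ignored.
    pub fn record(&mut self, key: &str, args_hash: &str, result: &str) {
        self.seen
            .entry(key.to_string())
            .or_insert_with(|| (args_hash.to_string(), result.to_string()));
    }
}
```

Comparing the argument hash as well as the key is what guards against false-positive replays when two different calls happen to reuse a key.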

Admin REST surface (operator-only)

Method	Path	Purpose
GET	`/admin/servers`	List registered backends + their status.
GET	`/admin/tools`	Full tool catalogue with annotations.
POST	`/admin/tools/{name}/disable`	Soft-disable a tool (server returns "method not allowed").
GET	`/admin/invocations`	Recent invocations (filterable by tool, agent, subject).
POST	`/admin/invocations/{id}/replay`	Replay an invocation in dry-run mode (read-only tools only).
GET	`/admin/tasks`	List active tasks.
POST	`/admin/tasks/{id}/cancel`	Force-cancel a stuck task.

Key flows

Flow 1 — Well-known discovery + tool listing

sequenceDiagram
  autonumber
  participant A as Agent (Claude Desktop)
  participant G as MCP Gateway
  participant AS as AUTH service
  participant BR as cyberos.brain (backend)
  participant SK as cyberos.skill
  participant Cache as Catalogue cache
  A->>G: GET /.well-known/mcp
  G-->>A: {auth_url, prm_url, transports:["streamable-http"], capabilities:{…}}
  A->>G: GET /.well-known/oauth-protected-resource
  G-->>A: {auth_server, scopes, audience:"mcp.cyberos.com"}
  A->>AS: OAuth 2.1 PKCE auth code flow
  AS-->>A: {access_token aud=mcp.cyberos.com}
  A->>G: POST /mcp {jsonrpc:"2.0", method:"initialize"}
  G-->>A: {capabilities, server_info}
  A->>G: POST /mcp {method:"tools/list"}
  G->>Cache: get catalogue
  alt cache hit
    Cache-->>G: full tool list
  else cache miss
    G->>BR: ListTools()
    G->>SK: ListTools()
    BR-->>G: 12 tools
    SK-->>G: 8 tools
    G->>Cache: SET catalogue TTL=60s
  end
  G-->>A: {tools: [cyberos.brain.put_memory, cyberos.skill.invoke_skill, …]}

The agent sees one server; under the hood, 22 modules contribute their tools. Caching keeps catalogue assembly < 5 ms p95.
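The cache-hit / cache-miss branch in this flow can be sketched as a small TTL cache. The names `CatalogueCache` and `list_tools` are hypothetical, the catalogue is reduced to a list of tool names, and the backend fan-out is passed in as a closure; the planned design keeps this in Redis rather than in process memory.

```rust
use std::time::{Duration, Instant};

/// Single-entry cache for the federated tool catalogue.
pub struct CatalogueCache {
    ttl: Duration,
    entry: Option<(Instant, Vec<String>)>,
}

impl CatalogueCache {
    pub fn new(ttl: Duration) -> Self {
        CatalogueCache { ttl, entry: None }
    }

    /// Return the cached catalogue if it is still fresh at `now`.
    pub fn get(&self, now: Instant) -> Option<&[String]> {
        match &self.entry {
            Some((stored_at, tools))
                if now.saturating_duration_since(*stored_at) < self.ttl =>
            {
                Some(tools.as_slice())
            }
            _ => None,
        }
    }

    pub fn set(&mut self, now: Instant, tools: Vec<String>) {
        self.entry = Some((now, tools));
    }

    /// Called when a backend registers, so new tools show up immediately.
    pub fn invalidate(&mut self) {
        self.entry = None;
    }
}

/// tools/list path: serve from cache, otherwise rebuild via `fetch_all`
/// (standing in for the ListTools fan-out to every backend) and store it.
pub fn list_tools<F>(cache: &mut CatalogueCache, now: Instant, fetch_all: F) -> Vec<String>
where
    F: FnOnce() -> Vec<String>,
{
    if let Some(tools) = cache.get(now) {
        return tools.to_vec();
    }
    let tools = fetch_all();
    cache.set(now, tools.clone());
    tools
}
```

With a 60 s TTL plus invalidation on register, a stale catalogue can only persist when a backend changes its tools without re-registering.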

Flow 2 — Tool invocation with OAuth + RBAC

sequenceDiagram
  autonumber
  participant A as Agent
  participant G as MCP Gateway
  participant GA as gate.rs
  participant AS as AUTH RBAC.Check
  participant BR as cyberos.brain
  participant AUD as audit.rs
  participant B as 🧠 BRAIN
  A->>G: POST /mcp {method:"tools/call", name:"cyberos.brain.put_memory",<br/>arguments:{path:"…", body:"…"}, idempotency_key:"…"}
  G->>GA: verify audience claim (mcp.cyberos.com)
  G->>GA: load annotations(destructive=false, scope_required=brain.put)
  GA->>AS: RBAC.Check(subject_jwt, action="brain.put", resource="memories/…")
  AS-->>GA: {allow:true, reason:"role.founder + scope.brain.put"}
  GA->>BR: InvokeTool(args_json, persona_version, trace_id)
  BR-->>GA: stream {chunks, done:{seq, chain, body_hash}}
  GA-->>A: {content:[{type:"text", text:"…"}], isError:false}
  GA->>AUD: write mcp.invocation
  AUD->>B: append row {tool, agent, subject, persona_version, outcome:"ok"}

Per-call latency: ~3 ms gateway overhead + backend processing time. Audit write is fire-and-forget; backlog > 60 s triggers an alert.
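The ordering of checks in gate.rs matters for that overhead figure: the local checks (audience, scope) can run before the remote RBAC.Check, so a call that fails early never reaches AUTH. A minimal sketch under assumptions follows; `TokenClaims`, `GateDecision`, the denial reason strings, and the closure standing in for the RBAC call are all hypothetical, and token signature verification is assumed to have happened already.

```rust
/// Claims assumed to have been extracted from an already-verified access token.
pub struct TokenClaims<'a> {
    pub aud: &'a str,
    pub scopes: Vec<&'a str>,
}

#[derive(Debug, PartialEq)]
pub enum GateDecision {
    Allow,
    Deny(&'static str),
}

/// Run the cheap local checks first, then the remote RBAC check.
/// `rbac_check` stands in for the call to AUTH RBAC.Check.
pub fn gate<F>(
    claims: &TokenClaims<'_>,
    expected_aud: &str,
    scope_required: &str,
    rbac_check: F,
) -> GateDecision
where
    F: FnOnce() -> bool,
{
    if claims.aud != expected_aud {
        return GateDecision::Deny("audience_mismatch");
    }
    if !claims.scopes.iter().any(|s| *s == scope_required) {
        return GateDecision::Deny("missing_scope");
    }
    if !rbac_check() {
        return GateDecision::Deny("rbac_denied");
    }
    GateDecision::Allow
}
```

Every branch, including the denials, would still end in an audit row, as the lifecycle section describes.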

Flow 3 — Destructive tool with human-confirm gating

sequenceDiagram
  autonumber
  participant A as Agent
  participant G as MCP Gateway
  participant ANN as annotations.rs
  participant ELI as elicitation.rs
  participant U as User UI (CHAT or IDE)
  participant SK as cyberos.skill
  A->>G: tools/call {name:"cyberos.skill.invoke_skill",<br/>arguments:{skill_id:"vn-vat-invoice"}}
  G->>ANN: load annotations
  ANN-->>G: destructive=true, requiresConfirm=true
  G->>ELI: needs user confirmation
  ELI->>U: "Skill 'vn-vat-invoice' will create an invoice in BRAIN. Proceed?"
  alt user approves
    U-->>ELI: {approved:true}
    ELI-->>G: proceed
    G->>SK: InvokeTool(...)
    SK-->>G: result
    G-->>A: result
  else user denies
    U-->>ELI: {approved:false}
    ELI-->>G: denied
    G-->>A: {error:"user_denied", isError:true}
  end

(FR pending): destructive tool calls without confirmation are rejected. The elicitation UI is rendered by the agent client (Claude Desktop, Cursor) — the gateway is content-agnostic.
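The confirm decision in this flow can be written as a fail-closed rule over the annotation map. The sketch below makes two assumptions beyond the text: a tool explicitly marked readOnly=true never needs confirmation, and a missing or unrecognised destructive value is treated as destructive (in line with the "default annotation is destructive=true" mitigation in the risk table). The function and type names are hypothetical, and the conditional value of requiresConfirm is not modelled.

```rust
use std::collections::HashMap;

/// Build an annotation map from string pairs (illustration helper).
pub fn annotations(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Fail closed: only an explicit readOnly=true or destructive=false
/// lets a call through without asking the user.
pub fn needs_confirmation(ann: &HashMap<String, String>) -> bool {
    let read_only = ann.get("readOnly").map(|v| v.as_str() == "true").unwrap_or(false);
    if read_only {
        return false;
    }
    ann.get("destructive")
        .map(|v| v.as_str() != "false")
        .unwrap_or(true)
}

#[derive(Debug, PartialEq)]
pub enum CallOutcome {
    Forwarded,
    UserDenied,
}

/// `ask_user` stands in for the elicitation round-trip to the client UI;
/// it is only invoked when confirmation is actually required.
pub fn confirm_then_forward<F>(ann: &HashMap<String, String>, ask_user: F) -> CallOutcome
where
    F: FnOnce() -> bool,
{
    if needs_confirmation(ann) && !ask_user() {
        CallOutcome::UserDenied
    } else {
        CallOutcome::Forwarded
    }
}
```

For example, `confirm_then_forward(&annotations(&[("destructive", "true")]), || false)` yields `CallOutcome::UserDenied`, matching the "user denies" branch of the diagram.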

Flow 4 — Long-running task via Tasks primitive

sequenceDiagram
  autonumber
  participant A as Agent
  participant G as MCP Gateway
  participant T as tasks.rs
  participant BG as cyberos.kb (backend)
  A->>G: tools/call {name:"cyberos.kb.reindex_corpus"}
  G->>BG: InvokeTool(...)
  BG-->>G: {task_id:"tsk_01HZJ…XK", expected_ms:120000}
  G->>T: register task tsk_01HZJ…XK
  G-->>A: {task_id, status:"running"}
  Note over A: agent receives task handle; returns control to user
  loop poll every 2 s
    A->>G: tasks/get {id:"tsk_…"}
    G->>BG: TaskStatus(task_id)
    BG-->>G: {status:"running", progress:0.34, message:"embedding 3,400 / 10k docs"}
    G-->>A: {status:"running", progress:0.34, message:"…"}
  end
  BG-->>G: {status:"completed", result:{indexed:10000, elapsed_s:118}}
  A->>G: tasks/get
  G-->>A: {status:"completed", result:{…}}

(FR pending): Tasks primitive for long-running tool invocations. The agent can release the call and resume the conversation; polling fetches updates.
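The bookkeeping that tasks.rs does for tasks/get can be sketched as a status store keyed on tenant and task id together, so a task id presented by another tenant simply misses. Everything here is illustrative: `TaskStore`, its methods, and the rule that terminal states are never overwritten are assumptions, and progress and result payloads are omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskStatus {
    Queued,
    Running,
    Completed,
    Failed,
    Cancelled,
}

impl TaskStatus {
    pub fn is_terminal(self) -> bool {
        matches!(
            self,
            TaskStatus::Completed | TaskStatus::Failed | TaskStatus::Cancelled
        )
    }
}

/// Hypothetical task table keyed on (tenant_id, task_id).
#[derive(Default)]
pub struct TaskStore {
    tasks: HashMap<(String, String), TaskStatus>,
}

impl TaskStore {
    fn key(tenant_id: &str, task_id: &str) -> (String, String) {
        (tenant_id.to_string(), task_id.to_string())
    }

    /// Called when a backend answers a tool call with a task handle.
    pub fn register(&mut self, tenant_id: &str, task_id: &str) {
        self.tasks
            .insert(Self::key(tenant_id, task_id), TaskStatus::Running);
    }

    /// Apply a status update; returns false for unknown tasks and for
    /// tasks that have already reached a terminal state.
    pub fn update(&mut self, tenant_id: &str, task_id: &str, next: TaskStatus) -> bool {
        if let Some(current) = self.tasks.get_mut(&Self::key(tenant_id, task_id)) {
            if !current.is_terminal() {
                *current = next;
                return true;
            }
        }
        false
    }

    /// tasks/get lookup; a task id from another tenant yields None.
    pub fn get(&self, tenant_id: &str, task_id: &str) -> Option<TaskStatus> {
        self.tasks.get(&Self::key(tenant_id, task_id)).copied()
    }
}
```

Making the tenant part of the key, not a filter applied afterwards, is what keeps a leaked task id from being useful across tenants.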

Flow 5 — Mid-execution elicitation

sequenceDiagram
  autonumber
  participant A as Agent
  participant G as MCP Gateway
  participant SK as cyberos.skill (vn-bank-transfer)
  participant ELI as elicitation.rs
  participant U as User UI
  A->>G: tools/call cyberos.skill.invoke_skill {skill:"vn-bank-transfer", amount:25000000}
  G->>SK: InvokeTool(...)
  SK->>G: elicitation/request {prompt:"Confirm transfer of ₫25,000,000 to ACME — please type CONFIRM"}
  G->>ELI: proxy elicitation (with content-safety filter)
  ELI->>U: render prompt in client UI
  U->>ELI: types "CONFIRM"
  ELI-->>G: response {text:"CONFIRM"}
  G-->>SK: continue invocation
  SK->>SK: execute transfer
  SK-->>G: result {napas_tx_id:"…"}
  G-->>A: result

(FR pending): elicitation proxied with content-safety filter — prevents a compromised backend from injecting prompts that exfiltrate via the user.

Tool-call lifecycle

The lifecycle defines fifteen states, of which a single tool invocation traverses at most ten. Most calls run synchronously and reach Completed in < 1 s; long-running ones go via the Tasks primitive.

stateDiagram-v2
  [*] --> Received: agent → POST /mcp tools/call
  Received --> AudienceOK: aud claim matches gateway
  AudienceOK --> AnnotationLoaded: tool registry hit
  AnnotationLoaded --> RBACChecked: AUTH RBAC.Check
  AnnotationLoaded --> Rejected: tool disabled / unknown
  RBACChecked --> Confirming: destructive + requiresConfirm
  RBACChecked --> Forwarding: readOnly OR confirmed
  Confirming --> Forwarding: user approves
  Confirming --> Denied: user denies
  Forwarding --> Streaming: backend producing chunks
  Forwarding --> TaskCreated: long-running, task_id issued
  TaskCreated --> Polled: agent polls tasks/get
  Polled --> Polled: status=running
  Polled --> Completed: status=completed
  Polled --> Failed: status=failed
  Streaming --> Completed: done frame
  Streaming --> Failed: backend error
  Streaming --> Cancelled: agent / user cancels
  Completed --> Audited: mcp.invocation row written
  Failed --> Audited
  Cancelled --> Audited
  Rejected --> Audited
  Denied --> Audited
  Audited --> [*]
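The transitions in the state diagram translate directly into a transition table, which is a convenient way to test that no code path skips a gate. The sketch below encodes exactly the edges drawn above; the `State` enum, `can_transition`, and `is_valid_path` are hypothetical names, not part of the planned codebase.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Received,
    AudienceOK,
    AnnotationLoaded,
    RBACChecked,
    Rejected,
    Confirming,
    Denied,
    Forwarding,
    Streaming,
    TaskCreated,
    Polled,
    Completed,
    Failed,
    Cancelled,
    Audited,
}

/// True only for edges that appear in the lifecycle diagram.
pub fn can_transition(from: State, to: State) -> bool {
    use State::*;
    matches!(
        (from, to),
        (Received, AudienceOK)
            | (AudienceOK, AnnotationLoaded)
            | (AnnotationLoaded, RBACChecked)
            | (AnnotationLoaded, Rejected)
            | (RBACChecked, Confirming)
            | (RBACChecked, Forwarding)
            | (Confirming, Forwarding)
            | (Confirming, Denied)
            | (Forwarding, Streaming)
            | (Forwarding, TaskCreated)
            | (TaskCreated, Polled)
            | (Polled, Polled)
            | (Polled, Completed)
            | (Polled, Failed)
            | (Streaming, Completed)
            | (Streaming, Failed)
            | (Streaming, Cancelled)
            | (Completed, Audited)
            | (Failed, Audited)
            | (Cancelled, Audited)
            | (Rejected, Audited)
            | (Denied, Audited)
    )
}

/// A complete invocation starts at Received, ends at Audited,
/// and uses only allowed edges in between.
pub fn is_valid_path(path: &[State]) -> bool {
    path.first() == Some(&State::Received)
        && path.last() == Some(&State::Audited)
        && path.windows(2).all(|w| can_transition(w[0], w[1]))
}
```

Since every terminal branch funnels into Audited, a path check of this kind doubles as a regression test for audit completeness.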

Functional Requirements

The CyberOS FR catalogue is being rebuilt one feature at a time via the open fr-author Agent Skill.

Previous FR enumerations were archived 2026-05-14 and are no longer reflected on this page. PRD/SRS narrative remains authoritative for the spec; specific FRs land here as they are re-authored.

Non-Functional Requirements

Performance + security NFRs that flow through MCP Gateway. Cross-referenced at nfr-catalog.html#mcp.

NFR ID	Concern	Target	Measurement
`(NFR pending)`	MCP read tool p95	≤ 500 ms	k6 load test
`(NFR pending)`	MCP write tool p95	≤ 1 s	k6 load test
`(NFR pending)`	Gateway overhead per call	≤ 5 ms p95	internal bench
`(NFR pending)`	tools/list discovery	≤ 100 ms p95 (cache hit)	internal bench
`(NFR pending)`	Gateway availability (28-day)	≥ 99.95%	SLO monitor
`(NFR pending)`	MCP spec conformance	100% of conformance suite	CI gate
`(NFR pending)`	Audit completeness	100% (no dropped invocations)	chaos test + BRAIN walk
`(NFR pending)`	Destructive without confirm	= 0 events	CI regression + runtime check
`(NFR pending)`	Tool name violations	= 0 (rejected at register)	registry validator
`(NFR pending)`	Idempotency-key collision rate	= 0 false positives	property-based test
`(NFR pending)`	Task status update latency	≤ 2 s after backend update	bench/tasks
`(NFR pending)`	Per-tool rate-limit enforcement	100% (no overflow)	load test
`(NFR pending)`	OAuth 2.1 conformance	OAuth 2.1 + RFC 9728 PRM	Conformance suite

Dependencies

graph LR
  subgraph upstream ["MCP Gateway depends on"]
    AUTH["🔐 AUTH<br/>token + RBAC.Check"]
    BRAIN["🧠 BRAIN<br/>mcp.invocation rows"]
    OBS["👁 OBS<br/>traces + metrics"]
    PG["🗄 PostgreSQL<br/>tool registry"]
    REDIS["⚡ Redis<br/>catalogue + tasks cache"]
  end
  MCP["🔌 MCP Gateway"]
  subgraph backends ["Per-module servers"]
    BR["cyberos.brain"]
    SK["cyberos.skill"]
    AU["cyberos.auth"]
    AI["cyberos.ai"]
    OTH["+18 more"]
  end
  subgraph clients ["MCP clients"]
    CL["Claude Desktop"]
    CC["Claude Code"]
    CUR["Cursor / Cline / Codex"]
    INT["Internal agents"]
  end
  AUTH --> MCP
  BRAIN --> MCP
  OBS --> MCP
  PG --> MCP
  REDIS --> MCP
  MCP --> BR
  MCP --> SK
  MCP --> AU
  MCP --> AI
  MCP --> OTH
  CL --> MCP
  CC --> MCP
  CUR --> MCP
  INT --> MCP
  classDef shipped fill:#f5ede6,stroke:#45210e
  classDef planned fill:#fef6e0,stroke:#9c750a
  class BRAIN,SK,REDIS,PG shipped
  class MCP,AUTH,AI,AU,BR,OTH,OBS planned
  class CL,CC,CUR,INT shipped

Compliance scope

Regulation / standard	Article / clause	MCP Gateway feature
EU AI Act	Art. 12 — Logging	One `mcp.invocation` row per tool call.
EU AI Act	Art. 14 — Human oversight	Destructive tools require human-confirm via elicitation.
EU AI Act	Art. 26 — Deployer obligations	Persona-version stamping (FR pending).
Vietnam PDPL	Art. 14 — DSAR	Per-subject mcp.invocation export.
GDPR	Art. 25 — Privacy by design	Audience-bound tokens; confused-deputy mitigation.
GDPR	Art. 32 — Security of processing	OAuth 2.1 PKCE only; mTLS to backends.
OWASP Gen AI Top-10	LLM02: Insecure output handling	Elicitation content-safety filter.
OWASP Gen AI Top-10	LLM07: Insecure plugin design	Tool annotations + RBAC gate; closed catalogue.
OWASP Gen AI Top-10	LLM08: Excessive agency	Audience-bound token + persona scope; can't escalate.
RFC 9728	OAuth 2.0 Protected Resource Metadata	PRM at `/.well-known/oauth-protected-resource`.
RFC 7636	PKCE	PKCE-only — implicit grant disabled.
SOC 2 Type II	CC6.7 — Restriction of system access	Tool registry as closed catalogue; new tool registration requires CTO approval.

Risk entries

ID	Risk	Likelihood	Impact	Owner	Mitigation
`R-MCP-001`	Spec drift — agent client expects newer features than gateway provides	Medium	Medium	CTO	Quarterly spec sync; CI gates on conformance suite; capability negotiation surfaces gaps.
`R-MCP-002`	Tool-name squatting — module registers a tool name belonging to another	Low	High	CTO	Server_id is part of registry key; cross-module names rejected at register (FR pending).
`R-MCP-003`	Confused-deputy attack — token from one audience used against another module	Medium	High	CSO	Audience claim mandatory + verified at backend; mTLS pins to right service.
`R-MCP-004`	Destructive tool annotation bypass via newly-added tool	Medium	High	CSO	Registration requires CTO + CSO approval; default annotation is destructive=true.
`R-MCP-005`	Tasks primitive leak — task_id reused across tenants	Low	High	CTO	task_id is UUIDv7 + tenant_id in lookup; cross-tenant query property-tested.
`R-MCP-006`	Elicitation injection — backend prompts user into harmful action	Medium	High	CSO	Content-safety filter; allow-list of elicitation prompts per tool; CHAT log of all elicitations.
`R-MCP-007`	Backend slow → caller times out → orphaned audit row	Medium	Low	CTO	Per-tool timeout enforced; audit row written on cancellation; reconciliation job catches orphans.
`R-MCP-008`	OAuth registration scaling — every agent client must register	Low	Low	CTO	Dynamic Client Registration (DCR) per RFC 7591; tenant admin approves before activation.
`R-MCP-009`	Tool catalogue cache staleness after backend deploy	Medium	Low	CTO	60 s TTL + register-time invalidation; deploys trigger cache reset.
`R-MCP-010`	Sampling-with-Tools amplification (LLM-driven loop)	Medium	Medium	CSO	(FR pending): per-session sampling rate limit; circuit-breaker on recursion depth.

KPIs

KPI	Formula	Source	Target
Conformance pass rate	`passed / total`	MCP conformance suite	100%
Read-tool p95 latency	histogram	OBS	≤ 500 ms
Write-tool p95 latency	histogram	OBS	≤ 1 s
Gateway overhead p95	histogram	OBS	≤ 5 ms
Audit completeness	`rows_in_brain / invocations`	chaos test	100%
Destructive-without-confirm	count / 28 d	mcp.invocation	= 0
Tool-name violations	count	registry validator	= 0
Active tasks	count of `status=running`	tasks DB	tracked; alert on backlog > 100
Backend health	healthy / registered	`/admin/servers`	≥ 95% always

RACI matrix

Activity	CEO	CTO	CSO	Module owners	CDO
Gateway design + implementation	A	R	C	I	I
Per-module backend implementation	I	C	C	A/R	I
Tool annotations review	I	C	A/R	R	I
Spec conformance	I	A/R	C	I	I
OAuth + audience design	I	C	A/R	I	I
Audit pipeline maintenance	I	R	C	I	A
Tool catalog curation	A	C	C	R	C

Planned CLI surface

The operator CLI is cyberos-mcp; mcp-inspector is used for ad-hoc inspection during development.

1. List registered backends

$ cyberos-mcp servers list

SERVER_ID            STATUS    VERSION   TOOLS    LAST_SEEN
cyberos.brain        active    2.0.0     12       2026-05-14T07:21Z
cyberos.skill        active    1.4.0     8        2026-05-14T07:21Z
cyberos.auth         active    0.9.0     7        2026-05-14T07:20Z
cyberos.ai           active    0.8.0     5        2026-05-14T07:21Z
cyberos.crm          draining  0.3.0     11       2026-05-14T07:15Z

2. Inspect a tool

$ cyberos-mcp tool inspect cyberos.brain.delete_memory

name:           cyberos.brain.delete_memory
server:         cyberos.brain
description:    "Delete a memory file (tombstone by default; purge requires DSAR reason)"
scope_required: brain.delete
annotations:
  destructive:     true
  readOnly:        false
  idempotent:      false
  requiresConfirm: true
  openWorld:       false
input_schema:   {path: string, mode: "tombstone"|"purge", reason: string?}

3. Replay an invocation (dry-run, read-only)

$ cyberos-mcp invocations replay --id inv_01HZJ…XK --dry-run

[replay] invocation inv_01HZJ…XK
  tool:       cyberos.brain.search_memory
  agent:      Claude Desktop (claude-desktop-v1.0.45)
  args:       {query: "Singapore HoldCo"}
  original_outcome: ok (12 hits)
  replay_outcome:   ok (12 hits)  — identical
[replay] read-only tool only · audit row NOT written

4. Cancel a stuck task

$ cyberos-mcp tasks cancel --id tsk_01HZJ…XK --reason "client disconnected"

[cancel]  tsk_01HZJ…XK · status: running → cancelled
[backend] cyberos.kb signalled to abort
[audit]   mcp.invocation updated · outcome=cancelled

5. Conformance test

$ cyberos-mcp conformance run

[conformance] MCP 2025-11-25 spec suite · 142 tests
  initialize             ✓ (12 / 12)
  tools/list             ✓ (18 / 18)
  tools/call             ✓ (24 / 24)
  resources/*            ✓ (16 / 16)
  prompts/*              ✓ (14 / 14)
  completion/complete    ✓ (8 / 8)
  tasks/*                ✓ (12 / 12)
  elicitation/*          ✓ (10 / 10)
  sampling/*             ✓ (6 / 6)
  oauth (pkce)           ✓ (14 / 14)
  prm                    ✓ (8 / 8)
[result]   142 / 142 PASSED  · 0 failed · 0 skipped

6. Register a new backend (CTO + CSO approval gate)

$ cyberos-mcp servers register \
    --id cyberos.crm \
    --endpoint grpc://crm.internal:8081 \
    --version 0.3.0 \
    --manifest crm-tools.json \
    --approval-jira CYB-1234

[validate]  manifest: 11 tools · SEP-986 ✓ · annotations ✓
[approve]   CTO ✓  CSO ✓ (per CYB-1234)
[register]  cyberos.crm registered
[catalog]   cache invalidated
[audit]     brain seq=14856

Phase status & estimates

Status: Planned (P0 · design phase · M+2)
Est. LoC (Rust): ~5,500 (services/mcp-gateway)
Planned tests: 142 conformance + 60 unit (MCP spec suite)
P0 tools (est.): ~80 (BRAIN · Skill · AUTH · AI)
P3 tools (est.): ~300+ (22 modules × ~15 verbs)
CLI commands: ~18 planned (cyberos-mcp)

Capability	Status
Streamable HTTP transport (RFC 9112)	planned · P0
OAuth 2.1 + PKCE edge handshake	planned · P0
Well-known discovery + PRM	planned · P0
Tool registry (SEP-986 enforced)	planned · P0
tools/list federation	planned · P0
tools/call gating (annotations + RBAC)	planned · P0
Tasks primitive	planned · P0
Elicitation back-channel	planned · P0
Idempotency-Key replay safety	planned · P0
Per-tool rate limit + circuit breaker	planned · P0
mcp.invocation audit row	planned · P0
Conformance suite (142 tests)	planned · P0
Sampling-with-Tools rate limiting	planned · P1
Dynamic Client Registration (RFC 7591)	planned · P1
Multi-region active-active	planned · P3+

References

PRD §8.4 — MCP Gateway architecture, PRM flow, tasks primitive.
PRD §9.8 — MCP Gateway functional requirements, PRD-tier (FR IDs pending).
SRS §4.8 — Formal FR catalogue with verification methods (FR IDs pending).
MCP Specification 2025-11-25 — modelcontextprotocol.io/specification/2025-11-25.
SEP-986 — verbNoun.dotted tool naming convention.
RFC 6749 (OAuth 2.0) / draft-ietf-oauth-v2-1 (OAuth 2.1).
RFC 7591 — Dynamic Client Registration.
RFC 7636 — Proof Key for Code Exchange (PKCE).
RFC 9112 — HTTP/1.1 (chunked transfer).
RFC 9728 — OAuth 2.0 Protected Resource Metadata.
OWASP Top 10 for LLM Applications (v1.1 numbering, as used in the compliance table) — LLM02, LLM07, LLM08 mitigations.
Architecture context: infrastructure.html#mcp.