🎯

CUO

P0 · Phase 1 shipped · Phases 2–4 planned · Owner: Stephen Cheng (CEO)

The Chief Universal Officer — natural language in, skill chain out, memory record on file.

CUO is the agentic orchestrator above the BRAIN and the Skill catalog. The user sees one persona (Genie, the mascot); inside the router, ten C-level sub-skills load on demand via the open Agent Skills format. CUO parses a query, scores every skill in the catalog, dispatches the winner through the Skill module, and records the routing decision in the BRAIN audit chain so the orchestrator's behaviour is replayable and reviewable. Phase 1 ships a deterministic rule-based router (200 lines · 30 pytest tests · 15 fixtures). Phase 2 adds an LLM-driven router behind a confidence cascade; Phase 3 enables multi-skill chains via topological dependency walking; Phase 4 splits the keyword bank by C-level persona. The BCP-14 protocol is at cuo/docs/AGENTS.md; the implementation is at cuo/cuo/core/.

CUO is one persona — Genie — backed by ten C-level specialists that load on demand. It is deliberately minimal: parse the query, score candidates, pick one, invoke through the Skill host, write the decision row to BRAIN. The router is rule-based today (deterministic, sub-millisecond) and will layer in an LLM cascade at Phase 2 (the trade is latency for ambiguous queries). Every decision is itself an audit-chained memory, so every routing choice CUO ever made is replayable from disk.

Status: Phase 1 shipped · Phases 2–4 planned
LoC (core): ~800 (Python · 6 modules)
Tests: 15 pytest + 15 routing fixtures (deterministic golden tests)
CLI subcommands: 3 (catalog · route · skills)
Personas: 10 + 4 emerging (CEO · COO · CFO · CMO · …)
Confidence threshold: ≥ 0.30 (below → defer-to-human)
Depends on: BRAIN · SKILL · AI · MCP · AUTH (consumer of all five)
Used by: every user-facing module (CHAT · EMAIL · everything UX)
1

Why CUO exists

Internal operations are full of obvious requests that should resolve in one round trip: "validate this MST", "generate a VAT invoice for ACME", "draft a 1:1 prep doc for Thursday with Hanh". Without an orchestrator, every such request goes through a 200 ms LLM call, every catalog lookup costs another 200 ms, and every audit fact has to be re-implemented in the calling skill. CUO is the single layer that absorbs that pattern: a rule-based fast path for unambiguous queries, an LLM cascade for the ambiguous tail, and a BRAIN-anchored audit row for every decision.

Latency budget

The design assumes roughly 90% of queries are obvious. Rule-based decisions for those resolve in < 1 ms, so only the ambiguous remainder (about 10%) needs a Phase-2 LLM call.

📜
Replay & audit

Same query + same catalog → same decision (Phase 1). LLM Phase 2 logs full prompt + model + temperature so the audit row is still replay-sufficient.

🧭
Defer-to-human

Below the confidence threshold, CUO refuses to invoke. It surfaces the top three candidates and asks the user to choose — a hard EU AI Act Art. 26 oversight guarantee.

CUO is also the entry-point that lets the company hire one persona — Genie — and gradually grow ten specialist skills behind it without changing the user-facing interface. The same Slack thread / chat box that asked Genie to "validate MST 0123456789" yesterday can ask it to "draft the Q3 OKR cascade" tomorrow, and the right C-level skill will load.

2

What it does — 5W1H2C5M

Axis | Question | Answer
--- | --- | ---
5W · What | What is CUO? | An agentic orchestrator. Parses NL → scores catalog skills → invokes top match → records decision. The orchestrator state is per-request; CUO itself is stateless between requests.
5W · Who | Who interacts? | Users: every CyberSkill member via the Genie chat box. Agents: external Claude / Codex / Cursor sessions that hand off to CUO when they hit a CyberOS surface. Owner: CEO seat today; CXO seat at P3+.
5W · When | When is it invoked? | On every NL request to the platform that is not a direct CRUD on a known entity. Phase 1 is request-scoped; Phase 3 introduces multi-step chains that re-enter routing for each step.
5W · Where | Where does it run? | Co-located with the user (Tauri / CLI / IDE). LLM cascade calls the AI Gateway (LiteLLM). All decisions land in the same BRAIN as the user's other memories.
5W · Why | Why this design? | Because the alternative is asking the user to know which skill to call, or shipping an enormous monolithic prompt that re-derives the catalog every turn. CUO is the cheapest layer that gives Genie a real, replayable decision policy.
1H · How | How does it route? | Score per candidate = (5.0 if skill_name in query else 0) + 3.0 × keyword_hits + (2.0 if VN-diacritic AND skill.region=VN else 0). Top scorer wins if score ≥ 3.0 (confidence ≥ 0.30); otherwise emit routed:false.
2C · Cost | Cost per decision? | Phase 1 ≈ 0.4 ms per route (in-process). Phase 2 LLM cascade: only when the top candidate's confidence is between 0.10 and 0.50 inclusive; typical 150 ms. Memory: 1 row per decision in BRAIN, ~700 bytes.
2C · Constraints | Constraints? | (a) Phase 1 MUST be deterministic: same query + same catalog → same decision. (b) Below threshold MUST defer-to-human; no auto-invoke. (c) Skill capabilities MUST be respected via Skill host's broker; CUO cannot bypass.
5M · Materials | What does it use? | The Skill catalog (read from disk in sorted-path order), a per-skill keyword bank in cuo/core/router.py, optional BRAIN context for Phase 2 LLM cascade, and the AI Gateway for inference.
5M · Methods | Method choices? | Rule-based scoring (Phase 1), LLM cascade (Phase 2 LangGraph + LiteLLM), topological chain walking (Phase 3), per-persona keyword bank (Phase 4). Each is layered on top of the previous one; none replaces it.
5M · Machines | Where does it run? | Locally with the user (Tauri host) for the rule-based path; AI Gateway for LLM calls. Postgres checkpointer for LangGraph state (Phase 2+), required for EU AI Act Art. 12 logging.
5M · Manpower | Who maintains? | 1 IC owner today. By P1 exit the CPO and CTO co-own the keyword bank + persona definitions. At P3 the dedicated CXO seat appears.
5M · Measurement | How measured? | Routing confidence distribution, escalation rate, decision latency, defer-to-human rate. KPIs in §14.
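The scoring rule in the "How" row can be sketched in a few lines of Python. This is a minimal illustration, not the shipped router.py: the `Skill` dataclass, the `fold` diacritic-stripping helper, the name-based tie-break, and the keyword bank in the demo are all assumptions made so the example runs. The saturation at 10.0 follows the Flow 1 arithmetic on this page (11.0 → 10.0 → confidence 1.0).

```python
import unicodedata
from dataclasses import dataclass
from typing import List, Optional, Tuple

NAME_BONUS = 5.0      # skill name appears in the query
KEYWORD_WEIGHT = 3.0  # added once per keyword hit
REGION_BONUS = 2.0    # VN-diacritic query scored against a region=VN skill
SATURATION = 10.0     # raw scores are clipped here before normalising
THRESHOLD = 0.30      # Phase 1 confidence threshold


@dataclass(frozen=True)
class Skill:  # hypothetical stand-in for a catalog entry
    name: str
    region: Optional[str]
    keywords: Tuple[str, ...]


def fold(text: str) -> str:
    """Lower-case and strip diacritics (assumed matching rule, not confirmed)."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.replace("đ", "d")


def has_diacritics(text: str) -> bool:
    """True when folding changes the text, i.e. the query carried diacritics."""
    return fold(text) != unicodedata.normalize("NFC", text).lower()


def score(query: str, skill: Skill) -> float:
    folded = fold(query)
    total = NAME_BONUS if fold(skill.name) in folded else 0.0
    total += KEYWORD_WEIGHT * sum(1 for kw in skill.keywords if fold(kw) in folded)
    if skill.region == "VN" and has_diacritics(query):
        total += REGION_BONUS
    return total


def confidence(raw: float) -> float:
    return min(raw, SATURATION) / SATURATION


def route(query: str, catalog: List[Skill]) -> dict:
    # Tie-break on name so equal scores always rank the same way (assumption).
    ranked = sorted(catalog, key=lambda s: (-score(query, s), s.name))
    if not ranked:
        return {"routed": False, "skill": None, "confidence": 0.0}
    conf = confidence(score(query, ranked[0]))
    routed = conf >= THRESHOLD
    return {"routed": routed, "skill": ranked[0].name if routed else None, "confidence": conf}


# Hypothetical keyword bank chosen to mirror the Flow 1 arithmetic.
CATALOG = [
    Skill("vn-vat-invoice", "VN", ("hoa don", "mst", "so tien")),
    Skill("vn-mst-validate", "VN", ("mst",)),
]
QUERY = "tạo hoá đơn cho ACME, MST 0123456789, số tiền 10 triệu"
decision = route(QUERY, CATALOG)
```

With this keyword bank the invoice skill scores 3 × 3.0 + 2.0 = 11.0 and saturates to confidence 1.0, while the validator scores 5.0 (confidence 0.5), matching the numbers in Flow 1.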
3

Architecture

Six modules in cuo/cuo/core/ form the entire surface: five functional modules plus the package __init__. Catalog discovers skills off disk. Router scores. Invoker delegates to the Skill module. Trace renders the structured row. Memory-bridge writes the decision to BRAIN.

graph TB
  subgraph CLIENTS ["Clients"]
    USER["User · Tauri / CLI"]
    AGENT["External agent<br/>(Claude / Codex / Cursor)"]
  end
  subgraph CUO ["CUO router (cuo/core/)"]
    PARSE["parse<br/>NFC-normalise query"]
    CATALOG["catalog.py<br/>read Skill manifests"]
    ROUTER["router.py<br/>score & pick"]
    EXTRACT["arg extractors<br/>(per-skill, pure)"]
    INVOKER["invoker.py<br/>shell out to Skill"]
    TRACE["trace.py<br/>structured row"]
    BRIDGE["memory_bridge.py<br/>write to BRAIN"]
  end
  subgraph DOWNSTREAM ["Downstream"]
    SKILL["🛠 Skill host<br/>(Rust cyberos-skill-cli)"]
    AI["⚡ AI Gateway<br/>(Phase 2 LLM cascade)"]
    BRAIN["🧠 BRAIN<br/>(audit chain)"]
  end
  USER --> PARSE
  AGENT --> PARSE
  PARSE --> CATALOG
  CATALOG --> ROUTER
  ROUTER --> EXTRACT
  ROUTER -. "ambiguous (≤ 0.50)" .-> AI
  AI -. "ranked pick" .-> ROUTER
  ROUTER --> INVOKER
  INVOKER --> SKILL
  SKILL -. "stdout / stderr / exit" .-> INVOKER
  INVOKER --> TRACE
  TRACE --> BRIDGE
  BRIDGE --> BRAIN
  classDef shipped fill:#f5ede6,stroke:#45210e
  classDef planned fill:#fef6e0,stroke:#9c750a
  class PARSE,CATALOG,ROUTER,EXTRACT,INVOKER,TRACE,BRIDGE,SKILL,BRAIN shipped
  class AI planned

Internal components

Component | File | Responsibility
--- | --- | ---
catalog.py | core/catalog.py | Discover Skill manifests off disk in sorted-path order. Cached per-request.
router.py | core/router.py | Phase 1 rule-based scorer. Per-skill _KEYWORD_BANK + ARG_EXTRACTORS. Returns (decision, alternatives).
invoker.py | core/invoker.py | Shell out to cyberos-skill-cli run <skill>. Capture stdout / stderr / exit-code. No in-process skill execution.
trace.py | core/trace.py | Render structured trace row: query, decision, alternatives, result, timestamps.
memory_bridge.py | core/memory_bridge.py | Write trace row to BRAIN. Phase 1: flat file under meta/cuo-decisions/<ts_ns>.md. Phase 2: through canonical Writer.
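Sorted-path discovery is what makes the catalog, and therefore the routing decision, reproducible. The sketch below shows one way to scan manifests in a stable order and derive a sha256 catalog fingerprint. The manifest file name `SKILL.md`, the `scan` / `fingerprint` function names, and the exact bytes that go into the hash are assumptions for illustration; the page only states that the fingerprint is a sha256 of the catalog snapshot.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List

MANIFEST_NAME = "SKILL.md"  # assumption: one manifest file per skill directory


def scan(root: Path) -> List[Path]:
    """Return manifest paths sorted by relative path, independent of OS listing order."""
    return sorted(root.rglob(MANIFEST_NAME), key=lambda p: p.relative_to(root).as_posix())


def fingerprint(root: Path) -> str:
    """sha256 over (relative path, file bytes) pairs, fed in scan order."""
    digest = hashlib.sha256()
    for path in scan(root):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


# Demo on a throwaway directory; skills are created out of order on purpose.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("vn-vat-invoice", "vn-mst-validate"):
        (root / name).mkdir()
        (root / name / MANIFEST_NAME).write_text(f"name: {name}\n", encoding="utf-8")
    order = [p.parent.name for p in scan(root)]
    first = fingerprint(root)
    second = fingerprint(root)
    (root / "vn-mst-validate" / MANIFEST_NAME).write_text("name: changed\n", encoding="utf-8")
    third = fingerprint(root)
```

Two scans of an unchanged tree give the same fingerprint, and any manifest edit changes it, which is the property a replay tool needs in order to tell whether a recorded decision was made against the same catalog.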

Phase roadmap

Phase | What changes | Why | Status
--- | --- | --- | ---
Phase 1 | Rule-based scorer · pure-function extractors · flat-file BRAIN bridge | Determinism, sub-ms latency, no external LLM dep | shipped
Phase 2 | LangGraph supervisor + LiteLLM router · Postgres checkpointer · escalate when 0.10 ≤ score ≤ 0.50 | Handle ambiguous tail · EU AI Act Art. 12 logging via Postgres | planned
Phase 3 | Multi-skill chains via depends_on · composite audit row + sub-rows | Compound workflows (MST validate → VAT invoice) | planned
Phase 4 | Per-C-level keyword bank · persona-router-first → intra-persona routing | Catalog scaling beyond ~50 skills | planned
4

Data model

CUO owns three entities. A SkillEntry is the projection of a Skill manifest into the router's catalog. A RoutingDecision is the result of scoring. An InvocationResult captures what the Skill host actually did.

erDiagram
  CATALOG ||--o{ SKILL_ENTRY : "contains"
  REQUEST ||--|| ROUTING_DECISION : "produces"
  ROUTING_DECISION ||--o{ ALTERNATIVE_CANDIDATE : "ranks"
  ROUTING_DECISION ||--o| INVOCATION_RESULT : "invokes (optional)"
  INVOCATION_RESULT ||--|| TRACE_ROW : "emits"
  TRACE_ROW ||--|| BRAIN_AUDIT_ROW : "persisted as"
  ROUTING_DECISION ||--o{ CHAIN_STEP : "Phase 3: chained calls"
  CATALOG {
    string fingerprint PK "sha256 of catalog snapshot"
    int64 scanned_at_ns
    int skill_count
  }
  SKILL_ENTRY {
    string name PK
    string version
    string description
    string region "VN | global"
    string[] keywords
    string[] depends_on "Phase 3"
    string[] allowed_tools
  }
  REQUEST {
    string request_id PK
    string query
    string actor
    int64 ts_ns
    string persona "CEO|COO|...|null"
  }
  ROUTING_DECISION {
    string request_id FK
    string skill_name "null if routed=false"
    obj arguments
    float confidence "0.0–1.0"
    string rationale
    bool routed
    string router_phase "phase1 | phase2"
  }
  ALTERNATIVE_CANDIDATE {
    string request_id FK
    string skill_name
    float score
    int rank
  }
  INVOCATION_RESULT {
    string request_id FK
    int exit_code
    string stdout
    string stderr
    int64 started_at_ns
    int64 ended_at_ns
  }
  CHAIN_STEP {
    string request_id FK
    int step_index
    string skill_name
    obj arguments
    string status "ok | failed | skipped"
  }
  TRACE_ROW {
    string trace_id PK
    obj decision
    obj result
    obj chain "Phase 3: list of CHAIN_STEP"
  }
  BRAIN_AUDIT_ROW {
    int64 seq PK
    string path "meta/cuo-decisions/.md"
    string body_hash
    string chain
  }
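The InvocationResult entity maps naturally onto an out-of-process call. The sketch below captures the five result fields from the diagram using `subprocess.run`. The `invoke` function name and the `timeout_s` parameter are assumptions, and the demo runs the Python interpreter as a stand-in command so the example works without the Skill host installed.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InvocationResult:
    # Field names follow the INVOCATION_RESULT entity (request_id omitted for brevity).
    exit_code: int
    stdout: str
    stderr: str
    started_at_ns: int
    ended_at_ns: int


def invoke(argv: List[str], timeout_s: float = 30.0) -> InvocationResult:
    """Run a command out of process and capture stdout, stderr, and the exit code."""
    started = time.time_ns()
    proc = subprocess.run(
        argv,
        capture_output=True,  # collect both streams instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout_s,    # hypothetical guard; raises TimeoutExpired on overrun
        check=False,          # a non-zero exit is data for the trace, not an exception
    )
    return InvocationResult(proc.returncode, proc.stdout, proc.stderr, started, time.time_ns())


# Stand-in command; a real call would start with "cyberos-skill-cli", "run", "<skill>".
ok = invoke([sys.executable, "-c", "print('ok')"])
failed = invoke([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Keeping `check=False` means a failing skill still yields a complete result object, so the Failed → Recorded path can write the error to the trace.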
5

API surface

GraphQL subgraph (planned · P0+)

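extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key"])

# Custom scalars used below; they must be declared for the schema to validate.
scalar JSON
scalar DateTime

# Assumed placeholder: one value per C-level persona listed on this page.
enum Persona { CEO COO CFO CMO CTO CHRO CSO CLO CDO CPO }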

type RoutingDecision @key(fields: "requestId") {
  requestId: ID!
  query: String!
  actor: String!
  skillName: String                # null when routed=false
  arguments: JSON
  confidence: Float!
  rationale: String!
  alternatives: [Candidate!]!
  routed: Boolean!
  routerPhase: RouterPhase!
  invokedAt: DateTime
  result: InvocationResult         # null if --invoke not requested
}

type Candidate {
  skillName: String!
  score: Float!
  rank: Int!
}

type InvocationResult {
  exitCode: Int!
  stdout: String!
  stderr: String!
  startedAt: DateTime!
  endedAt: DateTime!
}

enum RouterPhase { phase1_rule  phase2_llm  phase3_chain  phase4_persona }

type Query {
  route(query: String!, persona: Persona): RoutingPreview!
  decision(requestId: ID!): RoutingDecision
  decisions(actor: String, since: DateTime, limit: Int = 50): [RoutingDecision!]!
}

type Mutation {
  routeAndInvoke(query: String!, record: Boolean = true): RoutingDecision!
  invokeSkill(skillName: String!, arguments: JSON!): InvocationResult!
}

type RoutingPreview {
  decision: RoutingDecision!
  catalogFingerprint: String!      # for replay
}

MCP tool catalogue

Tool name | Inputs | Outputs | Annotations
--- | --- | --- | ---
cuo.route | query, persona? | RoutingDecision | readonly · pure · scope=route
cuo.route_and_invoke | query, record=true | RoutingDecision + result | destructive=true · scope=invoke
cuo.catalog | (none) | SkillEntry[] | readonly · cached · scope=read
cuo.explain | requestId | rationale + alternatives | readonly · scope=audit

CLI — cyberos-cuo

Subcommand | Purpose | Example
--- | --- | ---
cyberos-cuo catalog | List skills CUO can route to | cyberos-cuo catalog --format json
cyberos-cuo route | Score a query, optionally invoke + record | cyberos-cuo route "validate MST 0123456789" --invoke --record
cyberos-cuo skills | Show keyword bank + extractors per skill | cyberos-cuo skills vn-mst-validate
6

Key flows

Flow 1 — Single-skill route decision (Phase 1)

sequenceDiagram
  autonumber
  participant U as User
  participant C as cuo.route()
  participant CAT as catalog.scan()
  participant R as router.score()
  participant E as extractor
  participant B as memory_bridge
  U->>C: "tạo hoá đơn cho ACME, MST 0123456789, số tiền 10 triệu"
  C->>C: NFC normalise · preserve diacritics
  C->>CAT: load catalog (sorted-path)
  CAT-->>C: [vn-mst-validate, vn-vat-invoice, …]
  C->>R: score each candidate
  R->>R: vn-vat-invoice: 3 keyword hits ×3.0 + VN region ×2.0 = 11.0 → saturate 10.0 → conf 1.0
  R->>R: vn-mst-validate: 1 keyword hit ×3.0 + VN region ×2.0 = 5.0 → conf 0.5
  R-->>C: decision={skill:vn-vat-invoice, conf:1.0, alts:[vn-mst-validate@0.5]}
  C->>E: extract args (vn-vat-invoice extractor)
  E-->>C: {mst:"0123456789", amount_vnd:10_000_000}
  alt --invoke
    C->>SkillHost: cyberos-skill-cli run vn-vat-invoice --args ...
    SkillHost-->>C: {exit:0, stdout:"<Invoice xmlns=...>"}
  end
  alt --record
    C->>B: write trace to BRAIN
    B-->>C: {seq:14935, chain:"e3f7..."}
  end
  C-->>U: {routed:true, decision, result?, recorded_at?}
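The extractor step in Flow 1 is a pure function from the query to an argument dict. The sketch below is one possible shape for it. The amount pattern reuses the units listed in the `cyberos-cuo skills vn-vat-invoice` output on this page; the MST pattern (10 digits with an optional 3-digit branch suffix), the `extract_invoice_args` name, and the choice to treat `.` and `,` as thousands separators are assumptions.

```python
import re
import unicodedata

# Unit → multiplier in VND. Keys are NFC-normalised so they compare equal to
# substrings of an NFC-normalised query.
UNITS = {
    unicodedata.normalize("NFC", unit): factor
    for unit, factor in {
        "triệu": 1_000_000,
        "trieu": 1_000_000,
        "million": 1_000_000,
        "k": 1_000,
        "vnd": 1,
        "đồng": 1,
    }.items()
}
AMOUNT_RE = re.compile(
    r"(\d[\d.,]*)\s*(" + "|".join(re.escape(u) for u in UNITS) + ")",
    re.IGNORECASE,
)
MST_RE = re.compile(r"\b\d{10}(?:-\d{3})?\b")  # assumed MST shape


def extract_invoice_args(query: str) -> dict:
    """Pull an MST and a VND amount out of a free-text query, if present."""
    text = unicodedata.normalize("NFC", query)
    args = {}
    mst = MST_RE.search(text)
    if mst:
        args["mst"] = mst.group(0)
    amount = AMOUNT_RE.search(text)
    if amount:
        # Simplification: separators are dropped, so "1.500" means 1500.
        digits = re.sub(r"[.,]", "", amount.group(1))
        args["amount_vnd"] = int(digits) * UNITS[amount.group(2).lower()]
    return args


flow1_args = extract_invoice_args("tạo hoá đơn cho ACME, MST 0123456789, số tiền 10 triệu")
branch_args = extract_invoice_args("kiểm tra MST 0123456789-001")
```

On the Flow 1 query this yields the same argument dict the diagram shows; the 10-digit MST is not mistaken for an amount because no unit follows it.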

Flow 2 — Multi-step chain (Phase 3 preview)

sequenceDiagram
  autonumber
  participant U as User
  participant C as cuo.route()
  participant R as router
  participant TOPO as chain_planner
  participant H as Skill host
  U->>C: "issue VAT invoice for ACME"
  C->>R: route(query)
  R-->>C: pick vn-vat-invoice (chain_root)
  C->>TOPO: walk depends_on
  TOPO-->>C: [vn-mst-validate (buyer), vn-mst-validate (seller), vn-vat-invoice]
  loop for each step
    C->>H: invoke step
    H-->>C: result
    alt step failed
      C-->>U: chain aborted at step N · reason · partial trace
    end
  end
  C->>BRAIN: composite audit row (root) + N sub-rows
  C-->>U: {chain_status:ok, steps:[...], total_elapsed_ms:N}
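The "walk depends_on" step amounts to a topological ordering of the chain root's dependencies. The sketch below uses a depth-first walk with cycle detection. The `plan_chain` name and the dict-of-lists input are assumptions, and the sketch orders each skill once; it does not model the per-role repetition (buyer and seller validation) shown in the diagram.

```python
from typing import Dict, List


def plan_chain(root: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return the skills needed for `root`, dependencies first, root last."""
    order: List[str] = []
    state: Dict[str, str] = {}  # skill -> "visiting" | "done"

    def visit(skill: str) -> None:
        if state.get(skill) == "done":
            return
        if state.get(skill) == "visiting":
            raise ValueError(f"dependency cycle at {skill!r}")
        state[skill] = "visiting"
        # Sorted so that the plan is identical on every run.
        for dep in sorted(depends_on.get(skill, [])):
            visit(dep)
        state[skill] = "done"
        order.append(skill)

    visit(root)
    return order


DEPENDS_ON = {"vn-vat-invoice": ["vn-mst-validate"], "vn-mst-validate": []}
plan = plan_chain("vn-vat-invoice", DEPENDS_ON)

try:
    plan_chain("a", {"a": ["b"], "b": ["a"]})
    cycle_detected = False
except ValueError:
    cycle_detected = True
```

Rejecting cycles at planning time means a malformed manifest fails before any step is invoked, rather than partway through a chain.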

Flow 3 — Confidence cascade (Phase 1 → Phase 2)

flowchart TB
  Q[User query] --> P1[Phase 1 rule-based scorer]
  P1 --> SCORE{Top score?}
  SCORE -- "≥ 0.70 (auto)" --> INV[Auto-invoke top skill]
  SCORE -- "0.50–0.70 (ask)" --> CLAR[Ask clarification · surface top 3]
  SCORE -- "0.10–0.50 (escalate)" --> P2[Phase 2 LLM cascade]
  SCORE -- "< 0.10 (defer)" --> DEFER[Defer to human · no candidate]
  P2 --> P2SCORE{LLM confidence?}
  P2SCORE -- "≥ 0.70" --> INV
  P2SCORE -- "< 0.70" --> CLAR
  INV --> REC[Record decision + result in BRAIN]
  CLAR --> REC
  DEFER --> REC
  classDef shipped fill:#f5ede6,stroke:#45210e
  classDef planned fill:#fef6e0,stroke:#9c750a
  class P1,SCORE,INV,CLAR,DEFER,REC shipped
  class P2,P2SCORE planned

Phase 1 ships the ≥ 0.30 threshold; the four-tier cascade above lands at Phase 2 once the LLM router is online.
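The tiers in Flow 3 reduce to a small pure function. The sketch below is illustrative only: the `Action` enum and function names are assumptions, and the boundary at exactly 0.50 is resolved in favour of escalation because the decision lifecycle on this page states the escalation band as 0.10 ≤ score ≤ 0.50.

```python
from enum import Enum


class Action(Enum):
    AUTO_INVOKE = "auto"
    ASK = "ask"            # ask for clarification, surface top 3
    ESCALATE = "escalate"  # hand off to the Phase 2 LLM cascade
    DEFER = "defer"        # defer to human, no candidate


def phase1_routed(conf: float) -> bool:
    """Phase 1 rule: a single threshold at 0.30."""
    return conf >= 0.30


def cascade(conf: float) -> Action:
    """Phase 2 four-tier cascade applied to the rule-based confidence."""
    if conf >= 0.70:
        return Action.AUTO_INVOKE
    if conf > 0.50:
        return Action.ASK
    if conf >= 0.10:
        return Action.ESCALATE
    return Action.DEFER


def after_llm(llm_conf: float) -> Action:
    """Second stage: the LLM either clears the auto-invoke bar or CUO asks."""
    return Action.AUTO_INVOKE if llm_conf >= 0.70 else Action.ASK
```

Writing the tiers as ordered comparisons keeps every confidence value in exactly one bucket, so the cascade stays deterministic even at the boundaries.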

Flow 4 — Persona switch (Phase 4 preview)

sequenceDiagram
  autonumber
  participant U as User
  participant C as cuo (Phase 4)
  participant PR as persona_router
  participant CEO as CEO sub-router
  participant CTO as CTO sub-router
  U->>C: "draft Q4 OKRs for the eng team"
  C->>PR: classify_persona(query)
  PR-->>C: persona=CEO (strategic intent)
  C->>CEO: route within CEO skill subset
  CEO-->>C: pick okr-cascade-draft (conf 0.83)
  C->>BRAIN: audit row with persona-version stamp
  alt followup: "what's the test coverage on the auth module?"
    U->>C: new query, same session
    C->>PR: classify_persona
    PR-->>C: persona=CTO (technical intent)
    C->>CTO: route within CTO skill subset
    CTO-->>C: pick test-coverage-report (conf 0.91)
  end
7

Decision lifecycle

stateDiagram-v2
  [*] --> Received: NL query arrives
  Received --> Routing: catalog loaded · scoring in progress
  Routing --> Picked: top candidate ≥ threshold
  Routing --> Deferred: no candidate ≥ threshold
  Routing --> Escalated: 0.10 ≤ score ≤ 0.50 (Phase 2)
  Escalated --> Picked: LLM picks
  Escalated --> Deferred: LLM also abstains
  Picked --> Invoking: --invoke true
  Picked --> Recorded: --invoke false · decision-only
  Invoking --> Succeeded: exit_code == 0
  Invoking --> Failed: exit_code != 0
  Succeeded --> Recorded: trace + result → BRAIN
  Failed --> Recorded: trace + error → BRAIN
  Deferred --> Recorded: routed:false row → BRAIN
  Recorded --> [*]
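The lifecycle above can be checked mechanically with a transition table. The sketch below encodes the diagram's edges as a dict and validates a sequence of states. The `TRANSITIONS` table and `is_valid_path` helper are illustrative names, not part of the CUO codebase.

```python
from typing import Dict, FrozenSet, Sequence

# One entry per state; values are the states reachable in a single step.
TRANSITIONS: Dict[str, FrozenSet[str]] = {
    "Received": frozenset({"Routing"}),
    "Routing": frozenset({"Picked", "Deferred", "Escalated"}),
    "Escalated": frozenset({"Picked", "Deferred"}),
    "Picked": frozenset({"Invoking", "Recorded"}),
    "Invoking": frozenset({"Succeeded", "Failed"}),
    "Succeeded": frozenset({"Recorded"}),
    "Failed": frozenset({"Recorded"}),
    "Deferred": frozenset({"Recorded"}),
    "Recorded": frozenset(),  # terminal
}


def is_valid_path(path: Sequence[str]) -> bool:
    """True when the path starts at Received, ends at Recorded, and uses only known edges."""
    if not path or path[0] != "Received" or path[-1] != "Recorded":
        return False
    return all(nxt in TRANSITIONS.get(cur, frozenset()) for cur, nxt in zip(path, path[1:]))


happy = ["Received", "Routing", "Picked", "Invoking", "Succeeded", "Recorded"]
deferred = ["Received", "Routing", "Escalated", "Deferred", "Recorded"]
illegal = ["Received", "Routing", "Deferred", "Invoking", "Succeeded", "Recorded"]
```

Every path ends in Recorded, which reflects the rule that deferred and failed decisions are written to BRAIN just like successful ones.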
8

The 10 C-level personas

Each persona is a curated subset of the Skill catalog. Today the catalog lives under skill/skills/cuo/<persona>/ and is loaded uniformly; Phase 4 splits the keyword bank per-persona so the router classifies intent first, then routes intra-persona. The Auto OK column lists actions that may complete without explicit operator approval; the Defers column lists actions that always escape to the human.

🎯CEO · Vision & Strategy Stephen Cheng (Founder seat)

Strategy memos, OKR cascade reviews, board narrative, weekly state-of-business, runway / fundraising posture. Owner of vision and capital allocation.

Auto OK: draft strategy memo · summarise OKR progress · generate weekly state-of-business · prep board update
Defers to human: send memo to investors · flip Singapore HoldCo · change cap-table · terminate executive

⚙️COO · Operations to be filled · P1+

Cycle status digests, blocker triage, cross-team coordination, weekly ops review, vendor performance. Owner of "did we ship".

Auto OK: status digest from PROJ · flag overdue tasks · summarise cycle end · draft 1:1 prep
Defers to human: cancel a project · reassign owner · change vendor · override SLA

💰CFO · Finance & Runway to be filled · P2+

Cashflow position, AR/AP digests, burn alerts, payroll cycles, VAT/CIT compliance posture. Owner of "do we have runway".

Auto OK: cashflow snapshot · AR aging report · draft invoice (via INV) · flag overdue receivable
Defers to human: send invoice to client · execute wire · approve refund · change banking signatory

📣CMO · Marketing & Demand to be filled · P2+

Campaign briefs, content calendar, channel reports, brand voice consistency. Owner of "do prospects know us".

Auto OK: draft campaign brief · summarise channel performance · propose A/B test · enforce brand voice
Defers to human: publish public-facing content · spend marketing budget · make public statement

💻CTO / CIO · Technology & Information Systems co-owned: CEO + CTO seat

Tech-debt triage, security advisories digest, OBS metric review, dependency upgrades, architecture decision records. Owner of "is the platform safe and fast".

Auto OK: propose ADR · draft SRS · test-coverage report · summarise OBS digest · flag CVE
Defers to human: deploy to production · rotate KMS key · grant new capability · disable security control

👥CHRO · People & Talent to be filled · P1+

1:1 prep, performance summaries, onboarding paths, role descriptions, retention signals, growth ladders. Owner of "do we have the right people".

Auto OK: draft 1:1 agenda · summarise perf reviews · draft job description · onboarding checklist
Defers to human: make offer · terminate · adjust comp band · conduct performance conversation

🧭CSO · Strategy P3+ emerging

Competitive intel, scenario modelling, M&A scanning, partnership feasibility, strategic option papers. Owner of "what are we doing next".

Auto OK: competitive scan · scenario model · partnership feasibility memo · option paper draft
Defers to human: approach partner · commit to strategic shift · execute M&A LOI

⚖️CLO / CCO · Legal & Compliance co-owned: CSO + CLO seat at P2+

Contract redline, NDA triage, GDPR/PDPL audits, DSAR triage, vendor terms review, policy authoring. Owner of "are we compliant".

Auto OK: contract redline draft · triage NDA · DSAR intake · PDPL gap analysis · policy draft
Defers to human: sign contract · execute DPA · file regulator submission · approve cross-border transfer

📊CDO · Data P2+ emerging

Data quality, lineage, residency reviews, schema governance, retention policy, BRAIN integrity oversight. Owner of "is our data trustworthy".

Auto OK: audit BRAIN doctor · data quality report · retention policy review · lineage diagram
Defers to human: purge data · change retention period · approve cross-border export · disable encryption

🚀CPO · Product P0 co-owned with CEO seat

PRD drafts, roadmap analysis, requirements discovery, user-research synthesis, FR catalogue maintenance. Owner of "are we building the right thing". Today's most-exercised persona — the FR-author / SRS-author / requirements-discovery skills all live here.

Auto OK: draft PRD · FR catalog refresh · requirements discovery interview · user-research synthesis · roadmap update
Defers to human: ship feature · make customer-facing promise · change pricing · commit to roadmap publicly
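The Auto OK / Defers split in the persona cards above can be read as a per-persona allow-list. The sketch below encodes two of the cards and a lookup function. The `POLICY` structure and `requires_human` name are hypothetical, and the default of deferring anything not explicitly listed as Auto OK is an assumption chosen to match the rule that CUO never auto-invokes irreversible operations.

```python
from typing import Dict, FrozenSet

# Entries copied from the CEO and CFO cards; other personas would follow the same shape.
POLICY: Dict[str, Dict[str, FrozenSet[str]]] = {
    "CEO": {
        "auto_ok": frozenset({
            "draft strategy memo", "summarise OKR progress",
            "generate weekly state-of-business", "prep board update",
        }),
        "defers": frozenset({
            "send memo to investors", "flip Singapore HoldCo",
            "change cap-table", "terminate executive",
        }),
    },
    "CFO": {
        "auto_ok": frozenset({
            "cashflow snapshot", "AR aging report",
            "draft invoice (via INV)", "flag overdue receivable",
        }),
        "defers": frozenset({
            "send invoice to client", "execute wire",
            "approve refund", "change banking signatory",
        }),
    },
}


def requires_human(persona: str, action: str) -> bool:
    """Default-deny: only actions explicitly listed as Auto OK may skip approval."""
    entry = POLICY.get(persona)
    if entry is None:
        return True
    return action not in entry["auto_ok"]
```

A default-deny lookup means a new, unclassified action escapes to the human until someone deliberately adds it to a persona's Auto OK list.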

Emerging sub-personas (2026 trajectory)

Four additional C-level seats are watched but not provisioned at P0. The router treats their skills as belonging to the closest existing persona until the seat is split out.

🤖 CAIO · AI Officer

Owns the AI Gateway, model selection, prompt governance, AI risk register. Currently rolled into CTO. Split-out at P2 when GA AI Act compliance becomes operational load.

🎨 CXO · Experience Officer

Owns CUO persona consistency, end-to-end member experience, ambient nudges. Today is a CEO concern; split-out at P3 when external tenants arrive.

💼 CRO · Revenue Officer

Owns pipeline, win rates, churn signals, expansion. Today rolled into CEO + CFO. Split-out at P4 when external SaaS begins selling.

🌱 CSO-Sustainability

Owns ESG metrics, climate reporting, vendor sustainability scoring. Watched for 2026 EU reporting evolution.

9

Functional Requirements

The CyberOS FR catalogue is being rebuilt one feature at a time via the open fr-author Agent Skill.

Previous FR enumerations were archived 2026-05-14 and are no longer reflected on this page. PRD/SRS narrative remains authoritative for the spec; specific FRs land here as they are re-authored.

10

Non-Functional Requirements

NFR ID | Concern | Target | Measurement
--- | --- | --- | ---
N(FR pending) | Phase-1 routing p95 | ≤ 5 ms (catalog cached) | fixtures/golden_routing.json benchmark
N(FR pending) | Phase-2 LLM cascade p95 | ≤ 800 ms incl. network | AI Gateway latency budget
N(FR pending) | Catalog refresh p95 | ≤ 50 ms over 100 skills | catalog.scan() benchmark
N(FR pending) | Phase-1 determinism | 100% (same query+catalog → same decision) | 15 routing fixtures, golden tests
N(FR pending) | Escalation rate to LLM | ≤ 10% of queries (after warm-up) | BRAIN audit replay · weekly KPI
N(FR pending) | Defer-to-human rate | ≤ 5% of queries | BRAIN audit replay
N(FR pending) | Test coverage of router.py | ≥ 90% line · 100% branch | coverage.py
N(FR pending) | Availability (in-process) | same as caller | n/a, co-located
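A p95 latency target like the ones in the table above can be checked with a small timing harness. The sketch below is a generic measurement helper, not the project's benchmark: the `p95_ms` name, the repeat count, and the stand-in function being timed are all assumptions.

```python
import statistics
import time
from typing import Callable, Iterable, List


def p95_ms(fn: Callable[[str], object], inputs: Iterable[str], repeats: int = 200) -> float:
    """Time fn over every input `repeats` times and return the 95th percentile in ms."""
    items = list(inputs)
    samples: List[float] = []
    for _ in range(repeats):
        for item in items:
            start = time.perf_counter_ns()
            fn(item)
            samples.append((time.perf_counter_ns() - start) / 1_000_000)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(samples, n=100)[94]


# Stand-in for route(); a real benchmark would call the router with fixture queries.
measured = p95_ms(str.casefold, ["validate MST 0123456789", "kiểm tra MST 0123456789-001"])
```

Using `perf_counter_ns` avoids float rounding on very short calls, which matters when the expected latency is well under a millisecond.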
11

Dependencies

CUO is the most-connected module. It consumes BRAIN (records), Skill (invokes), AI Gateway (Phase 2), MCP Gateway (tool surface), AUTH (actor identity). It is consumed by every user-facing module.

graph LR
  subgraph upstream ["CUO depends on"]
    AUTH["🔐 AUTH"]
    AI["⚡ AI Gateway"]
    MCP["🔌 MCP Gateway"]
    BRAIN["🧠 BRAIN"]
    SKILL["🛠 SKILL"]
  end
  CUO_M["🎯 CUO"]
  subgraph downstream ["Used by user-facing modules"]
    CHAT["💬 CHAT"]
    EMAIL["✉️ EMAIL"]
    PROJ["📋 PROJ"]
    CRM["🤝 CRM"]
    HR["👥 HR"]
    KB["📚 KB"]
    OTHERS["…14 more"]
  end
  AUTH --> CUO_M
  AI --> CUO_M
  MCP --> CUO_M
  BRAIN --> CUO_M
  SKILL --> CUO_M
  CUO_M --> CHAT
  CUO_M --> EMAIL
  CUO_M --> PROJ
  CUO_M --> CRM
  CUO_M --> HR
  CUO_M --> KB
  CUO_M --> OTHERS
  classDef shipped fill:#f5ede6,stroke:#45210e
  classDef planned fill:#fef6e0,stroke:#9c750a
  class CUO_M,BRAIN,SKILL shipped
  class AUTH,AI,MCP,CHAT,EMAIL,PROJ,CRM,HR,KB,OTHERS planned
12

Compliance scope

Regulation / standard | Article / clause | CUO feature that satisfies it
--- | --- | ---
EU AI Act (Reg. 2024/1689) | Art. 12: Logging | Every decision recorded in BRAIN via memory_bridge · Postgres checkpointer at P2 retains LLM prompts.
EU AI Act | Art. 13: Transparency | End-of-response transparency: skill chosen + confidence + alternatives are surfaced to the user.
EU AI Act | Art. 14: Human oversight | Below-threshold queries defer to human; CUO never auto-invokes irreversible operations.
EU AI Act | Art. 26: Operator obligations | Defer-to-human matrix per persona (auto-OK vs defers) is normative.
EU AI Act Annex III | § 4: High-risk classification | CUO does not perform employment / credit / law-enforcement scoring; classification remains limited-risk.
ISO/IEC 42001 (AIMS) | § 8.4: AI system operations | Audit-chained decisions provide post-hoc accountability evidence.
Vietnam PDPL | Art. 14: Decision transparency | Per-decision rationale is part of the trace row; subject can request via DSAR.
13

Risk entries

ID | Risk | Likelihood | Impact | Owner | Mitigation
--- | --- | --- | --- | --- | ---
R-CUO-001 | Routing mis-classification (wrong skill picked at high confidence) | Medium | Medium | CPO | 15 golden fixtures · Phase 4 persona pre-classifier · trust-calibration KPI alarmed at p99.
R-CUO-002 | Confidence threshold drift (real-world distribution diverges from fixtures) | High | Low | CPO | Weekly KPI review: confidence histogram · escalation rate · defer rate. Threshold tunable per deployment at Phase 2.
R-CUO-003 | Persona prompt-injection (skill description tries to expand its own scope) | Medium | High | CSO | Trust model (§7): skill descriptions are UNTRUSTED. Keyword bank + catalog are protocol-defined and version-controlled.
R-CUO-004 | LLM non-determinism breaks audit replay (Phase 2+) | High | Medium | CTO | Phase 2 trace rows MUST include full prompt + model + temperature + seed; replay tools accept "best-effort" replay note.
R-CUO-005 | Persona switching whiplash (user feels Genie isn't "one" anymore at Phase 4) | Medium | Low | CXO (emerging) | Persona-version stamp on every decision · same conversational style enforced via brand-voice skill.
R-CUO-006 | Skill catalog explosion drops routing accuracy | Low (today) | Medium | CPO | Phase 4 persona-router triages first · Phase 2 LLM cascade handles the long tail.
R-CUO-007 | Capability bypass (CUO grants tools the skill didn't request) | Low | High | CSO | §6.1–§6.2: CUO MUST respect the skill's allowed-tools. Defence in depth: Skill broker enforces independently.
14

KPIs

KPI | Formula | Source | Target | Current
--- | --- | --- | --- | ---
Routing confidence distribution | histogram of conf | BRAIN audit replay | mean ≥ 0.6 · p10 ≥ 0.3 | 0.7 / 0.4 (15-fixture eval)
Escalation rate | queries with 0.10 ≤ conf ≤ 0.50 | BRAIN audit replay | ≤ 10% | n/a, Phase 2 pending
Defer rate | queries with conf < 0.10 | BRAIN audit replay | ≤ 5% | 0% (fixtures)
Decision latency p95 | route() wall clock | per-request timing | ≤ 5 ms (Phase 1) | ~ 0.4 ms
Replay equivalence rate | identical decision on second run | fixture re-evaluation | 100% (Phase 1) | 100%
Invocation success rate | exit_code 0 / total invocations | BRAIN audit replay | ≥ 95% | 100% (15 fixtures)
Trust calibration error | \|confidence − actual_correct_rate\| | weekly human review | ≤ 0.10 | 0.05 (fixture eval)
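Several of these KPIs are simple aggregates over replayed decisions. The sketch below computes four of them from a list of (confidence, was_correct) pairs. The `kpis` function and its input shape are assumptions, and trust calibration error is interpreted here as the gap between mean confidence and the observed correct rate; a bucketed calibration measure would be an equally valid reading of the formula.

```python
from statistics import mean
from typing import Dict, List, Tuple


def kpis(rows: List[Tuple[float, bool]]) -> Dict[str, float]:
    """Aggregate replayed decisions; each row is (confidence, was_correct)."""
    if not rows:
        raise ValueError("need at least one decision")
    confs = [conf for conf, _ in rows]
    total = len(rows)
    correct_rate = sum(1 for _, ok in rows if ok) / total
    return {
        "mean_confidence": mean(confs),
        "escalation_rate": sum(1 for c in confs if 0.10 <= c <= 0.50) / total,
        "defer_rate": sum(1 for c in confs if c < 0.10) / total,
        "calibration_error": abs(mean(confs) - correct_rate),
    }


# Made-up sample, for illustration only.
sample = [(1.0, True), (0.8, True), (0.4, False), (0.05, False)]
report = kpis(sample)
```

For the made-up sample, mean confidence is 0.5625 against a correct rate of 0.5, giving a calibration error of 0.0625, with one escalation and one deferral out of four decisions.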
15

RACI matrix

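Activity | CEO | CPO | CTO | CXO* | CSO | CLO
--- | --- | --- | --- | --- | --- | ---
Persona definition (10 C-level) | A | R | C | C | I | I
Keyword bank maintenance | I | R | A | C | I | I
Phase 2 LLM cascade design | C | C | A/R | I | C | I
Trust calibration KPI review | I | R | C | A | I | I
Defer-to-human matrix | I | C | C | I | I | A/R
EU AI Act Art. 12 compliance | I | C | C | I | C | A/R
Prompt-injection defence | I | C | R | I | A | C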

*CXO seat is emerging at P3+; the CPO carries this work today.

16

CLI usage — real examples

1. List the skills CUO can route to

$ cyberos-cuo catalog --format json | head -30
{
  "fingerprint": "9c8e2a...4b7d",
  "scanned_at": "2026-05-14T07:30:11Z",
  "skill_count": 20,
  "skills": [
    {"name": "cpo/prd-author", "version": "0.4.1", "region": null, "keywords": ["prd", "author", "draft prd", "product requirements"]},
    {"name": "cyberskill-vn/vn-mst-validate", "version": "0.2.0", "region": "VN", "keywords": ["mst", "tax code", "ma so thue"]},
    {"name": "cyberskill-vn/vn-vat-invoice",   "version": "0.3.0", "region": "VN", "keywords": ["invoice", "hoa don", "vat", "gtgt"]},
    ...
  ]
}

2. Route a query without invoking

$ cyberos-cuo route "kiểm tra MST 0123456789-001"

decision:
  skill_name: cyberskill-vn/vn-mst-validate
  arguments:  {mst: "0123456789-001"}
  confidence: 1.0
  rationale:  "VN-diacritic query + region=VN bonus + 2 keyword hits + name-substring match"
  routed:     true

alternatives:
  - cyberskill-vn/vn-tax-filing  score=0.3
  - cyberskill-vn/vn-vat-invoice score=0.2

(not invoked — pass --invoke to dispatch through the Skill host)

3. Route, invoke, and record in BRAIN

$ cyberos-cuo route "validate MST 0123456789" --invoke --record

decision:    cyberskill-vn/vn-mst-validate (conf=0.7)
invocation:  exit=0  elapsed_ms=24
stdout:      {"ok": true, "format": "10-digit"}
recorded:    brain://meta/cuo-decisions/1747200611_8c4e.md
             seq=14941  chain=a3c7...2b9f

4. Inspect a past decision (audit replay)

$ cyberos view meta/cuo-decisions/1747200611_8c4e.md
---
kind: decisions
sync_class: private
classification: internal
schema: cuo-decision-v1
---
# CUO routing decision

**Query:** validate MST 0123456789
**Catalog fingerprint:** 9c8e2a4b7d...
**Decision:** cyberskill-vn/vn-mst-validate
**Confidence:** 0.7
**Rationale:** 1 name-substring hit + 1 keyword + region tiebreaker
**Alternatives:** [...]
**Invoked at:** 2026-05-14T07:30:11Z
**Result:** exit=0, stdout={"ok": true}
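A record like the one shown above can be produced by a small render-and-write step. The sketch below is an assumption-laden illustration of the Phase 1 flat-file bridge: the `render` and `record` function names are hypothetical, the file name uses only the nanosecond timestamp, and the body hash is a plain sha256 of the file content. How BRAIN links rows into its audit chain is not described on this page and is left out.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple

FRONT_MATTER = (
    "---\n"
    "kind: decisions\n"
    "sync_class: private\n"
    "classification: internal\n"
    "schema: cuo-decision-v1\n"
    "---\n"
)


def render(query: str, fingerprint: str, skill: str, confidence: float, rationale: str) -> str:
    """Render a decision as front matter plus a short Markdown body."""
    lines = [
        "# CUO routing decision",
        "",
        f"**Query:** {query}",
        f"**Catalog fingerprint:** {fingerprint}",
        f"**Decision:** {skill}",
        f"**Confidence:** {confidence}",
        f"**Rationale:** {rationale}",
    ]
    return FRONT_MATTER + "\n".join(lines) + "\n"


def record(brain_root: Path, body: str, ts_ns: int) -> Tuple[Path, str]:
    """Write the body under meta/cuo-decisions/ and return (path, sha256 of body)."""
    path = brain_root / "meta" / "cuo-decisions" / f"{ts_ns}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    return path, hashlib.sha256(body.encode("utf-8")).hexdigest()


# Demo against a throwaway directory standing in for the BRAIN root.
with tempfile.TemporaryDirectory() as tmp:
    body = render("validate MST 0123456789", "9c8e2a...", "cyberskill-vn/vn-mst-validate", 0.7, "example")
    written_path, body_hash = record(Path(tmp), body, ts_ns=1747200611000000000)
    relative = written_path.relative_to(tmp).as_posix()
    round_trip = written_path.read_text(encoding="utf-8")
```

Because the body is rendered from the decision alone, re-rendering the same decision yields the same hash, which is what lets a later audit detect whether a stored row was altered.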

5. Show keyword bank + extractors for a skill

$ cyberos-cuo skills vn-vat-invoice

skill:        cyberskill-vn/vn-vat-invoice
region:       VN
keywords:     [invoice, hoa don, vat, gtgt, e-invoice, xuat hoa don]
extractor:    detect-amount
                pattern: r"(\d[\d.,]*)\s*(triệu|trieu|million|k|VND|đồng)"
                ※ structured extraction deferred to Phase 2
depends_on:   [vn-mst-validate]  ← Phase 3 will walk this
allowed_tools: [read_file, write_file]
17

Phase status & code stats

Total LoC (Python): ~800 (core/ + cli/ + tests/)
Test count: 15 pytest + 15 fixtures (golden-test routing)
Core modules: 6 (catalog · router · invoker · trace · memory_bridge · __init__)
Personas defined: 10 (+ 4 emerging)
Confidence threshold: 0.30 (protocol-fixed in Phase 1)
Routing latency p95: ~ 0.4 ms (in-process, catalog cached)
Phase / capability | Status
--- | ---
Phase 1: rule-based router · catalog · invoker · BRAIN bridge | shipped
15 golden routing fixtures + pytest | shipped
Phase 2: LangGraph + LiteLLM cascade | planned · M+3
Postgres checkpointer (EU AI Act Art. 12) | planned · M+3
Phase 3: multi-skill chains via depends_on | planned · M+6
Phase 4: per-persona keyword bank split | planned · M+9
Ambient nudge modes (Notify · Question · Review) | planned
GraphQL subgraph | planned · P0+
18

References

  • PRD §6.1–§6.10 — CUO persona structure, voice, routing logic, trust calibration, ambient nudges, LangGraph migration.
  • PRD §9.2 — CUO / GENIE deep-dive.
  • SRS §4.2 — Formal FR catalog for CUO.
  • AGENTS.md (v0.1.0, normative) — cyberos/cuo/docs/AGENTS.md.
  • SPEC.md — contract summary — cyberos/cuo/docs/SPEC.md.
  • ROUTING.md — keyword-bank rationale + Phase 2 LLM design — cyberos/cuo/docs/ROUTING.md.
  • Source: cyberos/cuo/cuo/core/ · cyberos/cuo/tests/ · cyberos/cuo/fixtures/.
  • CHANGELOG: cyberos/cuo/docs/CHANGELOG.md (newest-first).