The Chief Universal Officer — natural language in, skill chain out, memory record on file.
CUO is the agentic orchestrator above the BRAIN and the Skill catalog. The user sees one persona (Genie, the mascot); inside the router, ten C-level sub-skills load on demand via the open Agent Skills format. CUO parses a query, scores every skill in the catalog, dispatches the winner through the Skill module, and records the routing decision in the BRAIN audit chain, so the orchestrator's behaviour is replayable and reviewable. Phase 1 ships a deterministic rule-based router (200 lines · 30 pytest tests · 15 fixtures). Phase 2 adds an LLM-driven router behind a confidence cascade; Phase 3 enables multi-skill chains via topological dependency walking; Phase 4 splits the keyword bank by C-level persona. The normative (BCP-14) protocol lives at cuo/docs/AGENTS.md; the implementation lives at cuo/cuo/core/.
CUO is one persona — Genie — backed by ten C-level specialists that load on demand. It is deliberately minimal: parse the query, score candidates, pick one, invoke through the Skill host, write the decision row to BRAIN. The router is rule-based today (deterministic, sub-millisecond) and will layer in an LLM cascade at Phase 2 (the trade is latency for ambiguous queries). Every decision is itself an audit-chained memory, so every routing choice CUO ever made is replayable from disk.
Internal operations are full of obvious requests that should resolve in one round trip: "validate this MST", "generate a VAT invoice for ACME", "draft a 1:1 prep doc for Thursday with Hanh". Without an orchestrator, every such request costs a ~200 ms LLM call, every catalog lookup costs another ~200 ms, and the audit trail has to be re-implemented in every calling skill. CUO is the single layer that absorbs that pattern: a rule-based fast path for unambiguous queries, an LLM cascade for the ambiguous tail, and BRAIN-anchored audit for every decision.
90% of queries are obvious. Rule-based decisions resolve in < 1 ms. Only the ambiguous 10% need a Phase-2 LLM call.
Same query + same catalog → same decision (Phase 1). LLM Phase 2 logs full prompt + model + temperature so the audit row is still replay-sufficient.
Below the confidence threshold, CUO refuses to invoke. It surfaces the top three candidates and asks the user to choose — a hard human-oversight guarantee (EU AI Act Art. 14) that also supports the deployer obligations of Art. 26.
CUO is also the entry-point that lets the company hire one persona — Genie — and gradually grow ten specialist skills behind it without changing the user-facing interface. The same Slack thread / chat box that asked Genie to "validate MST 0123456789" yesterday can ask it to "draft the Q3 OKR cascade" tomorrow, and the right C-level skill will load.
| Axis | Question | Answer |
|---|---|---|
| 5W · What | What is CUO? | An agentic orchestrator. Parses NL → scores catalog skills → invokes top match → records decision. The orchestrator state is per-request; CUO itself is stateless between requests. |
| 5W · Who | Who interacts? | Users: every CyberSkill member via the Genie chat box. Agents: external Claude / Codex / Cursor sessions that hand off to CUO when they hit a CyberOS surface. Owner: CEO seat today; CXO seat at P3+. |
| 5W · When | When is it invoked? | On every NL request to the platform that is not a direct CRUD on a known entity. Phase 1 is request-scoped; Phase 3 introduces multi-step chains that re-enter routing for each step. |
| 5W · Where | Where does it run? | Co-located with the user (Tauri / CLI / IDE). LLM cascade calls the AI Gateway (LiteLLM). All decisions land in the same BRAIN as the user's other memories. |
| 5W · Why | Why this design? | Because the alternative is asking the user to know which skill to call, or shipping an enormous monolithic prompt that re-derives the catalog every turn. CUO is the cheapest layer that gives Genie a real, replayable decision policy. |
| 1H · How | How does it route? | Score per candidate = 5.0 (skill name appears in query) + 3.0 × keyword_hits + 2.0 (VN-diacritic query AND skill.region = VN). Top scorer wins if score > 3.0 (confidence ≥ 0.30); otherwise emit routed: false. |
| 2C · Cost | Cost per decision? | Phase 1 ≈ 0.4 ms per route (in-process). Phase 2 LLM cascade: only when ≤ 0.50 confidence AND ≥ 0.10 candidate exists; typical 150 ms. Memory: 1 row per decision in BRAIN, ~700 bytes. |
| 2C · Constraints | Constraints? | (a) Phase 1 MUST be deterministic — same query + same catalog → same decision. (b) Below threshold MUST defer-to-human; no auto-invoke. (c) Skill capabilities MUST be respected via Skill host's broker; CUO cannot bypass. |
| 5M · Materials | What does it use? | The Skill catalog (read from disk in sorted-path order), a per-skill keyword bank in cuo/core/router.py, optional BRAIN context for Phase 2 LLM cascade, and the AI Gateway for inference. |
| 5M · Methods | Method choices? | Rule-based scoring (Phase 1), LLM cascade (Phase 2 LangGraph + LiteLLM), topological chain walking (Phase 3), per-persona keyword bank (Phase 4). Each layered on top of the previous, none replacing it. |
| 5M · Machines | Where does it run? | Locally with the user (Tauri host) for the rule-based path; AI Gateway for LLM calls. Postgres checkpointer for LangGraph state (Phase 2+) — required for EU AI Act Art. 12 logging. |
| 5M · Manpower | Who maintains? | 1 IC owner today. By P1 exit the CPO and CTO co-own the keyword bank + persona definitions. At P3 the dedicated CXO seat appears. |
| 5M · Measurement | How measured? | Routing confidence distribution, escalation rate, decision latency, defer-to-human rate. KPIs in §13. |
Five modules in cuo/cuo/core/ form the entire surface. Catalog discovers skills off disk. Router scores. Invoker delegates to the Skill module. Memory-bridge writes the decision. Trace renders the structured row.
| Component | File | Responsibility |
|---|---|---|
| catalog.py | core/catalog.py | Discover Skill manifests off disk in sorted-path order. Cached per-request. |
| router.py | core/router.py | Phase 1 rule-based scorer. Per-skill _KEYWORD_BANK + ARG_EXTRACTORS. Returns (decision, alternatives). |
| invoker.py | core/invoker.py | Shell out to cyberos-skill-cli run <skill>. Capture stdout / stderr / exit code. No in-process skill execution. |
| trace.py | core/trace.py | Render structured trace row: query, decision, alternatives, result, timestamps. |
| memory_bridge.py | core/memory_bridge.py | Write trace row to BRAIN. Phase 1: flat file under meta/cuo-decisions/<ts_ns>.md. Phase 2: through canonical Writer. |
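The invoker responsibility above (shell out, capture, never run in-process) can be sketched in a few lines. The spec only fixes the command shape cyberos-skill-cli run <skill>; the argument layout after the skill name and the `cli` parameter are illustrative assumptions.

```python
# Hedged sketch of invoker.py: shell out to the Skill CLI so the Skill
# broker keeps enforcing capabilities; never execute the skill in-process.
import json
import subprocess
import time
from dataclasses import dataclass

@dataclass
class InvocationResult:
    exit_code: int
    stdout: str
    stderr: str
    elapsed_ms: float

def invoke(skill_name: str, arguments: dict,
           cli: str = "cyberos-skill-cli") -> InvocationResult:
    # NOTE: passing arguments as a trailing JSON string is an assumption;
    # the spec only fixes `cyberos-skill-cli run <skill>`.
    start = time.monotonic()
    proc = subprocess.run(
        [cli, "run", skill_name, json.dumps(arguments)],
        capture_output=True, text=True,
    )
    return InvocationResult(proc.returncode, proc.stdout, proc.stderr,
                            (time.monotonic() - start) * 1000.0)
```

Capturing stdout, stderr, and the exit code is exactly what the InvocationResult entity in the schema records.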
| Phase | What changes | Why | Status |
|---|---|---|---|
| Phase 1 | Rule-based scorer · pure-function extractors · flat-file BRAIN bridge | Determinism, sub-ms latency, no external LLM dep | shipped |
| Phase 2 | LangGraph supervisor + LiteLLM router · Postgres checkpointer · escalate when 0.10 ≤ score ≤ 0.50 | Handle ambiguous tail · EU AI Act Art. 12 logging via Postgres | planned |
| Phase 3 | Multi-skill chains via depends_on · composite audit row + sub-rows | Compound workflows (MST validate → VAT invoice) | planned |
| Phase 4 | Per-C-level keyword bank · persona-router-first → intra-persona routing | Catalog scaling beyond ~50 skills | planned |
CUO owns three entities. A SkillEntry is the projection of a Skill manifest into the router's catalog. A RoutingDecision is the result of scoring. An InvocationResult captures what the Skill host actually did.
extend schema
@link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key"])
type RoutingDecision @key(fields: "requestId") {
requestId: ID!
query: String!
actor: String!
skillName: String # null when routed=false
arguments: JSON
confidence: Float!
rationale: String!
alternatives: [Candidate!]!
routed: Boolean!
routerPhase: RouterPhase!
invokedAt: DateTime
result: InvocationResult # null if --invoke not requested
}
type Candidate {
skillName: String!
score: Float!
rank: Int!
}
type InvocationResult {
exitCode: Int!
stdout: String!
stderr: String!
startedAt: DateTime!
endedAt: DateTime!
}
enum RouterPhase { phase1_rule phase2_llm phase3_chain phase4_persona }
type Query {
route(query: String!, persona: Persona): RoutingPreview!
decision(requestId: ID!): RoutingDecision
decisions(actor: String, since: DateTime, limit: Int = 50): [RoutingDecision!]!
}
type Mutation {
routeAndInvoke(query: String!, record: Boolean = true): RoutingDecision!
invokeSkill(skillName: String!, arguments: JSON!): InvocationResult!
}
type RoutingPreview {
decision: RoutingDecision!
catalogFingerprint: String! # for replay
}
| Tool name | Inputs | Outputs | Annotations |
|---|---|---|---|
| cuo.route | query, persona? | RoutingDecision | readonly · pure · scope=route |
| cuo.route_and_invoke | query, record=true | RoutingDecision + result | destructive=true · scope=invoke |
| cuo.catalog | — | SkillEntry[] | readonly · cached · scope=read |
| cuo.explain | requestId | rationale + alternatives | readonly · scope=audit |
The cyberos-cuo CLI:

| Subcommand | Purpose | Example |
|---|---|---|
| cyberos-cuo catalog | List skills CUO can route to | cyberos-cuo catalog --format json |
| cyberos-cuo route | Score a query, optionally invoke + record | cyberos-cuo route "validate MST 0123456789" --invoke --record |
| cyberos-cuo skills | Show keyword bank + extractors per skill | cyberos-cuo skills vn-mst-validate |
Phase 1 ships the ≥ 0.30 threshold; the four-tier confidence cascade lands at Phase 2 once the LLM router is online.
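The thresholds this spec pins down (route above 0.50, escalate to the LLM between 0.10 and 0.50, defer below 0.10) can be sketched as a tiny dispatcher. The full four-tier Phase-2 cascade is not reproduced in this section, so this sketch covers only the three tiers the KPI definitions fix; the function name is illustrative.

```python
def cascade_tier(conf: float) -> str:
    # Thresholds from the escalation / defer KPI definitions in this spec.
    if conf > 0.50:
        return "route"      # rule-based fast path, no LLM call
    if conf >= 0.10:
        return "escalate"   # hand off to the Phase-2 LLM router
    return "defer"          # surface top candidates, human chooses
```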
Each persona is a curated subset of the Skill catalog. Today the catalog lives under skill/skills/cuo/<persona>/ and is loaded uniformly; Phase 4 splits the keyword bank per persona so the router classifies intent first, then routes intra-persona. Each persona's defer-to-human matrix distinguishes actions that may complete without explicit operator approval (auto-OK) from actions that always escape to the human (defers).
🎯CEO · Vision & Strategy (Stephen Cheng, Founder seat)
Strategy memos, OKR cascade reviews, board narrative, weekly state-of-business, runway / fundraising posture. Owner of vision and capital allocation.
Cycle status digests, blocker triage, cross-team coordination, weekly ops review, vendor performance. Owner of "did we ship".
Cashflow position, AR/AP digests, burn alerts, payroll cycles, VAT/CIT compliance posture. Owner of "do we have runway".
Campaign briefs, content calendar, channel reports, brand voice consistency. Owner of "do prospects know us".
Tech-debt triage, security advisories digest, OBS metric review, dependency upgrades, architecture decision records. Owner of "is the platform safe and fast".
1:1 prep, performance summaries, onboarding paths, role descriptions, retention signals, growth ladders. Owner of "do we have the right people".
Competitive intel, scenario modelling, M&A scanning, partnership feasibility, strategic option papers. Owner of "what are we doing next".
Contract redline, NDA triage, GDPR/PDPL audits, DSAR triage, vendor terms review, policy authoring. Owner of "are we compliant".
Data quality, lineage, residency reviews, schema governance, retention policy, BRAIN integrity oversight. Owner of "is our data trustworthy".
PRD drafts, roadmap analysis, requirements discovery, user-research synthesis, FR catalogue maintenance. Owner of "are we building the right thing". Today's most-exercised persona — the FR-author / SRS-author / requirements-discovery skills all live here.
Four additional C-level seats are watched but not provisioned at P0. The router treats their skills as belonging to the closest existing persona until the seat is split out.
Owns the AI Gateway, model selection, prompt governance, AI risk register. Currently rolled into CTO. Split-out at P2 when GA AI Act compliance becomes operational load.
Owns CUO persona consistency, end-to-end member experience, ambient nudges. Today is a CEO concern; split-out at P3 when external tenants arrive.
Owns pipeline, win rates, churn signals, expansion. Today rolled into CEO + CFO. Split-out at P4 when external SaaS begins selling.
Owns ESG metrics, climate reporting, vendor sustainability scoring. Watched for 2026 EU reporting evolution.
The CyberOS FR catalogue is being rebuilt one feature at a time via the open fr-author Agent Skill.
Previous FR enumerations were archived 2026-05-14 and are no longer reflected on this page. PRD/SRS narrative remains authoritative for the spec; specific FRs land here as they are re-authored.
| NFR ID | Concern | Target | Measurement |
|---|---|---|---|
| N(FR pending) | Phase-1 routing p95 | ≤ 5 ms (catalog cached) | fixtures/golden_routing.json benchmark |
| N(FR pending) | Phase-2 LLM cascade p95 | ≤ 800 ms incl. network | AI Gateway latency budget |
| N(FR pending) | Catalog refresh p95 | ≤ 50 ms over 100 skills | catalog.scan() benchmark |
| N(FR pending) | Phase-1 determinism | 100% (same query + catalog → same decision) | 15 routing fixtures, golden tests |
| N(FR pending) | Escalation rate to LLM | ≤ 10% of queries (after warm-up) | BRAIN audit replay · weekly KPI |
| N(FR pending) | Defer-to-human rate | ≤ 5% of queries | BRAIN audit replay |
| N(FR pending) | Test coverage of router.py | ≥ 90% line · 100% branch | coverage.py |
| N(FR pending) | Availability (in-process) | same as caller | n/a — co-located |
CUO is the most-connected module. It consumes BRAIN (records), Skill (invokes), AI Gateway (Phase 2), MCP Gateway (tool surface), AUTH (actor identity). It is consumed by every user-facing module.
| Regulation / standard | Article / clause | CUO feature that satisfies it |
|---|---|---|
| EU AI Act (Reg. 2024/1689) | Art. 12 — Logging | Every decision recorded in BRAIN via memory_bridge · Postgres checkpointer at P2 retains LLM prompts. |
| EU AI Act | Art. 13 — Transparency | End-of-response transparency: skill chosen + confidence + alternatives are surfaced to the user. |
| EU AI Act | Art. 14 — Human oversight | Below-threshold queries defer to human; CUO never auto-invokes irreversible operations. |
| EU AI Act | Art. 26 — Operator obligations | Defer-to-human matrix per persona (auto-OK vs defers) is normative. |
| EU AI Act Annex III | § 4 — High-risk classification | CUO does not perform employment / credit / law-enforcement scoring; classification remains limited-risk. |
| ISO/IEC 42001 (AIMS) | § 8.4 — AI system operations | Audit-chained decisions provide post-hoc accountability evidence. |
| Vietnam PDPL | Art. 14 — Decision transparency | Per-decision rationale is part of the trace row; subject can request via DSAR. |
| ID | Risk | Likelihood | Impact | Owner | Mitigation |
|---|---|---|---|---|---|
| R-CUO-001 | Routing mis-classification (wrong skill picked at high confidence) | Medium | Medium | CPO | 15 golden fixtures · Phase 4 persona pre-classifier · trust-calibration KPI alarmed at p99. |
| R-CUO-002 | Confidence threshold drift (real-world distribution diverges from fixtures) | High | Low | CPO | Weekly KPI review: confidence histogram · escalation rate · defer rate. Threshold tunable per deployment at Phase 2. |
| R-CUO-003 | Persona prompt-injection (skill description tries to expand its own scope) | Medium | High | CSO | Trust model (§7): skill descriptions are UNTRUSTED. Keyword bank + catalog are protocol-defined and version-controlled. |
| R-CUO-004 | LLM non-determinism breaks audit replay (Phase 2+) | High | Medium | CTO | Phase 2 trace rows MUST include full prompt + model + temperature + seed; replay tools accept "best-effort" replay note. |
| R-CUO-005 | Persona switching whiplash (user feels Genie isn't "one" anymore at Phase 4) | Medium | Low | CXO (emerging) | Persona-version stamp on every decision · same conversational style enforced via brand-voice skill. |
| R-CUO-006 | Skill catalog explosion drops routing accuracy | Low (today) | Medium | CPO | Phase 4 persona-router triages first · Phase 2 LLM cascade handles the long tail. |
| R-CUO-007 | Capability bypass (CUO grants tools the skill didn't request) | Low | High | CSO | §6.1–§6.2: CUO MUST respect the skill's allowed-tools. Defence in depth: Skill broker enforces independently. |
| KPI | Formula | Source | Target | Current |
|---|---|---|---|---|
| Routing confidence distribution | histogram of conf | BRAIN audit replay | mean ≥ 0.6 · p10 ≥ 0.3 | 0.7 / 0.4 (15-fixture eval) |
| Escalation rate | queries with 0.10 ≤ conf ≤ 0.50 | BRAIN audit replay | ≤ 10% | n/a — Phase 2 pending |
| Defer rate | queries with conf < 0.10 | BRAIN audit replay | ≤ 5% | 0% (fixtures) |
| Decision latency p95 | route() wall clock | per-request timing | ≤ 5 ms (Phase 1) | ~ 0.4 ms |
| Replay equivalence rate | identical decision on second run | fixture re-evaluation | 100% (Phase 1) | 100% |
| Invocation success rate | exit_code 0 / total invocations | BRAIN audit replay | ≥ 95% | 100% (15 fixtures) |
| Trust calibration error | abs(confidence − actual_correct_rate) | weekly human review | ≤ 0.10 | 0.05 (fixture eval) |
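The trust-calibration KPI compares the router's mean stated confidence against the correct-routing rate observed in weekly human review. A minimal sketch, assuming decisions arrive as (confidence, was_correct) pairs:

```python
def trust_calibration_error(decisions: list[tuple[float, bool]]) -> float:
    """abs(mean stated confidence - observed correct-routing rate)."""
    confs = [conf for conf, _ in decisions]
    correct = [ok for _, ok in decisions]
    return abs(sum(confs) / len(confs) - sum(correct) / len(correct))
```

A router that claims 0.7 on average but is right only half the time would score 0.2, double the ≤ 0.10 target.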
| Activity | CEO | CPO | CTO | CXO* | CSO | CLO |
|---|---|---|---|---|---|---|
| Persona definition (10 C-level) | A | R | C | C | I | I |
| Keyword bank maintenance | I | R | A | C | I | I |
| Phase 2 LLM cascade design | C | C | A/R | I | C | I |
| Trust calibration KPI review | I | R | C | A | I | I |
| Defer-to-human matrix | I | C | C | I | I | A/R |
| EU AI Act Art. 12 compliance | I | C | C | I | C | A/R |
| Prompt-injection defence | I | C | R | I | A | C |
*CXO seat is emerging at P3+; the CPO carries this work today.
$ cyberos-cuo catalog --format json | head -30
{
"fingerprint": "9c8e2a...4b7d",
"scanned_at": "2026-05-14T07:30:11Z",
"skill_count": 20,
"skills": [
{"name": "cpo/prd-author", "version": "0.4.1", "region": null, "keywords": ["prd", "author", "draft prd", "product requirements"]},
{"name": "cyberskill-vn/vn-mst-validate", "version": "0.2.0", "region": "VN", "keywords": ["mst", "tax code", "ma so thue"]},
{"name": "cyberskill-vn/vn-vat-invoice", "version": "0.3.0", "region": "VN", "keywords": ["invoice", "hoa don", "vat", "gtgt"]},
...
]
}
$ cyberos-cuo route "kiểm tra MST 0123456789-001"
decision:
skill_name: cyberskill-vn/vn-mst-validate
arguments: {mst: "0123456789-001"}
confidence: 1.0
rationale: "VN-diacritic query + region=VN bonus + 2 keyword hits + name-substring match"
routed: true
alternatives:
- cyberskill-vn/vn-tax-filing score=0.3
- cyberskill-vn/vn-vat-invoice score=0.2
(not invoked — pass --invoke to dispatch through the Skill host)
$ cyberos-cuo route "validate MST 0123456789" --invoke --record
decision: cyberskill-vn/vn-mst-validate (conf=0.7)
invocation: exit=0 elapsed_ms=24
stdout: {"ok": true, "format": "10-digit"}
recorded: brain://meta/cuo-decisions/1747200611_8c4e.md
seq=14941 chain=a3c7...2b9f
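The chain= value in the transcript above suggests each BRAIN record is hash-chained to its predecessor, which is what makes replay tamper-evident. The exact BRAIN chain format is not specified in this section; a plausible sketch is:

```python
# Hedged sketch of an audit hash chain: each record's chain value hashes
# the previous chain together with the record body, so dropping or editing
# any decision row breaks every later link. The real BRAIN format may differ.
import hashlib

def chain_hash(prev_chain: str, record_body: str) -> str:
    return hashlib.sha256((prev_chain + record_body).encode("utf-8")).hexdigest()
```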
$ cyberos view meta/cuo-decisions/1747200611_8c4e.md
---
kind: decisions
sync_class: private
classification: internal
schema: cuo-decision-v1
---
# CUO routing decision
**Query:** validate MST 0123456789
**Catalog fingerprint:** 9c8e2a4b7d...
**Decision:** cyberskill-vn/vn-mst-validate
**Confidence:** 0.7
**Rationale:** 1 name-substring hit + 1 keyword + region tiebreaker
**Alternatives:** [...]
**Invoked at:** 2026-05-14T07:30:11Z
**Result:** exit=0, stdout={"ok": true}
$ cyberos-cuo skills vn-vat-invoice
skill: cyberskill-vn/vn-vat-invoice
region: VN
keywords: [invoice, hoa don, vat, gtgt, e-invoice, xuat hoa don]
extractor: detect-amount
pattern: r"(\d[\d.,]*)\s*(triệu|trieu|million|k|VND|đồng)"
※ structured extraction deferred to Phase 2
depends_on: [vn-mst-validate] ← Phase 3 will walk this
allowed_tools: [read_file, write_file]
| Phase / capability | Status |
|---|---|
| Phase 1 — rule-based router · catalog · invoker · BRAIN bridge | shipped |
| 15 golden routing fixtures + pytest | shipped |
| Phase 2 — LangGraph + LiteLLM cascade | planned · M+3 |
| Postgres checkpointer (EU AI Act Art. 12) | planned · M+3 |
| Phase 3 — multi-skill chains via depends_on | planned · M+6 |
| Phase 4 — per-persona keyword bank split | planned · M+9 |
| Ambient nudge modes (Notify · Question · Review) | planned |
| GraphQL subgraph | planned · P0+ |
cyberos/cuo/docs/AGENTS.md · cyberos/cuo/docs/SPEC.md · cyberos/cuo/docs/ROUTING.md · cyberos/cuo/cuo/core/ · cyberos/cuo/tests/ · cyberos/cuo/fixtures/ · cyberos/cuo/docs/CHANGELOG.md (newest-first).