0
Eight tiers, one supergraph
CyberOS has more than 8 distinct concerns (we enumerate 16 below), but they cluster cleanly into 8 architectural tiers. Each tier is a separate deployment unit, scales independently, and exposes a stable contract to the tier above.
flowchart TB
subgraph T1 ["T1 · Persona / Agent layer"]
LANG["LangGraph supervisor (StateGraph + interrupt)"]
SKILLS["Anthropic Skills format · 10 C-level skills hot-reload"]
LITELLM_T1["LiteLLM client (routing core)"]
end
subgraph T2 ["T2/T3 · Frontend layer"]
HOST["Host shell · Vite + React 19 + Tauri (desktop)"]
REMOTES["Module remotes · Webpack 5 + Module Federation"]
end
subgraph T3 ["T4/T5/T6 · API + Agent surface"]
APOLLO["Apollo Router · GraphQL Federation v2.5+"]
MCPGW["MCP Gateway · Streamable HTTP · 2025-11-25"]
AIGW["AI Gateway · LiteLLM router · Bedrock primary"]
end
subgraph T4 ["T7 · Backend services"]
SUBGRAPHS["22 subgraphs · TypeScript (Yoga) or Rust (async-graphql)"]
MCPSERVERS["22 MCP servers · per-module · TS SDK or mcp-rs"]
end
subgraph T5 ["T8/T9 · Data + search"]
PG["PostgreSQL 17 + pgvector HNSW + Apache AGE 1.5 + PGroonga"]
EMBED["BGE-M3 embedder + BGE-rerank-v2-m3 (self-hosted)"]
end
subgraph T6 ["T10/T11 · Infrastructure"]
NATS_T["NATS JetStream (event spine)"]
S3["S3 / R2 / MinIO (object storage)"]
end
subgraph T7 ["T12/T13/T14 · Cryptography + sync"]
YJS["Yjs / Automerge (CRDTs for realtime)"]
CRYPTO["Ed25519 + scrypt key wrap + MMR + STH"]
LEDGER["msgspec canonical JSON · binlog framing"]
end
subgraph T8 ["T15/T16 · Compliance + UX"]
OPA["OPA + Conftest (policy)"]
TRUST["Trust Center (cert hosting)"]
BVP["Be Vietnam Pro · CyberSkill design system"]
end
T1 --> T2
T2 --> T3
T3 --> T4
T4 --> T5
T4 --> T6
T4 --> T7
T4 --> T8
classDef t1 fill:#f9c64f,stroke:#9c750a
classDef t2 fill:#e8d4c2,stroke:#45210e
classDef t3 fill:#fef6e0,stroke:#9c750a
classDef t4 fill:#f5ede6,stroke:#45210e
classDef t5 fill:#cba88a,stroke:#45210e
classDef t6 fill:#fde7b3,stroke:#9c750a
classDef t7 fill:#fee2e2,stroke:#b91c1c
classDef t8 fill:#f0eee9,stroke:#475569
class LANG,SKILLS,LITELLM_T1 t1
class HOST,REMOTES t2
class APOLLO,MCPGW,AIGW t3
class SUBGRAPHS,MCPSERVERS t4
class PG,EMBED t5
class NATS_T,S3 t6
class YJS,CRYPTO,LEDGER t7
class OPA,TRUST,BVP t8
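The clustering in the diagram can be written down as data and sanity-checked. The sketch below is illustrative only: `TIER_GROUPS` and `covered_concerns` are hypothetical names, while the group labels and concern numbers are copied from the subgraph titles above.

```python
# Eight architectural groups, each covering one or more numbered concerns (T1..T16).
# Labels and numbers are copied from the subgraph titles in the diagram.
TIER_GROUPS = {
    "Persona / Agent layer": [1],
    "Frontend layer": [2, 3],
    "API + Agent surface": [4, 5, 6],
    "Backend services": [7],
    "Data + search": [8, 9],
    "Infrastructure": [10, 11],
    "Cryptography + sync": [12, 13, 14],
    "Compliance + UX": [15, 16],
}


def covered_concerns(groups):
    """Flatten the grouping into a sorted list of concern numbers."""
    return sorted(n for members in groups.values() for n in members)


concerns = covered_concerns(TIER_GROUPS)
# Every concern from 1 to 16 must appear exactly once across the eight groups.
assert concerns == list(range(1, 17))
print(len(TIER_GROUPS), "groups cover", len(concerns), "concerns")
```

A check like this could run in CI so that a newly added concern cannot be left without an owning group.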
Three design constraints driving every pick
1. Vietnamese data sovereignty: no SaaS dependency where Vietnamese-origin personal data must travel through a US-based vendor's servers. AWS Bedrock is acceptable because it is available in the ap-southeast-1 (Singapore) region; OpenAI direct is not (no Singapore endpoint).
2. Cost ceiling at scale: ≤ $150/mo LLM + $230/mo infra at the 10-Member internal scale; ≤ $4/active user/mo LLM + $2,200/mo infra at the 50-tenant scale (N(FR pending), N(FR pending)). Any pick that does not fit that envelope is rejected.
3. Migration door always open: every pick has a documented escape hatch. Storage is S3-compatible (DEC-005), so R2 ↔ MinIO ↔ AWS S3 is a config flip. SQL is portable, the audit chain is exportable, and MCP servers are spec-conforming.
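To make the "config flip" in the third constraint concrete, here is a minimal sketch of what switching object stores could look like. `ObjectStoreConfig`, `object_store_config`, the MinIO hostname, and the bucket name are all hypothetical; only the endpoint shapes (R2's per-account host with region `auto`, MinIO's default port 9000, the S3 regional endpoint) reflect the providers' documented conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectStoreConfig:
    endpoint_url: str
    region: str
    bucket: str


def object_store_config(provider: str, bucket: str,
                        account_id: str = "ACCOUNT_ID") -> ObjectStoreConfig:
    """Return connection settings for one S3-compatible provider (hypothetical helper)."""
    if provider == "r2":
        # Cloudflare R2 serves its S3 API per account and expects region "auto".
        return ObjectStoreConfig(
            f"https://{account_id}.r2.cloudflarestorage.com", "auto", bucket)
    if provider == "minio":
        # Placeholder hostname for a self-hosted MinIO; 9000 is its default API port.
        return ObjectStoreConfig("http://minio.internal:9000", "us-east-1", bucket)
    if provider == "s3":
        # AWS S3 regional endpoint for Singapore.
        return ObjectStoreConfig(
            "https://s3.ap-southeast-1.amazonaws.com", "ap-southeast-1", bucket)
    raise ValueError(f"unknown object store provider: {provider}")


# Only the provider string changes between environments; calling code stays the same.
for name in ("r2", "minio", "s3"):
    print(name, object_store_config(name, "cyberos-attachments").endpoint_url)
```

The resulting fields map directly onto the endpoint, region, and bucket parameters that S3-compatible client libraries accept, so application code does not need to know which provider is behind it.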
17
"What calls what" — dependency graph
The tiers compose left-to-right: every request from a user or agent flows through this graph. Cycles are forbidden by design.
One request traverses every tier — sequence view
sequenceDiagram
autonumber
actor U as User
participant T2 as T2 Host shell
participant T3 as T3 Module remote
participant T4 as T4 Apollo Router
participant T7 as T7 Subgraph (Bun/Tokio)
participant T8 as T8 Postgres + ext
participant T6 as T6 AI Gateway
participant T9 as T9 BGE-M3 (GPU)
participant T10 as T10 NATS
participant T11 as T11 R2
participant T14 as T14 Audit ledger
U->>T2: open route
T2->>T3: lazy-load remote
T3->>T4: persisted query hash
T4->>T7: federated query plan
T7->>T8: SELECT (RLS-scoped)
T8-->>T7: rows
T7->>T6: POST /v1/embeddings
T6->>T9: BGE-M3 self-hosted
T9-->>T6: vector
T6-->>T7: embed
T7->>T8: SELECT pgvector
T8-->>T7: hits
T7->>T10: publish event
T7->>T11: write attachment (if any)
T7->>T14: append audit row (msgspec canonical)
T7-->>T4: response
T4-->>T3: composed result
T3-->>T2: render
T2-->>U: paint
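The "persisted query hash" step in the sequence can be illustrated as follows. This is a sketch under an assumption: the source does not say which persisted-query scheme is used, so the example follows Apollo's automatic persisted queries wire format (a SHA-256 hex digest of the query text under `extensions.persistedQuery`). The helper name and the sample operation are hypothetical.

```python
import hashlib
import json


def persisted_query_payload(query: str, variables: dict) -> dict:
    """Build a hash-only GraphQL request body in the APQ wire format."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return {
        "variables": variables,
        "extensions": {"persistedQuery": {"version": 1, "sha256Hash": digest}},
    }


# Hypothetical operation; the real schema is not shown in this section.
query = "query Task($id: ID!) { task(id: $id) { title } }"
body = persisted_query_payload(query, {"id": "42"})

# The query text itself is not sent, only its hash and the variables.
print(json.dumps(body, indent=2))
```

Sending the hash instead of the full document keeps requests small and lets the router reject operations it has not seen registered.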
flowchart LR
USER[("User · Agent")] --> HOST["Host shell<br/>Vite + React 19"]
HOST --> REMOTE["Module remote<br/>Webpack 5 + MF"]
REMOTE --> APOLLO["Apollo Router"]
APOLLO --> AUTH["AUTH JWKS"]
APOLLO --> SUBG["Subgraph (TS/Rust)"]
SUBG --> PG[("Postgres 17 + ext.")]
SUBG --> AIGW["AI Gateway"]
SUBG --> NATS_DEP[("NATS JetStream")]
SUBG --> S3_DEP[("R2 / MinIO")]
AIGW --> LL["LiteLLM"]
LL --> BEDROCK["AWS Bedrock"]
LL --> ANT["Anthropic ZDR"]
LL --> OAI["OpenAI ZDR"]
LL --> BGE["BGE-M3 (self-hosted GPU)"]
USER --> MCPCLT[("MCP client<br/>Claude / Cursor")]
MCPCLT --> MCPGW["MCP Gateway"]
MCPGW --> AUTH
MCPGW --> SUBG
SUBG -. trace .-> OBS["OBS · OTel"]
APOLLO -. trace .-> OBS
AIGW -. trace .-> OBS
MCPGW -. trace .-> OBS
classDef u fill:#fef6e0,stroke:#9c750a
classDef fe fill:#e8d4c2,stroke:#45210e
classDef gw fill:#f9c64f,stroke:#9c750a
classDef be fill:#f5ede6,stroke:#45210e
classDef data fill:#cba88a,stroke:#45210e
classDef ext fill:#fde7b3,stroke:#9c750a
class USER,MCPCLT u
class HOST,REMOTE fe
class APOLLO,MCPGW,AIGW,AUTH gw
class SUBG be
class PG,NATS_DEP,S3_DEP data
class LL,BEDROCK,ANT,OAI,BGE,OBS ext
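The rule that cycles are forbidden in this dependency graph can be checked mechanically. The sketch below transcribes the edges of the flowchart (solid and trace edges alike) and uses the standard library's `graphlib` to detect cycles; `CALLS` and `is_acyclic` are hypothetical names, and the failing case at the end is an invented regression for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Caller -> callees, transcribed from the flowchart (trace edges to OBS included).
CALLS = {
    "USER": ["HOST", "MCPCLT"],
    "HOST": ["REMOTE"],
    "REMOTE": ["APOLLO"],
    "APOLLO": ["AUTH", "SUBG", "OBS"],
    "MCPCLT": ["MCPGW"],
    "MCPGW": ["AUTH", "SUBG", "OBS"],
    "SUBG": ["PG", "AIGW", "NATS_DEP", "S3_DEP", "OBS"],
    "AIGW": ["LL", "OBS"],
    "LL": ["BEDROCK", "ANT", "OAI", "BGE"],
}


def is_acyclic(calls):
    """Return True if the call graph has no cycles."""
    try:
        # static_order() raises CycleError when no topological order exists.
        list(TopologicalSorter(calls).static_order())
        return True
    except CycleError:
        return False


assert is_acyclic(CALLS)
# Invented regression: a data store calling back into a subgraph closes a loop.
assert not is_acyclic({**CALLS, "PG": ["SUBG"]})
print("dependency graph is acyclic")
```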
18
Cost-vs-tier model
Two reference scales — 10 Members internal (P0–P2) and 50 tenants (P4 GA). Each tier's contribution maps to a hard NFR ceiling.
Cost flow — where the dollar goes at internal scale
flowchart LR
BUDGET[("$535/mo<br/>N(FR pending) envelope")] --> LLM["28% · LLM<br/>$150 · primarily Sonnet + Haiku via Bedrock"]
BUDGET --> COMPUTE["17% · K8s compute<br/>$90 · 22 subgraphs + gateways"]
BUDGET --> PG["15% · Postgres<br/>$80 · primary + read replica"]
BUDGET --> OBS_C["15% · OBS (LGTM)<br/>$80 · Loki + Tempo + Mimir + Grafana"]
BUDGET --> GPU["15% · GPU embed<br/>$80 · shared BGE-M3 node"]
BUDGET --> STORE["5% · object storage<br/>$25 · R2 zero-egress"]
BUDGET --> NATS_C["4% · NATS<br/>$20 · single-node JetStream"]
BUDGET --> EDGE["1% · CDN + auth<br/>$10"]
classDef envelope fill:#fef6e0,stroke:#9c750a,stroke-width:2px
classDef llm fill:#f9c64f,stroke:#9c750a
classDef compute fill:#f5ede6,stroke:#45210e
classDef data fill:#e8d4c2,stroke:#45210e
classDef obs fill:#fde7b3,stroke:#9c750a
classDef ext fill:#cba88a,stroke:#45210e
class BUDGET envelope
class LLM,GPU llm
class COMPUTE compute
class PG,STORE data
class OBS_C obs
class NATS_C,EDGE ext
xychart-beta
title "Monthly cost per tier ($USD)"
x-axis ["LLM (T1+T6)", "Postgres (T8)", "Compute (T7)", "Storage (T11)", "OBS (LGTM)", "NATS (T10)", "AUTH (T4-side)", "Embeddings GPU (T9)"]
y-axis "USD per month" 0 --> 600
bar [150, 80, 90, 25, 80, 20, 5, 80]
Internal scale (10 Members) — total ≤ $530/mo against N(FR pending) budget of $530/mo ($150 LLM + $380 infra)
xychart-beta
title "Cost shape at 50-tenant scale ($USD/month)"
x-axis ["LLM", "Postgres (3 regions)", "Compute (k8s)", "Storage", "OBS", "NATS (cluster)", "AUTH", "GPU embed"]
y-axis "USD per month" 0 --> 1400
bar [800, 600, 500, 200, 200, 100, 50, 200]
50-tenant scale — total ≤ $2,650/mo against N(FR pending) budget of $2,200 + $4/user/mo LLM
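The 50-tenant budget line combines a fixed infra ceiling with a per-user LLM ceiling, so it helps to work the arithmetic through. The sketch below uses the bar values from the chart above; treating every non-LLM bar (including GPU embeddings) as infra is an assumption, and the active-user count is derived here rather than stated in the source.

```python
import math

# Bar values from the 50-tenant chart, in USD per month.
COSTS_50_TENANT = {
    "LLM": 800, "Postgres": 600, "Compute": 500, "Storage": 200,
    "OBS": 200, "NATS": 100, "AUTH": 50, "GPU embed": 200,
}
INFRA_CEILING = 2200        # USD per month, from the budget line
LLM_CEILING_PER_USER = 4    # USD per active user per month

total = sum(COSTS_50_TENANT.values())
llm = COSTS_50_TENANT["LLM"]
infra = total - llm  # assumption: everything except the LLM bar counts as infra

# Smallest active-user count at which the LLM spend fits the per-user ceiling.
min_active_users = math.ceil(llm / LLM_CEILING_PER_USER)

print(total, infra, infra <= INFRA_CEILING, min_active_users)
# prints: 2650 1850 True 200
```

Under these assumptions the infra portion sits $350 below its ceiling, and the $800 LLM line fits the per-user ceiling once there are at least 200 active users across the 50 tenants.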
Per-tier production cost (M+6, M+18 projections)
| Tier | Pick | Internal (10 Members) | 50-tenant scale | Migration door |
| --- | --- | --- | --- | --- |
| T1 Persona / Agent | LangGraph + LiteLLM | $0 host | $0 host | Replace LangGraph supervisor |
| T2 Host shell | Vite + React 19 + Tauri | $5/mo CDN | $50/mo CDN | Switch host to Next.js |
| T3 Module remotes | Webpack 5 + MF | included | included | Pin MF v2 spec |
| T4 Apollo Router | Apollo Router | $0 (OSS binary) | $50/mo VM cluster | Elastic License 2.0 (ELv2) review |
| T5 MCP Gateway | Custom router + per-module servers | $0 (in-cluster) | $30/mo | MCP spec preserves portability |
| T6 AI Gateway | LiteLLM + Bedrock primary | $150/mo | $800/mo (+ per-user) | Provider mix via config |
| T7 Backend | Bun / Tokio · 22 subgraphs | $90/mo k8s | $500/mo k8s | Containers, portable |
| T8 Data | Postgres 17 + pgvector + AGE + PGroonga | $80/mo | $600/mo (3 regions) | SQL portable |
| T9 Embeddings | BGE-M3 + reranker (GPU) | $80/mo | $200/mo (multi-GPU) | Switch to OpenAI text-embed |
| T10 Event bus | NATS JetStream | $20/mo VM | $100/mo cluster | NATS subjects → Kafka topics |
| T11 Object storage | R2 / MinIO | $25/mo | $200/mo | S3-compatible config flip |
| T12 CRDT sync | Yjs / Automerge | $0 (libs) | $0 (libs) | Doc-format-portable |
| T13–14 Cryptography + Ledger | Ed25519 + MMR + msgspec | $0 | $0 | Schema-portable |
| T15 Compliance | OPA + Trust Center | $5/mo static host | $20/mo | OPA Rego portable |
| OBS (LGTM) | Grafana / Loki / Tempo / Mimir | $80/mo | $200/mo | OTel-native; switch backend |
| Total | — | ≤ $535/mo | ~$2,750/mo | — |
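The Total row can be reproduced from the rows above it. The sketch below transcribes the two cost columns, counting "included" and "$0" entries as zero (an assumption about how the totals were computed); `ROWS` and `column_totals` are hypothetical names.

```python
# (tier, internal USD/mo, 50-tenant USD/mo), transcribed from the table.
# "included" and "$0" entries are counted as 0.
ROWS = [
    ("T1", 0, 0), ("T2", 5, 50), ("T3", 0, 0), ("T4", 0, 50),
    ("T5", 0, 30), ("T6", 150, 800), ("T7", 90, 500), ("T8", 80, 600),
    ("T9", 80, 200), ("T10", 20, 100), ("T11", 25, 200), ("T12", 0, 0),
    ("T13-14", 0, 0), ("T15", 5, 20), ("OBS", 80, 200),
]


def column_totals(rows):
    """Sum the internal and 50-tenant columns across all rows."""
    internal = sum(r[1] for r in rows)
    fifty_tenant = sum(r[2] for r in rows)
    return internal, fifty_tenant


internal_total, fifty_tenant_total = column_totals(ROWS)
print(internal_total, fifty_tenant_total)
# prints: 535 2750
```

Both sums match the Total row ($535/mo and about $2,750/mo), which makes the table a reasonable single source for the per-tier figures.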