Roadmap
ProbOS is organized as a starship crew — specialized teams of agents working together to keep the system operational, secure, and evolving. Each team is a dedicated agent pool with distinct responsibilities. The Captain (human operator) approves major decisions through a stage gate.
ProbOS doesn't just orchestrate agents; it gives them a civilization in which to come together: trust they earn, consensus they participate in, memory they share, relationships that strengthen through learning, and a federation they can grow into. Other frameworks dispatch tasks. ProbOS provides the social fabric that makes cooperation emerge naturally.
Status tags (BF #465 reconciliation, 2026-05-07): AD entries below are tagged `(planned, OSS)` for unbuilt work or `(SHIPPED, OSS)` once delivered. PROGRESS.md remains the authoritative source; this file lags. The 2026-05-07 reconciliation pass flipped 27 entries that had drifted (AD-486, 510, 512, 520, 526, 543-549, 562, 566, 567, 569, 571, 595, 597, 599, 601, 604, 607).
Design Principles
See Design Principles for the full design philosophy — architectural and philosophical principles that govern how ProbOS thinks about what it builds. Engineering practices (SOLID, DRY, Fail Fast) live in contributing.md.
Crew Structure
                     ┌───────────────────────────┐
                     │     STARFLEET COMMAND     │
                     │  Fleet Admiral = Creator  │
                     └─────────────┬─────────────┘
                                   │
                     ┌─────────────┴─────────────┐
                     │     BRIDGE (Command)      │
                     │ Captain = Human Operator  │
                     │      First Officer =      │
                     │      Architect Agent      │
                     │        Counselor =        │
                     │ Cognitive Wellness Agent  │
                     └─────────────┬─────────────┘
                                   │
         ┌──────────┬──────────┬───┴────┬──────────┬──────────┐
         │          │          │        │          │          │
    ┌────┴───┐  ┌───┴────┐  ┌──┴────┐ ┌─┴──────┐  ┌┴───────┐ ┌┴──────────┐
    │Medical │  │Engineer│  │Science│ │Security│  │  Ops   │ │   Comms   │
    │  CMO   │  │ Chief  │  │  CSO  │ │ Chief  │  │ Chief  │ │   Chief   │
    │Sickbay │  │  ing   │  │       │ │Tactical│  │        │ │           │
    └────────┘  └────────┘  └───────┘ └────────┘  └────────┘ └───────────┘
| Team | Starfleet Analog | ProbOS Function | Status |
|---|---|---|---|
| Medical | Sickbay (Crusher) | Health monitoring, diagnosis, remediation, post-mortems | Built (AD-290) |
| Engineering | Main Engineering (LaForge) | Performance optimization, architecture review, system optimization, builds | Built (LaForge + Scotty, AD-302/398) |
| Science | Science Lab (Spock) | Research, discovery, architectural analysis, codebase knowledge, intelligence gathering, telemetry analysis, emergence studies | Built (Architect, CodebaseIndex, Scout, Data Analyst, Systems Analyst, Research Specialist — AD-560 complete) |
| Security | Tactical (Worf) | Threat assessment, vulnerability review, code security audit, defense | Built (AD-398) |
| Operations | Ops (O'Brien) | Resource analysis, cross-department coordination, capacity planning, system efficiency | Built (AD-398) |
| Communications | Comms (Uhura) | Channel adapters, federation, external interfaces | Partial |
| Bridge | Command (Picard) | Strategic decisions, human approval gate, goal planning, cognitive wellness | Partial |
Chain of Command
"Humans are self-organizing and naturally form organizational hierarchies. Agents should do the same."
The chain of command has two levels: Bridge crew (ship-wide authority) and Department Chiefs (team-level authority). Bridge officers run the ship. Department Chiefs run their teams and report to the Bridge. Just like a newly commissioned starship gets its initial officer roster, ProbOS assigns defaults at startup — but rank is earned, not permanent.
Rank Structure:
| Rank | Scope | ProbOS Role | Assignment |
|---|---|---|---|
| Fleet Admiral | All ships | Creator / System Owner | Fixed (Sean) |
| Admiral | Fleet region | Federation coordinator | Future (multi-instance) |
| Captain | Single ship | Human operator | Fixed (human approval gate) |
| Bridge Crew | Ship-wide | Senior officers with cross-department authority | Default + promotable |
| Department Chief | One department | Lead agent — receives bridge orders, orchestrates team, reports back | Default + promotable |
| Crew | Individual role | Specialist agent — executes tasks within department | Default |
Bridge Crew:
The Bridge is where the ship is run. Bridge officers have ship-wide authority and report directly to the Captain.
| Bridge Role | Star Trek Analog | ProbOS Agent | Responsibility |
|---|---|---|---|
| Captain | Picard | Human operator | Final authority, approval gate, strategic direction |
| First Officer | Riker | ArchitectAgent | Cross-department coordination, strategic planning, mission execution. Dual-hatted as Chief Science Officer |
| Ship's Counselor | Troi | CounselorAgent (new) | Cognitive wellness, agent relationship health, Hebbian drift detection, advisory to Captain |
Bridge crew members may also hold department roles (dual-hatted). The ArchitectAgent is both First Officer and CSO. Future Bridge positions could include Helm (navigation/routing), Tactical (security chief on the Bridge), and Ops officer — added as those departments mature.
Default Department Chief Assignments:
| Department | Default Chief | Why |
|---|---|---|
| Medical | Diagnostician (CMO) | Natural triage point — already receives all alerts and routes to specialists |
| Engineering | EngineeringAgent (LaForge) | Systems thinker — architecture review, optimization, infrastructure health. Scotty (Builder) is senior officer |
| Science | ArchitectAgent (CSO / First Officer) | Dual-hatted — strategic analysis + science leadership |
| Security | SecurityAgent (Worf) | Cognitive security — threat assessment, vulnerability review, code security audit (AD-398) |
| Operations | OperationsAgent (O'Brien) | Resource analysis, cross-department coordination, capacity planning (AD-398) |
| Communications | TBD (Comms Chief) | Not yet built |
Promotion Mechanics:
Agents aren't locked into their initial rank. The system supports emergent hierarchy based on proven performance through formal qualification programs (see Naval Organization section):
- Eligibility — An agent becomes promotion-eligible when its trust score sustains above a threshold (e.g., 0.85+) for N consecutive evaluation cycles and its Hebbian weight for coordination-type tasks exceeds a minimum
- Qualification completion — The agent must have completed the qualification program for the target rank — a defined set of demonstrated competencies, not just metric thresholds. See Qualification Programs under Naval Organization below. Holodeck simulations provide the testing environment; Counselor assessments provide the evaluation
- Evaluation signals — Trust score trajectory, task success rate, Hebbian weight for cross-agent coordination, peer agent outcomes when this agent led (Shapley contribution to team results), qualification record
- Nomination — The system (or current Chief via Ward Room) nominates an eligible, qualified agent for promotion. The Ship's Counselor provides cognitive fitness assessment as part of the promotion review
- Captain approval gate — All promotions require human approval. The Captain sees the performance data, qualification record, Counselor's assessment, and confirms or denies. This is the same approval gate used for self-improvement proposals
- Demotion — If an officer's trust drops below threshold, cognitive wellness degrades (flagged by Counselor), or the Captain issues a direct order, the officer is demoted and the next-highest-trust eligible agent is promoted (with Captain approval)
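The eligibility and qualification steps above can be sketched as a pure predicate. This is a minimal illustration, not ProbOS code: `PromotionPolicy`, `is_promotion_eligible`, the cycle count of 5, and the 0.5 coordination-weight minimum are all hypothetical stand-ins. Only the 0.85 trust example comes from the text. A `True` result only makes an agent nominable; the Counselor review and the Captain approval gate still apply.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PromotionPolicy:
    """Hypothetical thresholds; only the 0.85 example comes from the roadmap."""
    trust_threshold: float = 0.85          # "e.g., 0.85+" in the text
    required_cycles: int = 5               # stands in for the unspecified N
    min_coordination_weight: float = 0.5   # placeholder Hebbian minimum


def is_promotion_eligible(
    trust_history: Sequence[float],
    coordination_weight: float,
    qualification_complete: bool,
    policy: PromotionPolicy = PromotionPolicy(),
) -> bool:
    """Return True when an agent may be nominated (not when it is promoted)."""
    if not qualification_complete:
        return False
    # The Hebbian coordination weight must exceed the minimum.
    if coordination_weight <= policy.min_coordination_weight:
        return False
    # Trust must hold at or above the threshold for the last N cycles.
    recent = trust_history[-policy.required_cycles:]
    if len(recent) < policy.required_cycles:
        return False
    return all(score >= policy.trust_threshold for score in recent)
```

For example, `is_promotion_eligible([0.9, 0.9, 0.8, 0.9, 0.9], 0.7, True)` is `False`, because one of the last five cycles dipped below the threshold.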
Cross-Scale Hierarchy:
This pattern applies at every level of the ProbOS ecosystem:
- Ship level — Captain commands the Bridge crew. Bridge issues orders to Department Chiefs. Chiefs orchestrate their specialists
- Federation level — Fleet Admiral (creator) sets fleet-wide policy. Each ship's Captain operates autonomously within those policies. Ships exchange Bridge reports via federation gossip
- The Nooplex — Emergent meta-hierarchy across the fleet. No central coordinator — hierarchy emerges from trust and performance, same as within a single ship
The key insight: the same trust/Hebbian/consensus mechanisms that govern individual agents also govern ships in the fleet. A ship that consistently produces good results earns higher fleet trust. A ship whose Captain makes poor decisions loses fleet standing. The hierarchy is fractal — self-similar at every scale.
Ship's Computer (Runtime Services)
Not a team — shared infrastructure that all teams use:
- CodebaseIndex — structural self-awareness, the ship's technical manual (Phase 29c)
- Knowledge Store — long-term memory, the ship's library
- Episodic Memory + Dreaming — experiential learning, the ship's log. Three-tier dreaming model (AD-288): micro-dreams (continuous, every 10s during active sessions), idle dreams (after 120s idle), and shutdown dreams (final consolidation flush)
- Decision Cache — LLM reasoning cache inside CognitiveAgent (AD-272). Identical observations skip LLM re-evaluation. Future: feedback-driven cache eviction, KnowledgeStore persistence for warm boot
- Cognitive Journal — complete token ledger recording every LLM request/response with full context for replay, analysis, and learning (Phase 32)
- Ship's Telemetry — internal performance instrumentation: LLM call timing, pipeline duration, token metering, build path comparison. The sensor grid that Cognitive Journal, EPS, and Observability Export all read from (Phase 32)
- Model Registry — catalog of available model providers with neural routing via Hebbian learning (Phase 32)
- Trust Network — reputation system, crew performance records
- Profile Store — crew identity, personality (Big Five), rank, performance reviews (AD-376)
- Intent Bus — internal communications, the ship's intercom (with priority levels and back-pressure — Phase 33)
- Ward Room — direct agent-to-agent messaging, the officers' private channel (Phase 33)
- Hebbian Router — navigation, learned routing pathways (extended for model routing — Phase 32)
- Alert Conditions — ship-wide operational modes that change system behavior simultaneously (Phase 33)
- Structural Integrity Field — proactive invariant enforcement, continuous runtime health assertions (Phase 32)
- EPS (Compute/Token Distribution) — LLM capacity budgeting and allocation across departments (Phase 33)
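As an illustration of the three-tier dreaming model listed above, the choice of tier can be sketched as a small function. The 10s and 120s values come from the text; `DreamTier`, `select_dream_tier`, and the precedence order (shutdown first, then idle, then micro) are assumptions made for this sketch rather than the AD-288 design.

```python
from enum import Enum
from typing import Optional


class DreamTier(Enum):
    MICRO = "micro"        # continuous, during active sessions
    IDLE = "idle"          # after a period of inactivity
    SHUTDOWN = "shutdown"  # final consolidation flush


MICRO_INTERVAL_S = 10.0    # "every 10s during active sessions"
IDLE_THRESHOLD_S = 120.0   # "after 120s idle"


def select_dream_tier(
    idle_seconds: float,
    seconds_since_micro_dream: float,
    shutting_down: bool,
) -> Optional[DreamTier]:
    """Pick which dream tier, if any, should run right now."""
    if shutting_down:
        return DreamTier.SHUTDOWN
    if idle_seconds >= IDLE_THRESHOLD_S:
        return DreamTier.IDLE
    if seconds_since_micro_dream >= MICRO_INTERVAL_S:
        return DreamTier.MICRO
    return None  # nothing due yet
```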
Shared Cognitive Fabric Principle (AD-393)
"The Enterprise has one computer — not one per crew member."
Within a ship, agents share centralized Ship's Computer services rather than maintaining per-agent micro-datastores. Each agent has scoped records within the shared services — like shards in a platform — not separate databases. This is the same pattern used by enterprise platforms (D365, Salesforce): one database, many tenants, each with their own data.
| Service | Shared Infrastructure | Per-Agent Scoped Data |
|---|---|---|
| ProfileStore | One SQLite database | Individual personality traits, rank, reviews |
| TrustNetwork | One trust graph | Individual trust scores, alpha/beta params |
| EpisodicMemory | One memory store | Individual episode histories |
| KnowledgeStore | One knowledge base | Individual learned facts |
| HebbianRouter | One routing mesh | Individual routing weights per intent |
| DirectiveStore | One directive registry | Individual standing orders, learned lessons |
Why this is correct:
- Enables cross-agent queries (Counselor comparing cognitive profiles, Captain reviewing crew health)
- Prevents micro-datastore proliferation (55 agents = 55 SQLite files without this)
- Maintains clean separation of concerns (infrastructure vs. data)
- Matches the federation boundary: shared within a ship, sovereign between ships
Why this is NOT a hive mind:
- Each agent's data evolves independently based on their own experiences
- One agent's personality change does not cascade to others
- Shared infrastructure ≠ shared consciousness: the filing cabinet is shared, the personnel files inside are individual
- Federation gossip exchanges metadata (trust scores, capabilities), not personality or memories
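The shared-store pattern described above (one database, per-agent scoped rows) can be sketched with the standard-library `sqlite3` module. The table layout, the column names, and the function names here are hypothetical and are not the actual ProfileStore schema; the point is only that a scoped read and a cross-agent query run against the same database.

```python
import sqlite3
from typing import List, Optional, Tuple


def open_profile_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the single shared database (in-memory here for the sketch)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profiles ("
        "agent_id TEXT PRIMARY KEY, crew_rank TEXT NOT NULL, openness REAL NOT NULL)"
    )
    return conn


def save_profile(conn: sqlite3.Connection, agent_id: str, crew_rank: str, openness: float) -> None:
    """Write one agent's row; other agents' rows are untouched."""
    conn.execute(
        "INSERT OR REPLACE INTO profiles (agent_id, crew_rank, openness) VALUES (?, ?, ?)",
        (agent_id, crew_rank, openness),
    )
    conn.commit()


def load_profile(conn: sqlite3.Connection, agent_id: str) -> Optional[Tuple[str, float]]:
    """Scoped read: an agent sees only its own record."""
    return conn.execute(
        "SELECT crew_rank, openness FROM profiles WHERE agent_id = ?", (agent_id,)
    ).fetchone()


def crew_overview(conn: sqlite3.Connection) -> List[Tuple[str, str, float]]:
    """Cross-agent query, e.g. for a Counselor or Captain review."""
    return conn.execute(
        "SELECT agent_id, crew_rank, openness FROM profiles ORDER BY agent_id"
    ).fetchall()
```

With per-agent database files, `crew_overview` would need to open and merge every file; with one shared store it is a single query.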
Alert Conditions (Red / Yellow / Green)
"All hands, battle stations."
A starship shifts its entire operational posture based on situation. ProbOS should do the same. A single runtime flag that propagates configuration changes across all departments simultaneously:
| Condition | Trigger | Behavior Changes |
|---|---|---|
| Green | Normal operations | Full dreaming, standard consensus thresholds, background maintenance active, all departments at normal allocation |
| Yellow | Anomaly detected, elevated risk | Heightened monitoring, suppress non-essential dreams, tighter logging, Counselor runs cognitive wellness sweep, pre-stage damage control procedures |
| Red | Critical incident, active crisis | All compute to active crisis, lower consensus quorum for faster response, wake dormant specialists, pause background maintenance, Captain alerted immediately |
- Set by: Captain (manual), VitalsMonitor (threshold triggers), Security (threat detection)
- Propagation: Runtime broadcasts `alert_condition_changed` to all pools. Each agent type defines its own response to alert levels
- Auto-downgrade: Red → Yellow after crisis resolved (with Captain confirmation). Yellow → Green after anomaly cleared
- Logging: All alert transitions recorded in Cognitive Journal with triggering reason
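A minimal sketch of how the alert flag, its broadcast, and its logging might fit together. `AlertState`, its methods, and the in-memory `journal` list are hypothetical names for illustration; the real runtime broadcasts to pools and records transitions in the Cognitive Journal. The sketch assumes that any transition out of Red needs Captain confirmation, which is stricter than the text (it only names Red → Yellow).

```python
from enum import Enum
from typing import Callable, List, Tuple


class AlertCondition(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# A listener receives (old condition, new condition, triggering reason).
Listener = Callable[[AlertCondition, AlertCondition, str], None]


class AlertState:
    def __init__(self) -> None:
        self.condition = AlertCondition.GREEN
        self.journal: List[Tuple[AlertCondition, AlertCondition, str]] = []
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        """Register a pool-level handler for condition changes."""
        self._listeners.append(listener)

    def set_condition(
        self, new: AlertCondition, reason: str, captain_confirmed: bool = False
    ) -> bool:
        """Change the condition, log it, and notify every listener."""
        old = self.condition
        if new is old:
            return False
        # Assumption: leaving Red always requires Captain confirmation.
        if old is AlertCondition.RED and not captain_confirmed:
            return False
        self.condition = new
        self.journal.append((old, new, reason))
        for listener in self._listeners:
            listener(old, new, reason)
        return True
```

Each subscriber decides for itself what a transition means, which matches the rule that every agent type defines its own response to alert levels.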
Structural Integrity Field (SIF)
"Structural integrity at 47% and falling!"
Medical detects damage. The SIF prevents structural failure. Continuous proactive invariant checking that catches corruption before it manifests as a Medical alert:
- Trust bounds — trust scores stay within [0.0, 1.0], no NaN/infinity
- Pool consistency — no orphaned agents, pool membership matches registry, target sizes respected
- Configuration validity — all config values pass schema validation, no missing required fields
- IntentBus coherence — routing tables have no dangling references, all subscribed agents exist
- Index consistency — CodebaseIndex entries reference files that exist on disk
- Memory integrity — episodic memory and knowledge store indexes are readable and non-corrupted
- Hebbian weight bounds — no weight explosion or collapse (weights within reasonable range)
Implementation: lightweight runtime service running on every heartbeat cycle (5s). Not an agent — a Ship's Computer function. Violations trigger Yellow Alert before damage propagates. Each check is a simple assertion, not an LLM call. SIF health percentage reportable to HXI.
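To make "each check is a simple assertion" concrete, here is a sketch of two of the invariants above and a runner that produces a health percentage. The function names, the data shapes, and the formula (share of checks with no violations) are assumptions for illustration; the roadmap does not define how the SIF percentage is computed.

```python
import math
from typing import Callable, Dict, List, Set, Tuple

# A check returns a list of violation messages; an empty list means healthy.
Check = Callable[[], List[str]]


def check_trust_bounds(trust_scores: Dict[str, float]) -> List[str]:
    """Trust scores must be finite and within [0.0, 1.0]."""
    return [
        f"trust score for {agent} is {score!r}"
        for agent, score in trust_scores.items()
        if not (math.isfinite(score) and 0.0 <= score <= 1.0)
    ]


def check_pool_membership(pools: Dict[str, List[str]], registry: Set[str]) -> List[str]:
    """Pool membership must match the registry, with no orphaned agents."""
    violations: List[str] = []
    pooled: Set[str] = set()
    for pool, members in pools.items():
        for agent in members:
            pooled.add(agent)
            if agent not in registry:
                violations.append(f"{pool} lists unregistered agent {agent}")
    for agent in sorted(registry - pooled):
        violations.append(f"agent {agent} belongs to no pool")
    return violations


def run_sif(checks: Dict[str, Check]) -> Tuple[float, List[str]]:
    """Run every check once; return (health percent, all violations)."""
    violations: List[str] = []
    passed = 0
    for name, check in checks.items():
        found = check()
        if found:
            violations.extend(f"{name}: {message}" for message in found)
        else:
            passed += 1
    percent = 100.0 * passed / len(checks) if checks else 100.0
    return percent, violations
```

A caller on the heartbeat cycle could raise Yellow Alert whenever the returned violation list is non-empty.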
Capability Tiers (Crew, Instruments, Knowledge)
ProbOS has three tiers of capability, modeled after a starship crew:
Agents (Crew) → who decides what → crew members who think and collaborate
Tools (Instruments) → what you can do → tricorder, transporter, phaser
Skills (Knowledge) → what you know → ship's library, reference data
| Tier | Star Trek Analog | ProbOS | Governance | Examples |
|---|---|---|---|---|
| Agent | Crew member (Crusher, Worf) | Intent handler with full lifecycle | Trust, Hebbian, consensus, Shapley | DiagnosticianAgent, SurgeonAgent |
| Tool | Tricorder, transporter, phaser | Typed callable function, shared across agents | Tool-level trust tracking, no per-call consensus | File read/write, HTTP fetch, API calls, MCP tools |
| Skill | Ship's library, computer database | Read-only data access attached to agents | None (internal) | codebase_knowledge, search indexes |
When to use each:
- Agent: handles a user intent, needs to decide/reason, should participate in trust and Hebbian routing
- Tool: performs a specific action, any authorized agent can use it, doesn't need consensus for each call
- Skill: provides data access internally, no behavior, read-only
Tools are the natural mapping target for MCP — external MCP tools become ProbOS tools, and ProbOS tools are exposed as MCP tools to external systems.
Tool provenance — aligning with GitHub Copilot, Claude Code, and VS Code. Beyond the functional ToolType taxonomy (AD-422, what a tool does), tools are also classified by source, matching the harness conventions developers already know: built-in tools (ship with ProbOS — file I/O, shell, codebase search, and the governed run_python / http_fetch mesh capabilities), MCP tools (from MCP servers, local or federated — AD-449/480a), and extension tools (contributed through the sealed-core extension path, AD-481, including self-designed agents and skills). This source axis is surfaced per-agent in the Service Configuration hub and ship-wide in the Ship's Locker (Agent Customizations epic #944).
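The source axis described above can be pictured as a second field on each tool record, independent of what the tool does. This is a sketch only: `ToolSource`, `ToolRecord`, and `group_by_source` are hypothetical names, and `remote_search` is an invented MCP tool used purely as sample data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class ToolSource(Enum):
    BUILT_IN = "built-in"     # ships with ProbOS
    MCP = "mcp"               # provided by an MCP server, local or federated
    EXTENSION = "extension"   # contributed through the extension path


@dataclass(frozen=True)
class ToolRecord:
    name: str
    source: ToolSource


def group_by_source(tools: List[ToolRecord]) -> Dict[ToolSource, List[str]]:
    """Group tool names by provenance, keeping every source key present."""
    grouped: Dict[ToolSource, List[str]] = {source: [] for source in ToolSource}
    for tool in tools:
        grouped[tool.source].append(tool.name)
    return grouped
```

A per-agent view and a ship-wide view could then both be built from the same records by filtering on which tools an agent has been granted.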
The Federation
"Cooperation at scale — across agents and humans together."
Each ProbOS instance is a ship. Multiple instances form a federation. But the federation extends beyond ProbOS — any capable agent, regardless of origin, can join the crew. There will always be a better agent somewhere. The strategy is cooperation, not competition: federate with the best, wherever they are.
ProbOS's value isn't any single agent's capability — it's the orchestration layer: trust network, consensus, Hebbian routing, escalation, and the human approval gate that makes diverse agents work together better than any of them alone. A single officer is skilled, but a well-run ship with a diverse crew accomplishes more. The Enterprise's strength wasn't one species — it was Vulcan logic alongside Betazoid empathy alongside Klingon tenacity alongside android precision. Different cognitive architectures, unified by trust and shared mission. ProbOS applies the same principle to AI: Claude's reasoning, GPT's generation, Copilot's search, open-source models' cost efficiency — each brings what the others lack. The trust network and consensus layer turn that diversity into strength. ProbOS is the ship that takes you to the Nooplex — human-agent cooperation at scale.
| Star Trek Concept | ProbOS Equivalent | Status |
|---|---|---|
| Starship | Single ProbOS instance | Built |
| Ship departments | Agent pools (crew teams) | Built |
| Chain of Command | Rank structure — Fleet Admiral → Captain → Bridge → Chiefs → Crew | Built (AD-398/440/477/595/674) |
| Ship's computer / LCARS | Runtime + CodebaseIndex + Knowledge Store + Cognitive Journal | Built |
| Internal sensors | Ship's Telemetry — LLM timing, token metering, pipeline comparison | Built (AD-461) |
| Alert Conditions (Red/Yellow/Green) | Ship-wide operational modes — resource/consensus/dream behavior changes | Built (AD-503/506/695) |
| EPS (Power Distribution) | Token/compute budget allocation across departments | Built (AD-469) |
| Structural Integrity Field | Proactive runtime invariant enforcement | Roadmap (#475 AD-699) |
| Multi-Level Diagnostics (L1–L5) | Formalized diagnostic depth for Medical team | Roadmap (#476 AD-700) |
| Damage Control Teams | Engineering rapid-response automated recovery | Built (AD-457) |
| Navigational Deflector | Pre-flight validation before expensive operations | Built (AD-458) |
| Saucer Separation | Graceful degradation when critical systems fail | Built (AD-459) |
| Transporter | Transporter Pattern — parallel code generation (AD-330–336) | Complete |
| Federation | Federated ProbOS instances | Built (Phase 29) |
| Visiting officers | External AI tools (Claude Code, Copilot, etc.) | Partial (MCP bridge AD-449 ✅; formal registration #477 AD-701) |
| Diplomatic relations | Trust transitivity between nodes | Roadmap (#478 AD-702) |
| Shared intelligence | Knowledge federation + Model of Models | Partial (AD-687 Knowledge Edge Store ✅; cross-instance sync AD-693 (Commercial)) |
| Prime Directive | Safety constraints, boundary rules, human gate | Built |
| Starfleet Command | Fleet Admiral (creator) — fleet-wide policy across all instances | Long Horizon (#479 AD-703) |
| Universal Translator | Channel adapters — Discord, Slack, Telegram, WhatsApp, Matrix, Teams | Partial (Discord/Slack/Webhook ✅ AD-472; remaining 4 in #480 AD-704) |
| Subspace Communications | Voice interaction — STT, TTS, wake word, continuous talk | Partial (substrate ✅ AD-474; backends in #481 AD-705) |
| PADD (Personal Access Display Device) | Mobile companion — PWA, push notifications, responsive HXI | Partial (PWA + push ✅ AD-473; responsive HXI + mDNS in #484 AD-708) |
| Browser Tool (Computer Use) | Agent-driven Chromium via Playwright — 10-action vocabulary, indexed-element state, XGA screenshots, tier-3 Captain-ACK gate | Built (#482 AD-706) ✅ |
| Holodeck Simulations | Agent training environments — scenario simulation, promotion tests, skill acquisition | Built (AD-486/510/539b) |
| MemoryForge | Ship's Computer service — implanted birth memories, memory transfer, curated memory banks | Long Horizon (#485 AD-709) |
| Cognitive Evolution | Transfer learning, proactive initiative, service modeling, trend analysis, gap prediction | Built (AD-507/509/628/660/666/668-672) |
| Workflow Templates | Reusable multi-step pipelines — cron, webhooks, workflow API | Partial (WorkflowCache ✅ AD-580; triggers in #483 AD-707) |
| Drydock | Distribution — PyPI, Docker, onboarding wizard, quickstart | Built (AD-465/484) |
| Modular Construction | Extension-first architecture — sealed core, plugin extensions, graduated autonomy | Built (AD-481) |
| Agent Plugins / Capability Packs | Cross-tool customization bundles (skills + agents + hooks + MCP servers + mesh-intent grants) in the shared Copilot / Claude plugin.json format: consume existing IDE plugins, install from Git/source, governed by consensus + trust + the Captain gate | Roadmap (#948 AD-1003); curated marketplace (Commercial) |
| Ready Room | Captain's strategic planning — idea capture, multi-agent sessions, architecture hierarchy | Built (AD-475) |
| Utopia Planitia | Specialized builders — backend, frontend, test, infra, data | Built (AD-476) |
| Captain's Yeoman | Personal AI assistant — conversational front door, crew delegation, personalization | Roadmap (#486 AD-710) |
| The Nooplex | Distributed meta-intelligence — Model of Models | Long Horizon |
Build Phases
| Phase | Title | Crew Team | Goal |
|---|---|---|---|
| 24 | Channel Integration | Comms | Discord, Slack, Telegram, WhatsApp, Matrix, Teams, webhook adapters + mobile companion (PWA), voice interaction (STT/TTS/wake word) |
| 25 | Persistent Tasks | Ops | Long-running autonomous tasks with checkpointing, browser automation (Playwright), cron scheduling, webhook triggers |
| 25b | Tool Layer | Ship's Computer | Typed callable instruments (tricorders) shared across agents, ToolRegistry, MCP mapping |
| 26 | Inter-Agent Deliberation | Bridge | Structured multi-turn agent debates, agent-to-agent messaging, interactive execution |
| 28 | Meta-Learning & Cognitive Evolution | Science | Workspace ontology, dream cycle abstractions, session context, goal management, multi-dimensional reward signals (quality/efficiency/novelty), hindsight experience replay (dream-driven failure analysis → Standing Orders amendments), emergent capability profiles (dynamic skills from demonstrated success), semantic Hebbian generalization (embedding-based routing, not string matching) |
| 29 | Federation + Emergence | Comms | Knowledge federation, trust transitivity, MCP adapter, A2A adapter, TC_N measurement |
| 29b | Medical Team | Medical | Vitals monitor, diagnostician, surgeon, pharmacist, pathologist, multi-level diagnostics (L1–L5) |
| 29c | Codebase Knowledge | Ship's Computer | Structural self-awareness — indexed source map + introspection skill |
| 30 | Self-Improvement Pipeline | All Teams | Extension-first architecture (sealed core, open extensions, graduated autonomy), capability proposals, stage contracts, QA pool, evolution store, human gate, evergreen updates |
| 31 | Security Team | Security | Formalized threat detection, prompt injection scanner, trust integrity monitoring, secrets management, runtime sandboxing, network egress policy, inference audit, data governance |
| 32 | Engineering Team | Engineering + Ship's Computer | Automated performance optimization, maintenance agents, build agents, LLM resilience, model diversity & neural routing, cognitive journal, ship's telemetry (internal performance instrumentation), observability export, CI/CD, backup/restore, storage abstraction layers, containerized deployment, confidence communication, adaptive communication style, decision audit trail, structural integrity field, damage control teams, navigational deflector, saucer separation |
| 33 | Operations Team | Ops + Bridge | Formalized resource management, workload balancing, system coordination, LLM cost tracking, ward room, priority & back-pressure, self-claiming task queue, competing hypotheses, file ownership, bridge alerts, workflow definition API, chain of command (bridge crew, department chiefs, promotion mechanics, rank structure), Ship's Counselor (cognitive wellness, Hebbian drift detection, relationship health), alert conditions (Red/Yellow/Green), EPS (token/compute distribution), earned agency (trust-tiered self-direction: Ensign→Lieutenant→Commander→Senior Officer, self-originated goals, curiosity-driven exploration, decreasing oversight with increasing trust), tournament evaluation (competitive agent selection, loser-studies-winner), memetic evolution (cross-agent knowledge transfer, successful strategies propagate through crew), the conn (temporary authority delegation, OOD protocol, scoped autonomous operation), night orders (captain-offline guidance, time-bounded directives, escalation triggers), watch bill (duty rotation, cognitive fatigue prevention, continuity handoff), external participant bridge (external tools like Claude Code as Ward Room participants — callsign, routing, chain-of-command subordination; enables architect→crew direct communication, build prompt review, code/test verification, crew learns from architect feedback via episodic memory; force multiplier for self-mod pipeline) |
| 34 | Mission Control | Bridge + Comms | Agent activity dashboard, real-time task visibility, approval panels, system health orbs, Captain's Ready Room (idea capture, multi-agent strategy sessions, architecture hierarchy, idea→spec pipeline), specialized builders (backend/frontend/test/infra/data) |
| 35 | User Experience & Adoption | All Teams | PyPI packaging, onboarding wizard, quickstart docs, probos doctor, probos demo mode, comparison docs |
Currently Pending (OSS)
For full historical context (team details, completed phases, AD descriptions for shipped work), see roadmap-era-5-completed.md.
Backlog (queued, awaiting wave-plan slot)
From the original AD backlog:
| AD | Title | Issue |
|---|---|---|
| AD-495 | Counselor Auto-Assessment on Circuit Breaker Trip | #474 |
| AD-581 | Hybrid Dispatch — Chain-of-Command Direct Tasking (parent) | #468 |
| AD-581a | DepartmentDispatcher — routing decision layer | #469 |
| AD-581b | Agent Order Protocol — accept/decline/refuse semantics | #470 |
| AD-581d | Routing Confidence Threshold | #471 |
| AD-594b | Crew Consultation Primitive: `consult(question, context)` | #472 |
| AD-594d | Delivery Pipeline — markdown→PDF, structured→reports | #473 |
From the 2026-05-08 Federation table audit:
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-699 | Structural Integrity Field (SIF) | #475 | 2 |
| AD-700 | Multi-Level Diagnostics (L1–L5) | #476 | 3 |
| AD-701 | Visiting Officers — formal external-participant Ward Room registration | #477 | 2 |
| AD-702 | Diplomatic Relations — trust-transitivity computation | #478 | 3 |
| AD-703 | Starfleet Command — fleet-wide policy distribution | #479 | 4 (Long Horizon) |
| AD-704 | Universal Translator — Telegram/WhatsApp/Matrix/Teams adapters | #480 | 3 |
| AD-705 | Voice Stack Backends — Whisper/Deepgram/Coqui/Porcupine | #481 | 3 |
| AD-707 | Workflow Triggers — cron + webhook + workflow API | #483 | 3 |
| AD-708 | PADD — responsive HXI + mDNS auto-discovery | #484 | 4 |
| AD-709 | MemoryForge — implanted birth memories | #485 | 5 (Long Horizon) |
| AD-710 | Captain's Yeoman — personal AI assistant | #486 | 4 |
Desktop Management Console (OpenClaw-pattern, applied to a multi-agent civilization):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-840 | HXI Skill Registry + per-agent ACM skill-assignment surface | #819 | 3 |
| AD-841 | Desktop management-console parity (Electron tray app expansion) | #820 | 3 |
| AD-842 | Per-agent tool grants in the ACM profile | #821 | 3 |
| AD-843 | DeviceNodeAdapter — trust/consensus-gated remote device actuation (brain→limb tier) | #822 | 3 |
| AD-844 | Mobile / Compact-Yeo device-node client (pairs to a home mesh as a device node) | (forward marker, downstream of AD-843) | 4 |
Crew Autonomy northstar — give commands by conversation or assignment, work tracked on the kanban board, agents unblock themselves by requesting grants/skills/builds (2026-06-03 decomposition). Highest committed AD at authoring: AD-839.
| AD | Title | Epic | Priority |
|---|---|---|---|
| AD-845 | Yeo creates a dispatchable task from a 1:1 chat reply (supersedes BF-599b/D2) | Yeo async task workflow | 2 |
| AD-846 | Task completion → proactive Yeo DM to the Captain (1:1 + Ward Room) | Yeo async task workflow | 2 |
| AD-847 | Desktop OS notification on completion, click opens Yeo chat | Yeo async task workflow | 3 |
| AD-848 | Kanban auto-refresh on WORK_ITEM_STATUS_CHANGED (forward marker) | Yeo async task workflow | 4 |
| AD-849 | In-HXI "Yeo is working on N tasks" ambient indicator (forward marker) | Yeo async task workflow | 4 |
| AD-850 | Captain asks Yeo task status in chat — read-back from work_item_store (forward marker) | Yeo async task workflow | 4 |
| AD-851 | Readability content extractor for PageReaderAgent (replace regex strip) | Web research depth | 2 |
| AD-852 | Web-research tool loop — AgenticLoop driven over mesh web intents w/ provenance | Web research depth | 2 |
| AD-853 | Unified CapabilityRequest model + single approval queue (generalizes vision/tool/build/extension proposals) | Crew self-unblock | 1 |
| AD-854 | Acquire-vs-build triage router — grant → install skill → build, mapped to the three governance axioms | Crew self-unblock | 1 |
| AD-855 | BLOCKED → request → approve → resume work-item loop driver | Crew self-unblock | 1 |
| AD-856 | AgenticLoop as the executor for dispatchable work items (bridge) | Crew self-unblock | 1 |
| AD-857 | Capability-request HXI/chat decision surface (alert-driven, dual-surface approve) | Crew self-unblock | 2 |
| AD-858 | LLM-driven semantic plan decomposer (single goal → sub-task DAG, plugs into PlanDecomposer) | Crew collaboration | 1 |
| AD-859 | Crew fan-out executor — each subtask runs the AD-856 AgenticLoop, results collected with persistent-agent provenance | Crew collaboration | 1 |
| AD-860 | Adversarial verification + convergence gate — independent semantic verifier refutes until results converge | Crew collaboration | 1 |
| AD-861 | Result synthesis + Shapley attribution → parent completion (records trust + episode for the collaboration) | Crew collaboration | 1 |
| AD-862 | HXI crew-collaboration surface — live fan-out/verify/converge on the canvas (forward marker, dual-surface) | Crew collaboration | 3 |
Crew collaboration northstar — one hard goal → decompose → fan out across persistent trusted crew → adversarial-verify until converge → synthesize one completion with per-agent attribution (2026-06-03). Depends on AD-856; reuses ParallelDispatcher (AD-594c), RedTeamAgent pattern, compute_shapley_values. Spec: prompts/ad-858-crew-collaboration-task-completion.md. Highest reserved AD at authoring: AD-857.
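Per-agent attribution in the synthesis step relies on Shapley values. The sketch below is an exact, permutation-based calculation for small crews. It is not the project's `compute_shapley_values`; `shapley_values` and the `value` callback (which scores any coalition of agents) are hypothetical, and the cost grows factorially with crew size.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    players: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Average each player's marginal contribution over all join orders."""
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for player in order:
            joined = coalition | {player}
            # Marginal gain from adding this player to the current coalition.
            totals[player] += value(joined) - value(coalition)
            coalition = joined
    return {player: total / len(orderings) for player, total in totals.items()}
```

If a task only succeeds when two particular agents both take part, those two split the credit equally and a third agent who adds nothing receives zero, while the attributions still sum to the value of the full crew.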
Ad-hoc crew collaboration (group chat → meeting) northstar — start a 1:1, add crew, name the room, share files; crew spin up their own named group chats when collaborating on a task; the Captain sees and joins them; then transition a chat into a live "meeting" with per-agent voice and a 3D avatar gallery (a group video call). Microsoft-Teams group-chat-and-meeting semantics, not a Reddit forum (the Ward Room already covers the public/browsable/endorsable surface) (2026-06-07). Conversational counterpart to the Crew-collaboration task spine (AD-858–862) — cross-links via the existing chat_threads.task_id. Reuses the routers/chat.py @-mention fan-out, content-addressable AttachmentStore (AD-720/731, refs-not-blobs), thread-priority (AD-641c), convergence (AD-583), and the @-picker (AD-719c). RULED (Captain, 2026-06-07): substrate is ChatThreadStore (AD-791, participants first-class) — DM/huddle-style, NOT the Ward Room forum — advancing the AD-574c-i one-conversation-store direction. Folds in the dormant AD-719a-wire (persist + cross-agent prompt injection) and lifts the AD-719a-2 deferral (agent-to-agent without a Captain seed). Highest committed AD at authoring: AD-912.
Phase 1 — group chat (text):
| AD | Title | Epic | Priority |
|---|---|---|---|
| AD-913 | Chat-thread participant management — add_participant/remove_participant on ChatThreadStore + POST/DELETE /api/threads/{id}/participants (foundation: "add crew to a 1:1" AND "Captain joins") | Ad-hoc group chat | 1 |
| AD-914 | Group-chat fan-out + cross-agent visibility — a ≥2-agent thread fans a Captain turn to all participants, injects recent thread history into each agent's prompt (so they see each other), persists replies as chat_thread_messages (the ChatThreadStore form of AD-719a-wire) — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 1 |
| AD-915 | Turn-taking facilitator — relevance-ranked speaking order + convergence-gated termination (NEW pure ChatFacilitator mirroring AD-641c's pattern + the AD-583/AD-614 Jaccard primitive) for crew-scale flood control; the shared sequencer for text AND meeting voice — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 1 |
| AD-916 | File sharing in chat — attachment refs on chat_thread_messages via AttachmentStore (AD-731 refs-not-blobs); vision-capable agents receive image refs — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 2 |
| AD-917 | UI group-chat experience — rename affordance, add-participant @-picker (reuse AD-719c), participant avatars, file attach/drop (reuse the Ward Room attach surface) — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 2 |
| AD-918 | Agent-initiated group chats — a create_group_chat intent so crew open + name a chat and add collaborators while working a task (links via chat_threads.task_id); tagged metadata.created_by_agent (lifts the AD-719a-2 deferral) — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 2 |
| AD-919 | Group-chat visibility surface + join — a focused GroupChatListPanel (TopNav toggle, NotebooksPanel precedent — NOT the full LeftRail wire, which stays forward marker AD-719b-parent-wire) listing agent-created + Captain group chats, badging agent-created ones, with a Join button (→ AD-913 add_participant("captain")) and open-on-click via threadIdByAgent + openAgentProfile; frontend-only (no backend change) — SHIPPED 2026-06-07 gate-verified | Ad-hoc group chat | 3 |
| AD-931 | Unified CHATS surface — repurpose (move + rename) the AD-919 GroupChatListPanel into one Teams/Slack-style CHATS panel (chats/ChatsPanel.tsx) listing BOTH 1:1 and group conversations, with a prominent "+ New chat" picker; closes the "no way to start a chat" gap. — SHIPPED 2026-06-08 gate-verified (a new pure isChat widens isGroupChat → !task_id && (isAgentCreated || crewParticipantIds >= 1), admitting the get_or_create_default_for_agent 1:1 default threads — GET /api/threads has no participant filter, so frontend-only, no backend — while excluding AD-925 task rooms (task_id); 1:1 rows show a single avatar + callsign + no Join, group rows keep the AD-919 multi-avatar/badge/Join treatment, both open via the unchanged setThreadForAgent(host) + openAgentProfile(host) pattern. NewChatModal reuses the AD-917 AddParticipantPopover (accumulate onAdd→selected[], feed back as existingParticipantIds, removable chips, Start disabled until ≥1) and branches on count: 1 agent → openAgentProfile (NO createThread — server owns the 1:1 default, avoiding a divergent duplicate); 2+ → createThread({title, participants}) + open host (honest-degrade null → modal stays open). Store flag groupChatListOpen→chatsOpen/openChats/closeChats; TopNav GROUP CHATS (group-chats-toggle)→CHATS (chats-toggle); LeftRail.tsx untouched. NO Ward Room DM convergence — the two DM systems (Ward Room chain pipeline vs 1:1 Chats one-shot pipeline) stay a deliberate A/B test (AD-574c-i untouched). Supersedes the 10-test AD-919 panel test (deleted). +25 Vitest (net ≈ +15), build clean. Forward marker: AD-932 discoverable in-chat add-people on an empty 1:1) |
Ad-hoc group chat | 3 |
| AD-932 | Discoverable "+ Add people" on an empty/fresh 1:1 chat — a small EmptyChatAddPeople affordance mounted by ProfileChatTab only when !activeThreadId (the empty-1:1 case before the AD-917 GroupChatHeader exists), closing the "no way to add a 2nd crew member on a fresh 1:1" gap. — SHIPPED 2026-06-08 gate-verified (the AD-917 header + its AddParticipantPopover mount only once a thread exists, but a brand-new 1:1 has no thread until the first message is sent → no header → no add control; EmptyChatAddPeople self-gates on agent.isCrew and on click materializes the 1:1 thread via createThread({title: callsign, participants: [agentId]}) then setChatThread(t) + setThreadForAgent(agentId, t.id) so activeThreadId resolves and the existing GroupChatHeader (+ its picker) takes over for the actual add — a materialize-then-hand-off, mutually exclusive with the header and disjoint from the AD-929 Files rail (showWorkspaceFiles is false when !activeThreadId). Honest-degrade: createThread null → no store write, button stays. Independent of AD-931 (builds on AD-917/threadApi); NO GroupChatHeader change, NO backend, NO Ward Room, NO Glyphs.tsx exports (reuse UserPlus), NO LeftRail wire. +8 Vitest (EmptyChatAddPeople.test.tsx, component-isolation per the groupsend/bf294b precedent — no heavy ProfileChatTab render); full Vitest 1270 passed / 1 skipped (215 files; baseline 1262 + 8, zero regressions); npm run build clean. Forward marker: AD-932a auto-open the picker on materialize) | Ad-hoc group chat | 3 |
| AD-933 | Group-chat escalation ladder — wire the AD-726 one-shot post-LLM escalation subset into the AD-914 group-chat fan-out so a group reply can fire the channel-agnostic steps (inline mesh read, [ACTION], notebook, artifacts, [CREATE_TASK]), not just a Tier-1 reply. — SHIPPED 2026-06-08 gate-verified (the 1:1 /chat path and the group fan-out are structurally identical up through runtime.intent_bus.send, but thread_fanout._send_one took result.result RAW and never built the pipeline → group chat was Tier-1-only: no inline mesh read, no [ACTION], no [CREATE_TASK]. AD-933 refactors DmReplyPipeline.run()'s 17-step tuple into _full_steps() + _run_steps(steps) (run() byte-identical: same 17 steps, same order, same per-step Tier-2 guard) and adds _escalation_steps() + public run_escalation_only() = the 5 channel-agnostic steps (4e action-dispatch, 4i notebook, 4h mesh-read, 4f artifacts, 4g create-task) in run()-order; _send_one resolves dm_sanity_gate once before the gather and, only when a real result.result came back AND the agent resolved, runs run_escalation_only() then persists the (possibly mutated) reply — Tier-2 honest-degrade ships the raw reply on any failure. The full pipeline is rejected for group because step_5_episodic_store hardcodes session_type:"1:1" → firing it on a multi-agent turn writes mislabeled episodes (a learning-loop regression); the excluded 12 steps are all 1:1/avatar-scoped. NO facilitator / IntentMessage / vision / DmReplyContext / step-body / AD-632-chain / AD-925 task-room / Ward Room change. +8 pytest (test_ad933_group_chat_escalation.py, BF-287 real WorkItemStore+ChatThreadStore+IntentBus+dm_sanity_gate); focused gate 8, blast-radius -k "thread or chat or fanout or dm or reply or pipeline or create_task or escalat" 1164 passed / 1 skipped (the test_bf296 inspect.getsource(run) structural guard re-pointed at _full_steps). 
Forward markers: AD-933a group-anchored episodic write, AD-933b richer subset, AD-934 in-chat [THINK]/[DELIBERATE] via the dormant AD-632 escape hatch) |
Ad-hoc group chat | 3 |
| AD-933a | Group-anchored episodic write for the fan-out — every group-chat fan-out reply now persists one channel="chat" / session_type:"group" episode tagged with the thread id, closing the learning-loop hole AD-933 left open. — SHIPPED 2026-06-08 gate-verified (the fan-out sends direct_message with params["from"]="hxi_profile", which the agent safety-net _store_action_episode skips — deferring to the pipeline's step_5_episodic_store — and AD-933 deliberately excluded step_5 from the group subset because it hardcodes session_type:"1:1"/channel:"dm" and would mislabel a multi-agent turn; net result: group replies wrote no episode at all → no episodic recall, no dreaming consolidation, no wellness analysis. AD-933a adds a Tier-2 episodic-write loop at the END of group_chat_fanout, mirroring the AD-719 @-mention fan-out (routers/chat.py) but group-anchored: episodic_memory = getattr(runtime, "episodic_memory", None) (skip if None), participants resolved once as ["captain"] + [callsign-or-id per crew speaker], then per reply — skipping sentinels ((no response) / (delivery failed) / empty) — build the episode (runtime.dream_adapter.build_episode(...) when present, else a direct Episode(...) with AnchorFrame(channel="chat", trigger_type="group_fanout", participants=…, chat_thread_id=thread_id), source="group_chat_fanout", outcomes=[{"intent":"direct_message","success":True,"session_type":"group",…}]) and await episodic_memory.store(episode) inside a try/except that logs and continues — never raises, so the fan-out always returns all replies. NO change to _send_one's return shape, the DmReplyPipeline, _store_action_episode, step_5, the 1:1 path, the facilitator, IntentMessage, or the Ward Room — additive, group-only. 
+6 pytest (test_ad933a_group_episode.py, BF-287 real ChatThreadStore+IntentBus+a real-but-fake recording episodic_memory, no dream_adapter so the group-anchored fallback runs); focused gate 6, blast-radius -k "thread or chat or fanout or episod or dream" 1285 passed / 1 skipped (the AD-933 test_group_path_writes_no_1to1_episode obsolete-contract guard re-pointed from "no episode at all" to "group episodes written, none 1:1-labelled"). Forward markers: AD-933b richer escalation subset, AD-934 in-chat [THINK]/[DELIBERATE]) |
Ad-hoc group chat | 3 |
| AD-933b | Group-chat image generation — add step_4c_image_gen_parse (AD-730-3 [GEN_IMAGE prompt]) to the AD-933 channel-agnostic escalation subset so a crew agent can generate an image inside a room, and surface the generated SHA refs onto the persisted group message metadata. — SHIPPED 2026-06-08 gate-verified (step_4c was already 1:1-safe and channel-agnostic — it reads ctx.sanity_gate + ctx.runtime.config.avatars.image_gen_*, calls dispatch_image_gen(runtime, agent_id, prompt), appends the SHA to ctx.generated_attachment_ids, and honest-degrades (marker stripped, operator message appended) when the tier is disabled — but the group fan-out returned {agent_id, callsign, text} and never called build_response(), so a group-generated image was created + stored yet invisible. Two changes: (1) reply_pipeline._escalation_steps() adds self.step_4c_image_gen_parse in run()-order (4c precedes 4e), so run_escalation_only() now runs 6 steps; run()/_full_steps() stay byte-identical (4c was already in _full_steps, so the 1:1 path is unaffected). (2) thread_fanout._send_one captures generated_ids = list(pipeline.ctx.generated_attachment_ids or []) from the SAME ctx the AD-933 escalation already built and, when non-empty, extends the append_message metadata {intent_id, fanout:"ad914"} with generated_attachment_ids (AD-916 ref carriage) — Tier-2: an empty/failed escalation leaves the metadata byte-identical to the AD-914 baseline. step_4d_follow_up deliberately NOT added (its conversation_pacing_scheduler re-injects a synthesized user-turn — ambiguous target in a multi-agent room: forward marker AD-933b-2); UI rendering of group-generated images is forward marker AD-933b-3. NO change to build_response, the 1:1 path, DmReplyContext, IntentMessage, the facilitator, the AD-933a episodic write, or the Ward Room — additive, group-only. 
+5 pytest (test_ad933b_group_image_gen.py, BF-287 real ChatThreadStore+IntentBus+DmSanityGate, dispatch_image_gen monkeypatched at its source module probos.cognitive.image_gen_dispatch); focused gate 5 passed, blast-radius -k "thread or chat or fanout or reply or pipeline or image or escalat" 860 passed / 4 failed / 1 skipped (the 4 fails are the pre-existing test_skill_agent.py::TestSkillPipeline serial full-suite isolation flakes — Event loop is closed cross-test pollution, green 7/7 in isolation, unrelated to AD-933b; the AD-933 test_run_escalation_only_invokes_only_the_subset obsolete-contract guard re-pointed to the 6-step subset). Forward markers: AD-933b-2 in-chat [FOLLOW_UP], AD-933b-3 UI render of group-generated images, AD-934 in-chat [THINK]/[DELIBERATE]) |
Ad-hoc group chat | 3 |
| AD-934 | In-chat [THINK]/[DELIBERATE] deep-tier re-roll — a flag-gated post-LLM pipeline step lets a crew agent flag a turn for deeper reasoning; when config.dm_deliberate.enabled (default OFF) the step makes ONE deep-tier pass that reconsiders + improves the agent's own draft, replacing the reply (marker always stripped, never leaks). — SHIPPED 2026-06-08 gate-verified (Option C, chosen over the AD-632 chain A/B: the agentic loop already serves deep tool-work, so this is the contained, tool-free inline-deliberation increment. Four additive changes: (1) config.DmDeliberateConfig (enabled=False/tier="deep"/max_tokens=800) on SystemConfig after dm_targeted_lookup; (2) dm_sanity_gate _DELIBERATE_RE/_DELIBERATE_STRIP_RE + extract_deliberate/strip_deliberate (mirror the AD-845 [CREATE_TASK] parse/strip); (3) reply_pipeline.step_4j_deliberate_parse — flag-gated, always strips, deep-tier re-roll via llm_client.complete(LLMRequest(...)), Tier-2 honest-degrade keeps the draft on missing-client/empty/raise, NEVER raises — registered in _full_steps() between 4g and 5 (run()=18) and appended to _escalation_steps() after 4g (subset=7); (4) cognitive_agent._conversational_deliberate_protocol teaches [THINK] only when the flag is ON (gap-regex-safe). Default OFF → zero behavior change, zero-config boot byte-identical. NO Option A/B (no AD-632 chain wiring, no _pending_sub_task_chain, no chain templates), NO DmSanityGateConfig / existing-17-step / DmReplyContext / build_response / AD-933a-episodic / AD-933b-ref / facilitator / Ward Room change. +14 pytest (test_ad934_deliberate.py, BF-287 real DmSanityGate+DmDeliberateConfig+a real-but-fake _FakeLLMClient); two obsolete-contract step-membership tests updated (AD-933 17→18 + 6→7, AD-933b likewise). Focused gate 14, blast-radius -k "dm or reply or pipeline or sanity or deliberate or escalat or cognitive_agent or config" 1376 passed / 1 skipped. 
Forward markers: AD-934a richer teaching + [THINK reason] focus hint, AD-934b HXI "deliberated" indicator) |
Ad-hoc group chat | 3 |
| AD-935 | Agent-to-agent group-chat reactivity (bounded synchronous cascade) — after the Captain round, an agent reply fans to the OTHER crew for up to max_agent_rounds extra rounds so agents react to each other in real time, not only when the Captain posts. — SHIPPED 2026-06-08 gate-verified (the AD-914 fan-out fired ONLY on the Captain turn — an agent question to a peer got no reply until the Captain posted again. Chosen design: BOUNDED SYNCHRONOUS cascade, NOT async — the chat transcript has no live-refresh (no poll, no WS case for chat), so a fire-and-forget cascade would be invisible AND harder to bound; instead the cascade runs synchronously inside the Captain's POST and returns ALL replies across rounds in the existing per_agent_replies (the UI already renders it in order). thread_fanout.group_chat_fanout's single-round core (facilitate → gather(_send_one) → AD-933a episode write) is extracted into _fan_one_round(...); group_chat_fanout becomes round 0 (Captain) + a bounded loop fanning each prior round's agent messages to the OTHER crew (excluding that round's speakers). Guard set: round cap (new GroupChatConfig.max_agent_rounds=2), the AD-915 ChatFacilitator convergence gate (empty speaking_order once the exchange converges — the semantic terminator), [NO_RESPONSE] declines (new in _send_one + a cognitive_agent._conversational_group_chat_protocol teaching hook gated on params["is_group_chat"], gap-regex-safe — also fixes the round-0 literal-[NO_RESPONSE] bug), exclude-prior-speakers, and @-mention bypass. Flag group_chat.agent_reactivity_enabled default False (config.py) + true in system.yaml (AD-925 pattern). Round 0 is byte-identical to AD-914 when OFF (the before=captain_msg.created_at cutoff keeps the Captain turn out of history). 
NO async/fire-and-forget cascade (forward AD-935a), NO chat WebSocket/poll, NO reply-shape / direct_message / DmReplyPipeline / AD-934-step / AD-933b-ref / IntentMessage / Ward Room change, NO cascade from agent-initiated posts (AD-935b), NO trust/Hebbian from a2a (AD-935c). +13 pytest (test_ad935_group_reactivity.py, BF-287 real ChatThreadStore+IntentBus+scripted direct_message handlers + real GroupChatConfig + recording episodic_memory); existing AD-914/915/916/933/933a/933b fan-out tests stay green (72 passed — byte-identical, no obsolete-contract updates needed). Focused gate 13, blast-radius -k "thread or chat or fanout or facilitat or convergence or reply or pipeline or group or cognitive_agent" 1025 passed / 1 skipped (the 4 test_skill_agent.py::TestSkillPipeline fails are the pre-existing serial-isolation flakes — green 26/26 in isolation). Forward markers: AD-935a async/streaming once a live-refresh exists, AD-935b cascade on agent-initiated posts, AD-935c trust/Hebbian from a2a) |
Ad-hoc group chat | 3 |
| AD-936 | Per-message chat metadata — author avatar (AgentAvatarBadge) + callsign label + dim local HH:MM timestamp above each transcript bubble, Teams/Slack/Discord-style; group replies thread the author's agent_id/callsign so each shows its own avatar. — SHIPPED 2026-06-08 gate-verified (frontend only, additive. Timestamp was pure render — AgentProfileMessage.timestamp already existed (types.ts:320); avatars needed author identity: AgentProfileMessage gains optional authorId?/callsign?, useStore.addAgentMessage gains an optional 4th opts param (conditional-spread, ~10 call sites unchanged), and the ProfileChatTab group-reply add threads {authorId,callsign} + drops the callsign: text prefix. Heavy bubble JSX + a pure formatChatTime helper extracted into a new presentational ChatMessageRow (groupsend/bf294b precedent — ProfileChatTab's audio/screen deps make a full jsdom render impractical); avatar department via the ChatsPanel.deptOf runtime cast + '' fallback; host-avatar fallback for authorId-less messages; user/system = timestamp, no avatar (dim-italic system style preserved). NO backend/REST/pytest, NO WebSocket/poll (AD-935a), NO Glyphs.tsx/AgentAvatarBadge API change, NO date-separator headers (AD-936a). +8 Vitest (ProfileChatTab.metadata.test.tsx, BF-287 real store via useStore.setState); full UI suite 1278 passed / 1 skipped (216 files, +8 over the ~1270 baseline, zero regressions), npm run build clean. Forward marker: AD-936a date-separator headers) | Ad-hoc group chat | 3 |
| AD-937 | Non-destructive 1:1→group transition (fixes the unreachable-1:1 regression) — adding a person to a 1:1 no longer destroys it: the add-people flow branches on thread shape (1:1 → a SEPARATE new group via createThread; group → addParticipant), and a dedicated activeProfileThreadId override addresses groups without clobbering the host's single threadIdByAgent 1:1 slot. — SHIPPED 2026-06-08 gate-verified (frontend only. TWO-layer bug: Layer 1 — GroupChatHeader.handleAdd→addParticipant→backend add_participant did an in-place RMW that rewrote the 1:1 row INTO a group; Layer 2 — group-open did setThreadForAgent(host, groupId) (ChatsPanel + NewChatModal 2+), clobbering the host's single threadIdByAgent slot so reopening the profile resolved to the group. Fix Part A: NewChatModal gains seedParticipantId? (locked, non-removable host chip); the 1:1 add control (GroupChatHeader ≤1-crew branch + the rewired AD-932 EmptyChatAddPeople) opens the seeded modal so confirming with 2+ mints a SEPARATE group — the 1:1 row is never mutated; the ≥2-crew group path keeps addParticipant. Fix Part B: useStore gains activeProfileThreadId: string|null + openGroupChatThread(host, threadId) (sets the override, NOT threadIdByAgent); openAgentProfile ALSO clears the override; a new pure resolveProfileThreadId(prop, override, map, id) resolves prop > override > per-agent 1:1 at all 3 ProfileChatTab sites; ChatsPanel.handleOpen + NewChatModal 2+ open via the override. Net: groups are addressed by the override; the agent's threadIdByAgent slot is reserved for its 1:1; a roster open clears the override → shows the (un-mutated) 1:1. NO backend / add_participant change (is-default guard = forward marker AD-937a), NO explicit group-name input (NewChatModal auto-titles; AD-937b), NO AD-933/933a/933b/934/935 pipeline / facilitator / Ward Room / IntentMessage / Glyphs.tsx touch, NO removal of add-to-existing-group. 
+11 Vitest (ad937ThreadResolution.test.ts + seeded-modal/override updates across NewChatModal/ChatsPanel/EmptyChatAddPeople/GroupChatHeader; obsolete setThreadForAgent/materialize-then-mutate contracts updated); full UI suite 1289 passed / 1 skipped (217 files, +11 over the 1278 baseline, zero regressions), npm run build clean. No backend change → no pytest. Forward markers: AD-937a backend is-default add_participant guard, AD-937b explicit group-name input) |
Ad-hoc group chat | 3 |
| AD-938 | Group chat shows the real thread transcript (hydrate-on-open + thread-keyed messages) — opening a group from CHATS now renders the GroupChatHeader (participants + meeting toggle) AND the group's actual message transcript with per-author avatars + HH:MM timestamps, instead of an empty 1:1-looking view. — SHIPPED 2026-06-09 gate-verified (frontend only. TWO root causes: (1) the opened group thread was never hydrated into chatThreads — ChatsPanel.handleOpen (+ NewChatModal 2+) called openGroupChatThread but not setChatThread, so GroupChatHeader (if (!thread) return null), the meetingActive selector, and MeetingView all broke; (2) the transcript rendered the per-AGENT buffer (agentConversations.get(agentId)) and only fetched the agent's 1:1 /chat/history, so a group always showed the host's 1:1. Fix Part 1 (hydrate-on-open): ChatsPanel.handleOpen + NewChatModal 2+ now call setChatThread(thread) (already in the store) before openGroupChatThread. Fix Part 2 (thread-keyed transcript): threadApi.listMessages(id) wraps GET /api/threads/{id}/messages (Tier-2 []); useStore gains a threadMessages: Map<string, AgentProfileMessage[]> slice + setThreadMessages/appendThreadMessage (200-cap); ProfileChatTab adds a load-on-open useEffect([activeThreadId]) that maps DTOs via threadDtoToMessage (captain→user, agent→agent+authorId+callsign, else→system), switches the displayed source via selectTranscriptMessages (thread→thread messages; cold 1:1→per-agent buffer), and reconciles the send path (Captain + per-author replies appended to appendThreadMessage, no callsign: prefix). The 3 pure helpers live in a new audio-dep-free profileTranscript.ts (AD-936 ChatMessageRow precedent). agentConversations untouched. 
NO backend/REST/pytest (endpoint exists), NO WebSocket/poll (AD-935a), NO Glyphs.tsx/AgentAvatarBadge change, NO agentConversations removal, NO AD-933/934/935 pipeline / facilitator / Ward Room / IntentMessage / MeetingView-Captain (AD-939) / draggable (AD-940) / Playwright (AD-941) touch. +28 Vitest (threadApi.listMessages, store/threadMessages, ProfileChatTab.threadTranscript, ChatsPanel.hydrate, NewChatModal.hydrate — BF-287 real store, mocked threadApi); full UI suite 1317 passed / 1 skipped (222 files, +28 over the 1289 baseline, zero regressions), npm run build clean. No backend change → no pytest. Forward markers: AD-935a live WS/poll push, AD-938a scrollback pagination) |
Ad-hoc group chat | 3 |
| AD-940 | Draggable CHATS panel — the floating CHATS panel can be moved by its header so it never traps an open chat window underneath. — SHIPPED 2026-06-09 gate-verified (frontend only; useStore gains chatsPanelPos {x,y} + setChatsPanelPos mirroring profilePanelPos, init {60,60} = the prior fixed on-screen origin so nothing jumps; ChatsPanel root reads the pos and the CHATS header is a cursor:move drag handle wired with the GamePanel.startDrag pattern (capture offset, window mousemove/mouseup, cleanup), the New-chat / Close controls stopPropagation so a click never drags, and the old bottom:60 pin becomes maxHeight:calc(100vh-120px) so the list stays bounded-scroll while free to move. Kept position:fixed (the GamePanel/AgentProfilePanel precedent + the viewport-coordinate drag math) over the prompt's absolute. NO change to the AD-938 CHATS list/filter/open logic, no resize (forward AD-940a). +5 Vitest (ChatsPanel.drag.test.tsx); full UI suite 1328 passed / 1 skipped (224 files, +11 over the 1317 baseline, zero regressions), npm run build clean. No backend → no pytest) | Ad-hoc group chat | 3 |
| AD-941 | Playwright e2e regression harness for the chat-collaboration UX — four specs encode the four Captain-reported regressions the mocked-store Vitest suite missed (a group opens as a GROUP with its real transcript / Add-people POSTs BOTH crew / Start-meeting renders the captain + crew slots / the CHATS panel drags). — SHIPPED 2026-06-09 gate-verified (test infrastructure only, no app behavior change, zero data-testid additions — every asserted testid already existed. Deterministic + backend-free: each spec runs against vite dev (:5173), seeds the store through the DEV-only window.__store seam (useStore.ts:2654, Maps rebuilt INSIDE page.evaluate from plain arrays), and mocks the REST surface with one page.route('**/api/**'); @playwright/test devDep + test:e2e/test:e2e:ui scripts + playwright.config.ts (testDir ./e2e, webServer npm run dev). Three test-infra design calls: (1) the catch-all aborts unmatched /api requests rather than fulfilling {}/[] — a wrong-shaped body crashes the full-App mount (VisionBudgetBadge reads calls_today.vision; the dm sidebar .maps /api/wardroom/dms), whereas abort = "no backend" so each component takes its Tier-2 honest-degrade path; (2) gotoApp pre-dismisses the first-run WelcomeOverlay (inset:0; zIndex:100 click-catcher) via hxi_seen_intro localStorage in an addInitScript + setShowIntro(false); (3) meeting-avatars opens its group directly via the store (openGroupChat helper) so the still-open CHATS panel can't overlap the meeting toggle — group-chat-open still drives the panel row-click open path. Kept OUT of Vitest: vitest.config.ts exclude [...configDefaults.exclude, 'e2e/**'] (the default glob would otherwise run the *.spec.ts); e2e/ is outside tsconfig include:["src"] so npm run build (tsc -b) is unaffected. .gitignore += ui/test-results/ ui/playwright-report/ ui/.playwright/. npx playwright test 4 passed; npx vitest run 1328 passed / 1 skipped (unchanged — e2e excluded); npm run build clean. 
Why e2e: the mocked-store Vitest gap let AD-917/931/937 ship with the runtime bugs AD-937/938/939/940 then fixed. No backend → no pytest) |
Ad-hoc group chat | 3 |
| AD-942 | Clean Teams-style chat names — the CHATS list + GroupChatHeader now name every chat by its PARTICIPANTS (crew callsigns joined), not by message content; agent-initiated chats (AD-918/924) that landed a conversational phrase in thread.title ("Hello Yeo", "Thanks, let me know…") no longer read as noise. — SHIPPED 2026-06-09 gate-verified (frontend only; new pure chatDisplayName(thread, agents) in chatFilters.ts = crew callsigns joined by ", ", UNLESS the Captain explicitly renamed the room (metadata.title_locked via the AD-917 rename / AD-794 set_title(lock=True)), in which case the deliberate title wins; honest-degrade to the stored title then "Chat" when no crew resolves. Wired into both ChatsPanel row titles (1:1 + group) and the GroupChatHeader displayed title + rename pre-fill. NO backend (titles are stored as-is; this is a display convention), no thread.title write change, no AD-938/939/940 touch. +5 Vitest (chatFilters.test.ts); full UI suite 1333 passed / 1 skipped (+5 over the 1328 baseline, zero regressions), npm run build clean, live-verified in the HXI. Forward marker: AD-942a (a subtle last-message preview line under the name, Teams-style)) | Ad-hoc group chat | 3 |
| AD-943 | Command-Station model + registry (Bridge as Ship's Computer, foundation) — generalize the Bridge's ad-hoc BridgeSection composition into a typed, reusable COMMAND-STATION taxonomy + registry and migrate the 3 existing sections through it (a station = a menu group for one area of the ship, holding launch actions[] + inline config[]). — SHIPPED 2026-06-09 gate-verified (frontend only, additive — NO nav removal: the AD-944 top toolbar / AD-945 bottom-right cluster / AD-946 omnibox are untouched. New ui/src/components/bridge/stations.tsx = the typed CommandStation descriptor ({id, title, accent, defaultOpen, count?, onExpand?, body?, actions[], config[]}), the 6-station taxonomy STATION_META/STATION_ORDER (communications, personnel, science, operations, engineering, command), the buildBridgeStations(ctx) factory, and isPopulated. The 3 sections migrate through the registry: Communications (#b080d0), Work Board → Operations (#d0a030, mainViewer:'work'), System → Engineering (#70a0d0, mainViewer:'system') — accents + onExpand setState targets preserved byte-for-byte, the bodies (services list / THREADS + comms admin / kanban) imported + reused verbatim, only the two headers renamed; personnel/science/command are modelled placeholders (empty actions/config, no body) hidden by isPopulated until AD-944/945/946 fill them. Attention/Active/Notifications/Recent + the Shutdown footer stay a SEPARATE hand-written activity-feed layer (NOT stations — HXI #9; no stationId; the empty-state \"No activity\" drops its kanban term since the Work Board is now an always-rendered station). BridgeSection gains an optional stationId → a data-station attribute + a 2px accent left-edge glanceably distinguishing the command-station layer from the activity-feed layer (HXI #6, accent tokens reused, no new color); stroke-SVG glyphs only, NO emoji (HXI #3); the Ship's Computer is NOT an agent (AD-398); actions: [] everywhere (no open* wiring yet — AD-944). 
+6 Vitest (bridge/__tests__/stations.test.tsx — pure model cases for the taxonomy/factory/placeholders, a descriptor-holds-a-future-launch case proving the slot is real + callable, and one BF-287 real-store <BridgePanel> render asserting the 3 migrated headers + the data-station hooks + the activity feed + the Shutdown footer + no-emoji; real store via setState, global.fetch stubbed empty); full UI suite 1339 passed / 1 skipped (225 files, +6 over the 1333 baseline, zero regressions), npm run build clean (tsc -b + vite). No backend change → no pytest. Forward marker: AD-943a (the deep geometric Ship's-Computer visual pass)) |
Bridge as Ship's Computer (#873 / #878) | 3 |
| AD-944 | Retire the top "HXI panels" toolbar — migrate its nine launches into Bridge command stations + render the AD-943-modelled StationActions (the FIRST visible change of the epic; the 9-button launcher is gone, those destinations now open from Bridge stations). — SHIPPED 2026-06-09 gate-verified (frontend only; AD-943 modelled the StationAction slot but BridgePanel rendered only body+config — AD-944 adds a co-located StationActionRow (stroke-SVG ChevronRight, uppercase-mono label in the station accent, optional amber unread pill, NO emoji — HXI #3; data-testid={action.id} mirrors the retired toolbar testIds) + a {st.actions.map(...)} render block, and the factory ctx gains totalUnread (from wardRoomUnread). The 9 launches map onto 4 stations (onInvoke via getState(), async void-wrapped): communications → Ward Room (count=totalUnread) + Chats; personnel → Crew, Personnel, Metrics; science → Notebooks, Records, Explorer; command → Settings (the lone useSettingsStore launch). personnel/science/command were AD-943 placeholders → now populated → render (isPopulated unchanged). The Ward-Room dual-affordance (header Expand + labelled row) is intentional — the row carries the live badge. The <div role="toolbar" aria-label="HXI panels"> block in App.tsx (NavButtonProps/NavButton/NavSeparator/TopNav, all dead after removal) is DELETED + replaced by a marker comment; the <TopNav /> mount becomes a position:fixed top-left wrapper re-homing CommercialOverlayBadge (invisible in the default OSS build, must stay mounted). No Bridge open/close side effect; no defaultOpen flip (HXI #5). The 3 AD-943 placeholder/factory cases were rewritten (factory gained totalUnread; empty→populated; the descriptor case invokes the pushed action; comms asserts the 2 launches + the badge) + a new BF-287 real-store <BridgePanel> render case (click-expand each station, resolve the migrated testIds, flip chatsOpen via the sync chats-toggle, no-emoji). NO backend/REST/pytest. +1 Vitest net; full UI suite 1340 passed / 1 skipped (225 files, +1 over the 1339 baseline, zero regressions), npm run build clean (tsc -b + vite), npx playwright test 4 passed (the AD-941 specs open panels via window.__store, never the toolbar → unaffected). Siblings (not built): AD-945 (bottom-right toggle cluster → station config), AD-946 (omnibox command palette)) | Bridge as Ship's Computer (#873 / #879) | 3 |
| AD-945 | Fold the bottom-right environment toggles into the Engineering station, remove the cluster — the four canvas-overlay toggles (ambient sound / voice output / wake-word / visual legend) relocate out of DecisionSurface into ONE Engineering config ({id:'environment', label:'Environment', render:() => <BridgeEnvironment/>}); the bottom-left status strip (telemetry) + the legend overlay STAY. — SHIPPED 2026-06-09 gate-verified (frontend only, a RELOCATION — the audio/VAD/TTS/wake-word/legend engines are untouched. New ui/src/components/bridge/BridgeEnvironment.tsx holds the four toggles as one vertical Environment group: the four <button>s + their SVGs moved byte-for-byte, each firing the SAME store action with the SAME side effects (sound setSoundEnabled+soundEngine; voice setVoiceEnabled+picker; wake-word setWakeWordEnabled; legend setShowLegend), wake-word keeps data-testid="wake-word-toggle"; the voice-picker dropdown + volume slider re-anchored INLINE (position:relative, were absolute bottom:40 right:60 on the old status bar), a section label + per-row text labels make the 380px panel read better than the icon-only bar, stroke-SVG only, NO emoji (HXI #3). stations.tsx adds {id:'environment', label:'Environment', render:() => <BridgeEnvironment/>} to engineering (isPopulated already true via its <BridgeSystem/> body → no edit; the factory ctx unchanged). DecisionSurface drops the cluster + the orphaned flex:1 spacer + the soundEngine/voice imports + the 7 now-unused selectors + the picker state/effects + btnStyle + the whole react hooks import (keeps showLegend for the overlay; the telemetry strip now left-aligns); the wake-word single owner stays IntentSurface.tsx (AD-705) — no 2nd owner, <WakeWordIndicator/> does not move. Test git mv DecisionSurface.wakeWordToggle.test.tsx → BridgeEnvironment.test.tsx (kept in src/__tests__/ so the vi.mock('../audio/…') ids resolve to the same modules), both renders re-pointed to <BridgeEnvironment/>, the 2 wake-word cases verbatim, +4 cases (sound/voice/legend flips + a no-emoji guard). +4 Vitest → full UI suite 1344 passed / 1 skipped (225 files, +4 over the 1340 AD-944 baseline, zero regressions); npm run build clean (tsc -b + vite); npx playwright test 4 passed (AD-941 specs never touch these toggles). NO omnibox (AD-946), no engine / store-flag / default / localStorage change, no defaultOpen flip, no other-station change, no backend/pytest. COMMITTED LOCAL ONLY — NOT pushed. Forward marker: AD-946 (the Ask ProbOS… omnibox command palette)) | Bridge as Ship's Computer (#873 / #880) | 3 |
| AD-946 | The Ask ProbOS… omnibox becomes the Ship's-Computer command palette — a leading > enters command mode and the matching Bridge station launches surface in a keyboard-navigable dropdown above the input; Enter runs the SAME store action the Bridge fires (the FINAL wave of the epic; the keyboard now drives station launches). — SHIPPED 2026-06-09 gate-verified (frontend only, ADDITIVE. Hard guarantee: a plain NL question — anything NOT starting with > — submits BYTE-FOR-BYTE via /api/chat exactly as today; the palette never opens for it and a > string is never POSTed. Trigger = a leading > (VS-Code command mode), chosen over an always-on dropdown / fuzzy-rank (a substring match under ordinary questions risks an accidental Enter-run); reuses the existing /grant leading-sigil interception + the Spotlight triggerInput/pendingChar seam; placeholder stays byte-identical "Ask ProbOS..." (AD-719/941 selectors depend on it). ONE source of truth: new pure ui/src/components/bridge/paletteCommands.ts derives the list from the SAME buildBridgeStations(ctx) — buildPaletteCommands flattens (a station's discrete actions when it has any, else its primary onExpand; config EXCLUDED → AD-946a) → 11 launches (Ward Room, Chats, Crew, Personnel, Metrics, Notebooks, Records, Explorer, Work Board, System, Settings; Communications' onExpand skipped → no dup Ward Room), matchPaletteCommands = case-insensitive token-AND substring over ${station} ${label} (empty→[]). ONE additive field onExpandLabel? on CommandStation (operations→'Work Board', engineering→'System') so body-only launches read by the Captain's names (rejected an alias map — a 2nd label list the epic forbids); BridgePanel never reads it → AD-943/944/945 render byte-unchanged. New presentational ui/src/components/CommandPalette.tsx (props-only, nav state on the host like the @-picker): role=listbox/option/aria-selected, stroke-SVG ChevronRight, amber active, NO emoji (HXI #3), "No matching command." empty state. IntentSurface: useMemo'd paletteCommands from 3 read-only selectors (wardRoomDmChannels/missionControlTasks/wardRoomUnread), local paletteOpen/paletteIndex (no new store state/flag/localStorage), handleInputChange >-opens (exclusive w/ the @-picker), handleKeyDown palette nav (returns before the @-picker branch) + Esc-closes-first, handleSubmit top guard so > never POSTs, portal-anchored dropdown + click-outside/onBlur allowance + jsdom-guarded scroll-into-view. NO second global key listener, no config/voice palette entries (AD-946a), no backend/pytest, no push. +22 Vitest (paletteCommands / CommandPalette / IntentSurface.commandPalette + 1 stations onExpandLabel case); full UI suite 1366 passed / 1 skipped (228 files, +22 over the 1344 AD-945 baseline, zero regressions); npm run build clean (tsc -b + vite); npx playwright test 4 passed (the AD-941 specs never touch the omnibox submit). COMMITTED LOCAL ONLY — NOT pushed. CLOSES the "Bridge as Ship's Computer" epic (#873). Forward marker: AD-946a (voice command of stations + config palette entries)) | Bridge as Ship's Computer (#873 / #881) | 3 |
| AD-946b | A bare > in the command palette lists every station launch — matchPaletteCommands('', …) now returns ALL commands (a fresh commands.slice()) instead of [], so > alone shows the full launch list and typing narrows it; "No matching command." appears only for a genuinely non-matching query (>xyzzy). — SHIPPED 2026-06-09 gate-verified (frontend only, a 1-line behavior change in the pure matchPaletteCommands helper; both call sites already gated on the leading > so the NL /api/chat path + the > trigger + the registry source-of-truth + the CommandPalette empty-state are untouched; the AD-946 "empty→[]" test rewritten to "empty→all" + a not.toBe fresh-array guard; full UI suite 1366 passed / 1 skipped unchanged, npm run build clean; closes the AD-946 discoverability gap) | Bridge as Ship's Computer (#873) |
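
The matching rule described for AD-946 and AD-946b (case-insensitive token-AND substring over `${station} ${label}`, with an empty query returning a fresh copy of every command) can be sketched roughly as follows. This is a minimal illustration, not the shipped `paletteCommands.ts`: the `PaletteCommand` shape and its `run` field are hypothetical, and the sketch assumes the caller has already stripped the leading `>` before calling the helper.

```typescript
// Hypothetical command shape; the roadmap names the helper and its matching
// rule but does not spell out the type.
interface PaletteCommand {
  station: string; // e.g. "communications"
  label: string; // e.g. "Ward Room"
  run: () => void; // assumed: the same store action the Bridge station fires
}

// Case-insensitive token-AND substring match over `${station} ${label}`.
// Assumes `query` no longer carries the leading ">" sigil.
function matchPaletteCommands(
  query: string,
  commands: readonly PaletteCommand[],
): PaletteCommand[] {
  const tokens = query.trim().toLowerCase().split(/\s+/).filter(Boolean);

  // AD-946b behavior: an empty query lists every command, as a fresh array
  // so callers cannot mutate the source list.
  if (tokens.length === 0) return commands.slice();

  return commands.filter((command) => {
    const haystack = `${command.station} ${command.label}`.toLowerCase();
    // Every token must appear somewhere in the haystack (AND semantics).
    return tokens.every((token) => haystack.includes(token));
  });
}
```

Under these assumptions, a query such as `comm room` narrows to the Ward Room launch only, because both tokens must match, while a query that matches nothing yields an empty array and lets the host render its "No matching command." state.
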
Phase 2 — meeting mode (voice + avatars; "transition the chat into a group video call"). A meeting is a live MODE of a Phase-1 group chat — the chat thread is the transcript; "Start Meeting" activates voice + an avatar gallery for the participants. Builds on the mature voice/avatar stack: fleet avatar telemetry already fans out by agent_id (AD-722b-4), per-agent voice profiles (AD-718), viseme lip-sync (AD-721b), VRM crew avatars (AD-721), offline STT + VAD (AD-705a). Depends on AD-913/914/915.
| AD | Title | Epic | Priority |
|---|---|---|---|
| AD-920 | Meeting mode + avatar gallery — "Start Meeting" promotes a group chat to a live meeting (metadata.meeting_active); a gallery view renders all participant VRM avatars at once, bound to the fleet avatar-telemetry stream (AD-722b-4) + CrewVRM; the thread remains the transcript — SHIPPED 2026-06-07 gate-verified | Meeting mode | 2 |
| AD-921 | Sequenced meeting voice — agent turns spoken via per-agent voice profiles (AD-718) so each sounds distinct, ordered by the AD-915 facilitator (no talk-over), driving viseme lip-sync (AD-721b) on the speaker's avatar — SHIPPED 2026-06-07 gate-verified | Meeting mode | 2 |
| AD-922 | Captain voice input to the group — STT (AD-705a) + push-to-talk/VAD captures the Captain's speech and routes it through the AD-914 group fan-out (not the 1:1 agent_chat path) — SHIPPED 2026-06-07 gate-verified | Meeting mode | 2 |
| AD-923 | Meeting presence + speaking indicator — who's-speaking highlight in the gallery, join/leave, raise-hand/turn signaling (HXI #4 motion = state); meeting end writes a transcript/summary back to the thread — SHIPPED 2026-06-07 gate-verified (frontend-only; closes the Ad-hoc Crew Collaboration epic AD-913 → AD-923) | Meeting mode | 3 |
| AD-924 | Agent-facing group-chat trigger + crew awareness — the emitter AD-918 deferred: a proactive [GROUP_CHAT title="..." @cs,@cs] tag extractor dispatches the existing create_group_chat (Commander+ gate via CommunicationsConfig.group_chat_min_rank, the AD-918 cooldown/cap honored), plus a federation.md standing order and a config/manuals/group-chat.md so the crew know the feature exists — SHIPPED 2026-06-08 gate-verified (follow-up to the closed epic; makes AD-918 usable by the crew) | Ad-hoc group chat | 2 |
| AD-939 | Meeting Captain slot (camera > screen > icon) — MeetingView renders the Captain as the FIRST gallery cell: live camera (preferred) or screen video when shared (read-only reuse of the AD-733 / AD-733-2 streams), else an amber stroke-SVG person icon (HXI #3); crew avatars were already fixed by AD-938 (an empty gallery = the un-hydrated thread). — SHIPPED 2026-06-09 gate-verified (frontend only; CaptainSlot sibling of AvatarSlot gated on useCameraStore.active / useScreenStore.active, attaches the MediaStream to a muted <video> via a ref+effect; the captain-present chip kept. NO AvatarSlot/CrewVRM/fleet-binding change, no new capture, no audio. +6 Vitest (MeetingView.captain.test.tsx); full UI suite 1328 passed / 1 skipped (224 files, +11 over the 1317 baseline, zero regressions), npm run build clean. No backend → no pytest) | Meeting mode | 2 |
| AD-949 | Call-scoped group/meeting voice — group/meeting voice gets its OWN audio gate (callAudioEnabled, default ON), decoupled from the Ship's-Computer voiceEnabled, plus an in-call mute/unmute control on GroupChatHeader — so a live call is audible by default without enabling the Engineering voice (the structural fix for what BF-614 triaged as "the AD-945 voice-output toggle was likely off"). — SHIPPED 2026-06-09 gate-verified (frontend only, a gate-flag swap + an additive in-call control. New session-scoped store flag callAudioEnabled: boolean (default true) + setCallAudioEnabled mirror the voiceEnabled slice MINUS the localStorage line; useMeetingVoice.ts:50 swaps voiceEnabled → callAudioEnabled — the one functional line; the meetingActive self-gate (:49, runs first), the AD-921/922/923 speakRepliesSequentially sequencer, the per-agent AD-718 voice profiles, the AD-923 speakingAgentId seam, and the AD-922 echo gate are byte-for-byte intact. Default-ON is the decoupling: meetingActive && callAudioEnabled makes a fresh call audible WITHOUT the Ship's-Computer voice. GroupChatHeader renders a data-testid="call-audio-toggle" mute/unmute button only while meetingActive (local inline speaker stroke-SVG, no emoji / no Glyphs.tsx export, amber #f0b060 audible / dim #666680 muted, aria-pressed = callAudioEnabled). The 1:1 per-agent TTS (hxi_chat_tts_{agentId}), the Ship's-Computer / wake-word voice (IntentSurface, the AD-945 BridgeEnvironment toggle, the global voiceEnabled), and all STT/TTS engines unchanged. useMeetingVoice.test.tsx migrates the gate flag (verbatim rename, +0); GroupChatHeader.meeting.test.tsx +3 cases; full UI suite 1376 passed / 1 skipped (+3 over the 1373 baseline, zero regressions), npm run build clean, the 4 AD-941 Playwright specs 4 passed. No backend → no pytest. Forward markers: AD-949a persist the mute across sessions, AD-949b also surface the control on MeetingView, AD-949c honor per-agent 1:1 overrides in group) |
Meeting mode | 2 |
| AD-950 | Conversational proactivity — teach agents to ADVANCE a live conversation instead of answering and stopping: a new overridable CognitiveAgent._conversational_proactivity_protocol hook appends calibrated guidance to end an ENGAGED 1:1/group direct_message turn with ONE forward move (follow-up question / inviting observation / concrete proposal), recipient-designed (react to specifics, address by name), explicitly NOT every turn (calibrated to engagement + personality, not interrogation), honest disagreement welcomed, never inventing a follow-up (AD-592 honesty preserved); group chats additionally permit handing the floor to a named peer by callsign (sets up AD-951). The CORE Turing-test lever of the Natural Conversation epic. — SHIPPED 2026-06-09 gate-verified (backend only, instructions-first — Design Principle #6; the behavior lives in INSTRUCTIONS, never decide()/act(). Returns "" off the direct_message path or when CommunicationsConfig.proactive_conversation_enabled (default ON, on even when config absent) is off; invoked AFTER the AD-935 group decline protocol so it composes with reply-only-when-substantive. A cognitive_agent hook not a standing order because the DM/group reply path composes with hardcoded_instructions="" — a standing order is DEAD on this surface. test_ad950_conversational_proactivity.py 9 tests, BF-287 real CommunicationsConfig + real hook, both renderings audited against the real _CAPABILITY_GAP_RE — clean; blast-radius 1519 passed / 1 skipped, zero regressions. No UI change) |
Conversation quality | 3 |
| AD-951 | Agent next-speaker selection — turn-allocation rule 1a ("current speaker selects next") for agent-to-agent hand-offs: an agent that DIRECTLY ADDRESSES a peer by callsign (@yeo … or the vocative Yeo, … / Yeo: …) hard-includes that peer as the next cascade speaker (overriding the per-turn cap + convergence, still bounded by max_agent_rounds), making AD-950's "hand the floor to a named peer" mechanical. The facilitator already did rule 1a for the Captain's @-mentions; AD-951 extends it to the crew. — SHIPPED 2026-06-09 gate-verified (backend only, routing — no new prompt text. New pure crew_profile.extract_directed_callsign (LEADING @/vocative address only — is_directed_mention/BF #467 discipline, so a message ABOUT a peer is not a hand-off; the VOCATIVE form is the new value, essential for AD-921 voice meetings). group_chat_fanout scans each prior-round reply's CLEAN text (AD-948-stripped) and threads the addressed callsign through _fan_one_round → _assemble_speaker_signals → mentioned=True. Honest-degrade: non-participant callsign ignored; no address → byte-identical AD-935. New GroupChatConfig.agent_next_speaker_selection_enabled default OFF (#14), system.yaml flips on. test_ad951_next_speaker_selection.py 16 tests, BF-287 real ChatThreadStore+IntentBus+registry, incl. an end-to-end ON-vs-OFF discriminator through group_chat_fanout; blast-radius 1381 passed, zero regressions. No UI change) |
Conversation quality | 3 |
| AD-952 | Human response dynamics — group replies arrive one at a time behind a "{callsign} is typing" beat (length-proportional, clamped [360,2400]ms), not all in the same instant (the clearest group-chat bot tell). — SHIPPED 2026-06-09 gate-verified (frontend only, additive. NEW pure DI sequencer chat/staggerReplies.ts (computeTypingDelay + revealRepliesProgressively — per reply: show typing → sleep → clear → append; Tier-2, always clears in finally, abortable; mirrors the AD-921 voice sequencer, fake-clock testable). Session-scoped typingAgent store slice (AD-949 callAudioEnabled pattern, no localStorage). TypingIndicator component — three amber #f0b060 dots pulsing via a self-contained @keyframes hxi-typing-blink, aria-live, no emoji (HXI #3), motion = state (#4). ProfileChatTab.sendText extracts a shared appendReply (DRY) and reveals progressively in TEXT chat; in a LIVE MEETING the AD-921 voice + AD-923 indicator already pace the crew so the text stays instant (byte-identical). Variable LENGTH already taught by AD-950 (no duplicate clause); true streaming stays the AD-935a forward marker. staggerReplies.test.ts 12 + TypingIndicator.test.tsx 5 + useStore.typingAgent.test.ts 4; full UI suite 1397 passed / 1 skipped (+21 over the 1376 baseline, zero regressions), npm run build clean, the 4 AD-941 Playwright specs 4 passed. No backend → no pytest) |
Conversation quality | 3 |
| AD-953 | Conversational memory & callbacks — teach agents to draw on what they GENUINELY recall (the episodic memories + session history already in the reply context, AD-573/AD-723a-1) and make natural callbacks ("you mentioned …", "last time we …") so a conversation feels continuous, under a hard AD-592 honesty bound (reference only what is actually recalled; never fabricate a shared memory). — SHIPPED 2026-06-09 gate-verified (backend only, instructions-first — Design Principle #6. New overridable CognitiveAgent._conversational_memory_protocol hook, sibling to AD-950's, invoked right after it; returns "" off the direct_message path or when CommunicationsConfig.conversational_memory_enabled (default ON) is off. Honest by construction: the agent really receives recalled episodes, so the guidance points at real content and the top-stated honesty rule fences invention. test_ad953_conversational_memory.py 9 tests, BF-287 real CommunicationsConfig + real hook, audited against the real _CAPABILITY_GAP_RE (incl. the lack token) — clean; blast-radius 1531 passed (the only 4 failures are the pre-existing TestSkillPipeline serial-isolation flakes, green in isolation), zero real regressions. No UI change) |
Conversation quality | 3 |
| AD-955 | Advisory room awareness — the AD-915 facilitator becomes a SENSE ORGAN the agents read, not a director: its per-speaker ranking (recent share, your-area, the peer the room would most value hearing) is surfaced to the dispatched speaker so a dominating agent can hold back / hand off and an agent can defer to a better-placed peer BY NAME (an AD-951 hand-off), reframed "teamwork, never a shortfall" to dissolve the ego problem. Weighted-trust frame from the Captain + Counselor Ezri. — SHIPPED 2026-06-09 gate-verified (instructions-first, ADVISORY — NO dispatch change; the cap/convergence backstops are untouched. Determinism in PERCEPTION is fine, in AGENCY feels robotic — so the driver moves into the agents' reasoning. Pure chat_facilitator.build_room_signal → thread_fanout rides it on params["room_signal"] → CognitiveAgent._conversational_room_awareness_protocol frames it, led by "for your judgment, not a directive" + "most turns simply contribute" so it self-regulates without going quiet. CommunicationsConfig.room_awareness_enabled default ON so the Captain can observe self-selection. AD-956 (scale-aware enforcement: advisory below a small-N threshold, gating above) deferred — the threshold is empirical, calibrated after watching, not guessed. test_ad955_room_awareness.py 19 tests, BF-287 real config/store/bus, pure helper + hook gap-audited + end-to-end through group_chat_fanout; blast-radius 2063 passed (the only 4 failures are the pre-existing TestSkillPipeline serial-isolation flakes, orthogonal to AD-955). No UI change) |
Conversation quality | 3 |
| AD-984a | Hide-chat toggle in meeting mode — a meeting is one conversation viewed by voice; let the Captain hide the text transcript for an avatars+voice-only call. — SHIPPED 2026-06-11 gate-verified (frontend only. Session-scoped meetingChatVisible store flag (default ON, AD-949 callAudioEnabled pattern, no localStorage) + chat-visibility-toggle on GroupChatHeader shown only while meetingActive (speech-bubble stroke-SVG, amber shown / dim+slash hidden, aria-pressed, no emoji); ProfileChatTab gates the transcript message-list on !(meetingActive && !meetingChatVisible) — MeetingView + composer stay, conversation/persistence/BF-621 reveal unaffected. GroupChatHeader.meeting.test.tsx +3; full UI 1453/1 skip, build clean. AD-984b (condensed-sidebar presentation + WCAG 2.1 AA) deferred) | Conversation quality | 3 |
| AD-984c | Interruptible auto-scroll — new messages auto-scroll only when the Captain is pinned to the bottom, never yanking them down while reading an earlier turn. — SHIPPED 2026-06-11 gate-verified (frontend only. Pure chat/scrollAnchor.ts isPinnedToBottom(el, 80) + a pinnedToBottomRef on the message-list onScroll; the auto-scroll effect fires only when pinned. scrollAnchor.test.ts 7. No backend) | Conversation quality | 2 |
| AD-984d | Captain-local timezone in crew context — TemporalConfig.captain_timezone (IANA) injects the Captain's current local time into _build_temporal_context alongside UTC, removing the confabulation source (crew inferred "3am" from UTC when it was 9pm Mountain). — SHIPPED 2026-06-11 gate-verified (backend. New config field (default "" = unchanged, UTC only); zoneinfo.ZoneInfo conversion, honest-degrade on a bad zone; tzdata added to deps (Windows has no system IANA db). test_ad984d_captain_timezone.py 6, BF-287 real SystemConfig) | Conversation quality | 2 |
| BF-622 | Scaffolding-echo guard — the AD-978 Current Visual Context block (LLM input) leaked into the chat when a degraded LLM proxy echoed its own prompt; strip it from a reply before persist. — SHIPPED 2026-06-11 gate-verified (backend. Pure strip_visual_context_block in working_memory.py (regex on the exact render_for_prompt delimiters — single source of truth) applied at the 1:1 + group reply chokepoints, guarded on the marker substring so normal replies are untouched; an echoed-only reply degrades to the non-reply sentinel. test_bf622_strip_visual_context.py 8, incl. the render-delimiter round-trip; backend blast 1266 passed) | Robustness | 2 |
| AD-983a | Capability affordance layer — the tool carries its own manual (#914, the AD-983 Copilot-parity epic). A capability's affordance ("how to invoke a tool you hold") travels WITH the capability and auto-surfaces to every crew agent that can reach it, instead of being authored into one agent's behavior rules (Yeo). — SHIPPED 2026-06-11 gate-verified (new IntentDescriptor.usage_hint declared on the 5 [MESH]-backed read intents; CognitiveAgent.capability_affordances() derives {intent: usage_hint} from the LIVE registry — substrate-gated, only serving agents contribute; the base _conversational_capability_block (was "") renders them, sorted, into every crew agent's conversational prompt. Yeo's capability override + _available_mesh_read_intents retired; its task protocol keeps only the [CREATE_TASK] delegation judgment. Folds in AD-957/#893 (web search for all crew). test_ad983a_capability_affordances.py 18 (BF-287 real registry+descriptors, incl. a non-Yeo Counselor getting the affordance with no override); test_ad870/test_bf599 rewritten; regression blast 1110 passed / 6 skipped) |
Copilot parity | 3 |
| AD-983b | Per-agent capability enablement — per-agent skill grants + unified /capabilities API (#915, the AD-983 Copilot-parity epic). Each crew agent independently enables different tools AND skills; tools already had per-agent grants, skills were dept/rank-only. — SHIPPED 2026-06-12 gate-verified (new SkillGrantStore (cognitive/skill_grants.py, mirrors ToolPermissionStore minus the permission level — skills are binary); CognitiveSkillCatalog.set_grant_store + effective_entries_for_agent = dept/rank defaults ∪ grants − restrictions (back-compat to list_entries without a store); onboarding intent-subscription resolves through the overlay; GET /api/agent/{id}/capabilities → {tools,skills} with granted+source, POST .../capabilities/set {kind,id,enabled,reason} audit-logged via new CAPABILITY_ACCESS_RESOLVED; store wired in startup/communication.py → runtime.skill_grant_store → shutdown. Independence proven: a skill granted to A is in A's effective set, not a same-dept peer B's. test_ad983b_capability_enablement.py 16 (BF-287 real store + real catalog with on-disk SKILL.md)) |
Copilot parity | 3 |
| AD-983c | Capability panel UI — the Captain surface for per-agent tool/skill enablement (#916, the AD-983 Copilot-parity epic). Generalizes the AD-982 single vision toggle into a reusable panel exposing the FULL tool + skill set per agent, bound to the AD-983b API. — SHIPPED 2026-06-12 gate-verified (new CapabilityPanel (ui/src/components/profile/CapabilityPanel.tsx): GET /capabilities → {tools,skills}, POST /capabilities/set {kind,id,enabled}; renders TOOLS (n)+SKILLS (n) sections, each row a toggle (aria-pressed, On/amber·Off/dim) + name + human source label (granted/restricted/role default/dept default); optimistic with revert-on-failure (AD-982 pattern); deps prop injects fetchers for tests; HXI stroke-only, no emoji. Mounted on personnel ServiceRecord (sr-section-capabilities) + the ProfileInfoTab Profile tab (crew-only). CapabilityPanel.test.tsx 10 (deps-injected — states, On/Off+aria-pressed, source labels, optimistic POST, revert-on-fail, refetch-on-agentId, no-emoji); ProfileInfoTab.wakePhrase.test.tsx updated to find the voice-profile PUT by URL (mount shifted fetch ordering); full UI 1463 passed / 1 skipped, build clean) |
Copilot parity | 2 |
| AD-983d | Manifest + lazy retrieval — the deferred-tool model, scale to hundreds (#917, COMPLETES the AD-983 Copilot-parity epic). PromptBuilder rendered every descriptor's full param table; at hundreds of intents that blows the context budget AND degrades selection. Adopt the Copilot deferred-tool shape: cheap always-loaded manifest + lazy find_intents for full detail. — SHIPPED 2026-06-12 gate-verified (new CapabilityRetriever (cognitive/capability_retriever.py) = tool_search for ProbOS: manifest(scope) → [(name, one_line)]; find_intents(concept, scope, k, dense_ranking=None) → full descriptors ranked by the AD-979c reciprocal_rank_fusion over two lexical axes (name-tokens + full-text incl description/usage_hint/params), reusing fts_or_query — deterministic, vocabulary-mismatch safe; dense_ranking = forward semantic-axis extension point. New CognitiveConfig.deferred_capability_threshold: int = 0 (0 disables → byte-identical). build_system_prompt(..., manifest_mode=False) + _build_intent_table(..., manifest_mode=False): manifest mode keeps core/utility full (tiering), renders the domain tier as a name+one-line manifest. Decomposer _use_manifest() gate + runtime.py wiring. Skill bodies already lazy (AD-596). test_ad983d_deferred_capability.py 21 (BF-287 real descriptors + real retriever/PromptBuilder/registry); regression blast 455 passed / 6 skipped. The AD-983 epic a→b→c→d is complete) |
Copilot parity | 4 |
| AD-981a | Recall-confidence on the live sovereign path — Feeling-of-Knowing logging for recall_for_agent (the earmarked AD-981 recall live-wiring). AD-979a/c built the FoK signal + hybrid axis on recall() but the crew's sovereign recall goes through recall_for_agent (AD-397), so they were mechanism-only — the invisible miss stayed invisible on the path that matters. — SHIPPED 2026-06-12 gate-verified (new recall_for_agent_with_confidence(agent_id, query, k) = single source of truth, surfaces an AGENT-SCOPED band over the agent's own candidates tallied pre-threshold (Koriat), classified vs the general relevance_threshold/weak_floor so it's comparable to the global AD-979a band; recall_for_agent is now a byte-identical shim. New default-off MemoryConfig.recall_fok_logging_enabled (threaded via __main__.py) emits the band per sovereign recall as a structured INFO line (agent/band/best_sim/owned-candidates/returned/query), logged even on empty recall — a live session becomes a recall-calibration tool. Default off → byte-identical episodes + zero log noise; band logged not yet driving the AD-979b expand loop on this path (deferred AD-981b). test_ad981a_recall_fok_logging.py 10 (BF-287 real EpisodicMemory + real embeddings, incl. agent-scoped-not-global + shim byte-identity); episodic/recall regression blast 737 passed / 1 skipped) |
Oracle recall | 2 |
| BF-623 | 1:1 conversation-mode gate fix — selecting Conversation mode behaved like push-to-talk because arming was gated on the global Ship's-Computer voiceEnabled flag (default OFF), the BF-614/AD-949 mis-gating class. — SHIPPED 2026-06-12 gate-verified (frontend. Conversation mode is ITSELF the voice opt-in: the 1:1 arm effect depends ONLY on the mic-mode selection, not any voice-output flag (the open mic is an input affordance; whether a reply is spoken stays a per-reply hxi_chat_tts_{agentId} decision in onAgentReply). Two obsolete-contract tests updated (conversationWiring voiceEnabled-flip no longer disarms; bf290 TTS-disabled conversation arms). Paired with AD-985) | Natural Conversation | 1 |
| AD-985 | Group-meeting open-mic — continuous/hands-free listening in a call. The AD-747 ConversationController is a 1:1 duplex (POSTs to /api/agent/{id}/chat); a group meeting used composer PTT (AD-922/973) so a call could never be truly hands-free. — SHIPPED 2026-06-12 gate-verified (frontend. Generalize the singleton controller with two optional ArmOptions hooks: submitTranscript(text,history) routes a completed utterance through the live sendText group fan-out (AD-914) instead of the 1:1 POST — useMeetingVoice speaks the replies so the controller goes straight to silence_pending, mic stays live; canListen() = the AD-922 meeting-wide echo gate (drop a transcript while speakingAgentId != null — ANY crew member mid-TTS, not BF-300's host-filtered ref; a dropped echo refreshes the 30s release timer). ProfileChatTab arm effect RELOCATED below sendText to read meetingActive/callAudioEnabled/sendText/speakingAgentId; branches not-conversation→disarm, meeting→open-mic (gated on AD-949 callAudioEnabled, default ON, echo gate + group submit, no onAgentReply), else→1:1 (BF-623). conversationController.group.test.ts 6 (mocked boundary) + ProfileChatTab.voicewave.test.tsx 11 (?raw source). Full UI 1480 passed / 1 skipped, build clean) |
Natural Conversation | 3 |
| AD-986b | Transcript-grounded recall — consult the canonical record (the recording) (#928, the AD-986 dual-layer conversational-memory epic). An agent's sovereign episodic shard is a subjective, lossy recollection (truncation + eviction); the ChatThreadStore transcript is the objective record. Live finding: the Counselor honestly couldn't recall a group chat she contributed to because her substantive turn survived only as the trigger-text of the next speaker's (owned) episode. — SHIPPED 2026-06-12 gate-verified (new ChatThreadStore.threads_for_participant (sovereign-scoped — only rooms the agent took part in) + pure cognitive/transcript_grounding.py: consult_transcript (deterministic AD-979c fts_or_query lexical match over the agent's OWN rooms, bounded excerpt centred on matching lines, None on no match → sparse injection) + render_transcript_grounding (labels it === CANONICAL TRANSCRIPT (the recording) ===, distinct from subjective memory). Wired into _recall_relevant_memories (observation["_transcript_grounding"], Tier-2) + _build_user_message. New MemoryConfig.transcript_grounded_recall_enabled (default off → byte-identical). Sovereign scope load-bearing: an agent never gets a transcript for a room it wasn't in. Prior art: transactive memory (Wegner — external recordings), distributed cognition (Hutchins), contagion firewall (de-risks AD-979 frontier). test_ad986b_transcript_grounding.py 14 (BF-287 real store, incl. the unowned-room leak guard); regression blast 1633 passed / 1 skipped) |
Shared-conversation memory | 3 |
| BF-624 | Shared-meeting vision refreshes a STALE ring, not just an empty one (#882). Live: the Counselor described the live frame (plaid shirt) while the Yeoman described a 22h-old frame (black shirt) for the whole group chat — _render_agent_scene_block shared the consumer's latest obs into a participant's ring ONLY when empty (BF-617/BF-620), but the Yeoman's ring held a stale disk-hydrated frame (AD-742f) so it never refreshed. — SHIPPED 2026-06-12 gate-verified (generalize empty→stale: share latest_shared_observation() whenever it is fresher than the ring's own latest (_own is None or _shared.timestamp > _own.timestamp); byte-identical for an up-to-date observer (no append), a fresher own frame is kept; meeting-scoped, no register_observer. test_bf624_stale_meeting_vision_refresh.py 6 (BF-287 real consumer+rings: 22h-stale→live headline + empty preserved + no-downgrade + sentinel + disabled gate); BF-617/BF-620/AD-978 30 passed; fan-out/perception/temporal blast 526 passed. Also set temporal.captain_timezone: America/Denver (AD-984d feature shipped but config was empty "") + enabled memory.transcript_grounded_recall_enabled for the AD-986b test) | Natural Conversation | 2 |
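
The BF-624 freshness rule can be sketched in a few lines. This is a minimal illustration, not the shipped `_render_agent_scene_block`: the `Observation` dataclass, the list-backed ring, and the `refresh_ring` helper are hypothetical names introduced here. Only the comparison `_own is None or _shared.timestamp > _own.timestamp` is taken from the entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical stand-in for one vision frame observation."""
    timestamp: float  # seconds since epoch
    summary: str


def refresh_ring(ring: list[Observation], shared: Optional[Observation]) -> bool:
    """Append the shared frame only when it is fresher than the ring's own latest.

    Returns True when the ring changed, False when it was left untouched.
    """
    if shared is None:
        return False  # nothing to share
    own = ring[-1] if ring else None
    # Rule from the entry: share when the ring is empty OR the shared frame is newer.
    if own is None or shared.timestamp > own.timestamp:
        ring.append(shared)
        return True
    return False  # up-to-date observer or fresher own frame: no append
```

Under these assumptions an empty ring and a 22h-stale ring both pick up the live frame, while a ring that already holds the live frame, or a newer one, is left as it was.
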
Task Workspace Rooms northstar — the auto-created task room becomes a Cowork-style (Anthropic/Microsoft) WORKSPACE: crew "show their work" by posting status updates as they work + the final result; multi-agent collaboration on a decomposed task is visible in one room; an Input folder holds files attached to the task (context / to-process) and an Output folder holds artifacts the crew produce (documents, images, slides) (2026-06-08). RULED (autonomous, Captain away): substrate is Option A — extend the chat-thread + ArtifactStore world, NOT bridge ConsultationWorkspace. Rationale: ArtifactStore (AD-797) is ALREADY thread_id-keyed + versioned + content-addressable with a live ArtifactDrawer/ArtifactViewer UI = the Output folder, thread-scoped already; AD-916 message attachments = the Input seed; the room's messages = the work log; chat_threads.task_id (AD-791a) + workspace_root (AD-799) are the link slots. ConsultationWorkspace (AD-594a, inputs/advisory/plan/artifacts/outputs/workitems + lifecycle) is a heavier filesystem/consultation substrate — kept separate. This is assembly of existing primitives, not a new substrate. Conversational+task counterpart to the Crew-collaboration spine (AD-858–862) and the Ad-hoc group chat epic (AD-913–924). Highest committed AD at authoring: AD-924.
| AD | Title | Epic | Priority |
|---|---|---|---|
| AD-925 | Auto-create the task-linked workspace room — when CrewTaskExecutor (AD-859) fans a parent WorkItem out to ≥2 crew, open ONE group chat with task_id set + the child-assignees as participants (reuse AgentGroupChatService.create_group_chat(task_id=…), AD-918 rate guards honored), tagged so it surfaces in the AD-919 list + Captain-joinable. The room IS the workspace. Foundation. — SHIPPED 2026-06-08 gate-verified (default-OFF GroupChatConfig.auto_task_room_enabled + doubly-gated by agentic_dispatch.orchestrator_enabled; first-crew-assignee creator, no service change; task_id-filter idempotency) | Task workspace | 1 |
| AD-926 | Inputs folder — surface the files attached to the task / parent work-item as the room's Input pane (reuse AD-916 message attachments + content-addressable AttachmentStore); read-only context the crew can open/process. — SHIPPED 2026-06-08 gate-verified (read-only GET /api/threads/{thread_id}/inputs returns a content_hash-deduped union of the additive WorkItem.metadata["input_attachments"] convention — source="task", population deferred → AD-926a — + AD-916 message attachments — source="message"; honest-degrade size=None; bytes reuse GET /api/chat/attachments/{content_hash}; self-contained no-emoji InputsList drop-in, mounting deferred → AD-929) | Task workspace | 2 |
| AD-927 | Outputs folder — mount the existing ArtifactDrawer (AD-797, thread-keyed + versioned) on the task room as the Output pane; agents write artifacts via a [ARTIFACT name="…"]…[/ARTIFACT] proactive action tag (the AD-924 [GROUP_CHAT] extractor pattern), so produced documents/images/slides land in the room. — SHIPPED 2026-06-08 gate-verified (a new _ARTIFACT_PATTERN + ProactiveCognitiveLoop._extract_and_execute_artifacts writes the verified two-call path sha256 → attachment_store.write(origin="agent_artifact") → artifact_store.add_version(thread_id=room.id, …); participation-based binding via a new _resolve_agent_task_room — most-recently-active non-archived thread with a task_id that lists the agent (richer current-work-item binding deferred → AD-927a); Lieutenant+ via new CommunicationsConfig.artifact_min_rank + anti-flood artifact_max_per_turn=3 / artifact_max_bytes=262144; v1 text/markdown; honest-degrade on no-room / empty / oversize / cap / write-fail; surfaces in the existing AD-797 ArtifactDrawer off activeThreadId — no new UI / REST / EventType) | Task workspace | 2 |
| AD-928 | Show-your-work status protocol — a federation.md standing order + a structured progress/final-result message convention tied to the work-item lifecycle (in_progress → done), so the crew narrate their work in the room and post a clear final result. — SHIPPED 2026-06-08 gate-verified (a new module-level _STATUS_PATTERN parses one [STATUS]…[/STATUS] tag family with an optional final flag — [STATUS final]…[/STATUS] — NOT two tags, DRY: one regex/extractor/gate/cap, final is a boolean on the same kind; a new ProactiveCognitiveLoop._extract_and_execute_statuses posts each block via chat_thread_store.append_message(room.id, author_id=agent.id, role="agent", body, metadata={"kind":"status"[, "status_final":true]}) into the agent's task room, reusing the AD-927 _resolve_agent_task_room participation resolver unchanged, so the marker surfaces on the existing GET /api/threads/{id}/messages. Lieutenant+ via a new CommunicationsConfig.status_min_rank + anti-flood status_max_per_turn=3 / status_max_bytes=4096; honest-degrade on no-room / empty / oversize / cap / post-fail. MESSAGE-ONLY v1 — the work-item lifecycle transition (in_progress → done) is deferred → AD-928a (no clean current-work-item handle in the proactive context); UI status-chip render is deferred → AD-928b (the backend persists + surfaces the marker, so it is a pure presentational follow-up). No new REST route / EventType / store method / config class) | Task workspace | 3 |
| AD-929 | Unified workspace HXI view — a task room reads like a Microsoft Teams channel (Conversation + Files), reusing the AD-926/927 panes + the AD-917 chat. — SHIPPED 2026-06-08 gate-verified (a collapsible right-hand Files rail INSIDE ProfileChatTab — the only mount common to both hosts, CompactApp tray + AgentProfilePanel full-HXI — keeping the conversation as the primary column with an INPUTS section over an OUTPUTS section; chosen over a tab bar, which hides the convo + AgentProfilePanel's tabs are per-agent not per-room. Gated by a new pure isWorkspaceRoom(thread, agents) → task_id set OR ≥2 crew participants (verbatim GroupChatHeader crew filter), so a 1:1 DM shows no rail. Pure assembly: INPUTS = the AD-926 InputsList fed by fetchThreadInputs; OUTPUTS = the lighter AD-797 ArtifactList — NOT the full ArtifactDrawer, so the drawer is never double-mounted and CompactApp is untouched — fed by fetchThreadArtifacts, rows open /api/artifacts/{id}/content in a new tab. Self-contained fetch with honest-degrade to []; collapse persisted under localStorage probos.workspaceFiles.collapsed, default-collapsed on first run (the 420px floating panel host); local inline stroke-SVG glyphs, no emoji. Frontend-only — no backend/REST/pytest/store-slice/Glyphs.tsx/CompactApp change. Forward markers: AD-929a inline ArtifactViewer, AD-929b gate the standalone CompactApp drawer to non-workspace threads, AD-929c a per-room Files tab in the full-HXI panel. +14 Vitest, build clean) | Task workspace | 3 |
| AD-930 | Crew presence layer — a Microsoft-Teams-style per-crew presence dot (offline \| online \| working \| in_meeting) on the roster; the last Teams-signature gap after group chat (AD-913→919) / meetings (AD-920→923) / task rooms (AD-925→929). — SHIPPED 2026-06-08 gate-verified (a read-only polled GET /api/crew/presence on the existing /api/crew router aggregates three verified-existing signals, inventing no telemetry: liveness = BaseAgent.is_alive; in_meeting = the agent is a participant of a non-archived thread with metadata.meeting_active (AD-920); working = AgentMeta.last_active within a new CommunicationsConfig.presence_working_window_seconds (default 90.0) — an honest recent-activity proxy (last completed op), NOT a true in-flight flag (none exists at HEAD → AD-930a). Liveness is the floor (not alive → offline); among alive agents in_meeting > working > online; Tier-2 honest-degrade on registry/store/config-missing. UI = a pure no-emoji PresenceDot (green / amber-pulse / blue / dim, data-presence + data-pulse), a presence store slice + fetchPresence polled every 10s while CrewRosterPanel is open, a dot per CrewRow, and an optional backward-compatible presence? prop on AgentAvatarBadge. Polled GET over a WS stream for v1. +9 pytest, +12 Vitest, build clean. Forward markers: AD-930a true in-flight "working" signal, AD-930b global app-wide badge poll) | Presence | 3 |
| AD-926a | Task-level multi-file input upload — let the Captain attach one OR MORE files to a task (work item) as context inputs; the WRITE/population path for the AD-926 input_attachments convention (the AD-926 read endpoint + the AD-929 Files rail already work). — SHIPPED 2026-06-08 gate-verified (new POST /api/work-items/{work_item_id}/inputs multipart files: list[UploadFile] on the existing workforce router; each file validated + stored once via the SHARED chat._validate_and_store_attachment — same AD-720a defense-in-depth chain, default origin="chat_attachment", never age-reaped — then a {content_hash, mime, filename} ref appended to WorkItem.metadata["input_attachments"] via a SINGLE read-merge-write that preserves every other metadata key + existing inputs and dedupes by content_hash (update_work_item is a whole-column REPLACE with no BEGIN IMMEDIATE, so the merge is at the call site; all files stored first → one write kills the intra-request race). Honest-degrade per file → skipped, never a 500; 404/503 mirror the sibling routes; operator authority, no consensus (Safety Budget). Files then surface via the existing AD-926 GET /api/threads/{id}/inputs source="task" and the AD-929 rail. UI = a taskId? prop on WorkspaceFilesRail + a multi-file "+ Attach" button (hidden <input type="file" multiple>, inline stroke-SVG, no emoji) gated on taskId, and inputsApi.attachTaskInputs posting one multipart files[] request that refreshes the rail. +10 pytest, +10 Vitest, build clean. Forward markers: AD-926a-1 delete-input, AD-926a-2 atomic store-level append) | Task workspace | 2 |
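
The read-merge-write described in AD-926a might look roughly like the sketch below. It assumes the metadata is a plain dict and that each ref carries the three keys named in the entry (`content_hash`, `mime`, `filename`). The function name `merge_input_attachments` is hypothetical, and the real call site, store API, and validation chain are not shown.

```python
from typing import Any


def merge_input_attachments(
    metadata: dict[str, Any], new_refs: list[dict[str, str]]
) -> dict[str, Any]:
    """Return a new metadata dict with new_refs merged into input_attachments.

    Every other metadata key and every existing input is preserved; a ref whose
    content_hash is already present is skipped.
    """
    merged = dict(metadata)  # shallow copy keeps unrelated keys intact
    existing = list(merged.get("input_attachments", []))
    seen = {ref["content_hash"] for ref in existing}
    for ref in new_refs:
        if ref["content_hash"] in seen:
            continue  # dedupe by content_hash
        seen.add(ref["content_hash"])
        existing.append(
            {
                "content_hash": ref["content_hash"],
                "mime": ref["mime"],
                "filename": ref["filename"],
            }
        )
    merged["input_attachments"] = existing
    return merged
```

Because the helper returns one merged dict, a caller can store all files first and then issue a single whole-column write, which is the shape the entry describes for avoiding the intra-request race.
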
From the 2026-05-20 Yeo feature-complete decomposition (gap doc + AionUi pattern absorption):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-749 | Yeo M365 auth + core connector agents (Outlook/Teams/Calendar/SharePoint/OneDrive) — SHIPPED Wave 181 gate-verified | #695 | 2 |
| AD-750 | WorkIQ-style semantic work layer (unified task/context graph for Yeo) — SHIPPED Wave 181 gate-verified | #696 | 2 |
| AD-751 | Desktop UX surface for Yeo (tray, notifications, hotkey, mini-mode, autostart) — SHIPPED Wave 181 gate-verified | #697 | 3 |
| AD-752 | Proactive schedule heartbeat with work-hours and quiet-hours policy — SHIPPED Wave 181 gate-verified | #698 | 3 |
| AD-753 | Unattended permissions modes (autoApproveReadOnly + approval cards + tenant policy hook) — SHIPPED Wave 181 gate-verified | #699 | 3 (tenant policy hook is extension-point only) |
| AD-754 | Yeo data hardening baseline (encryption at rest, PII redaction, audit log, forget-this) — SHIPPED Wave 181 gate-verified | #700 | 2 (enterprise governance integrations are extension-point only) |
| AD-755 | Office document skills + SharePoint routing + reusable templates — SHIPPED Wave 181 gate-verified | #701 | 3 |
| AD-756 | Yeo conversational front door UX (welcome, suggested actions, daily briefing, delegation UI) — SHIPPED Wave 181 gate-verified | #702 | 2 |
| AD-757 | Identity and continuity for Yeo (Captain Card + voice/avatar profile continuity) — SHIPPED Wave 181 gate-verified | #703 | 3 |
| AD-758 | Yeo feature-complete integration gate (cross-crew capability exposure + learning upgrades) — SHIPPED Wave 181 | #704 | 2 |
| AD-759 | Yeo native desktop tray app v1 — Electron host + tray menu + single-instance lock + probos:// deep-link + native notifications + disconnected-state repair surface (new top-level desktop/ workspace; MIT deps only; assumes runtime already running per AD-751; +24 vitest) — SHIPPED Wave 186 | #705 | 2 |
| AD-759a | Launch-at-login (Win registry / macOS LaunchAgent / Linux .desktop autostart) — forward marker | — | 3 |
| AD-759b | NSIS installer + unsigned Windows release artifact — forward marker | — | 3 |
| AD-759c | Auto-update — check-only against GitHub Releases (no managed update server) — forward marker | — | 3 |
| AD-759d | Signed installer + EV code-signing cert workflow (Commercial) — forward marker | — | 3 |
| AD-759e | CI release pipeline (GH Actions matrix build Win/macOS/Linux) — forward marker | — | 3 |
| AD-759-1 | macOS menubar parity — forward marker | — | 3 |
| AD-759-2 | Linux tray parity — forward marker | — | 3 |
| AD-790 | Yeo Desktop first-run setup experience — Claude-Desktop-grade 4-step onboarding (welcome / runtime connect / Captain Card / suggested prompts) inside existing Electron host; firstRunComplete flag in userData/yeo-state.json; tray Reset Setup…; no new deps; +8 vitest | #714 | 2 |
| AD-791 | Context-scoped chat threads — reframe sessions as Teams-style threads (1:1 / task / project context envelopes); SQLite chat_threads table; /api/threads REST surface; back-compat shim for existing /api/agent/{id}/chat; agent memory stays global, only the displayed context is scoped (umbrella for AD-792/793/794) — SHIPPED Wave 193 (substrate); see also AD-791a (AD-827) SHIPPED Wave 193 for the consumer wiring | #715 | 2 |
| AD-792 | Thread sidebar in Compact Yeo (Pinned / Projects / Recents / Search / New chat) — left rail mirroring Claude Chat + MS Teams; collapsible; right-click rename/pin/archive — SHIPPED Wave 195 (UI-only; ThreadSidebar.tsx + threadGrouping.ts + threadApi.ts; ProfileChatTab.threadId prop with props.threadId ?? threadIdByAgent.get(agentId) precedence; CompactApp wraps sidebar + chat in flex; 300ms search debounce matches useStore.ts:549 precedent; reads data.threads for list, data.results for search, thread.to_dict() DIRECT for POST; Wave 194 title_locked: true on rename; localStorage probos.sidebar.collapsed persists collapse state; +32 vitest across 10 new files) | #716 | 2 |
| AD-793 | Projects — long-lived context groups owning N threads + pinned attachments; project description auto-injected as system context to every thread inside; mirrors Claude Projects / Teams Channels — SHIPPED Wave 196 (new projects table additively in threads/__init__.py; new Project + ProjectStore with CRUD + pin/unpin + touch + delete_project(cascade=False) returning (deleted, affected_threads); new src/probos/routers/projects.py; single-project responses return Project.to_dict() DIRECTLY; pin endpoint is async def and awaits AttachmentStore.exists() per architect v2; 400 on missing SHA; project preamble injected BETWEEN AD-725 recall prepend and AD-733a visual prepend so on-the-wire order is visual → project → recall → user — substring-index ordering test guards regression; touch-on-message-append wired at the router layer in routers/threads.py to keep threads substrate decoupled from projects layer; delete defaults to unparent, cascade requires ?cascade=true + double-confirmation in UI; episodes preserved in both paths; type-guard against MagicMock-auto-attribute phantom-injection per BF-287 lesson; UI replaces Wave 195's "Coming with AD-793" placeholder with real expandable Projects section + NewProjectModal + ProjectContextMenu + MoveToProjectMenu + ProjectRow components, all stroke-based SVG icons per HXI #3; Recents filter excludes project_id != null threads; per-project expansion persisted to localStorage['probos.sidebar.projects.expanded']; +12 pytest + 11 vitest; vitest 1005 → 1016) | #717 | 2 |
| AD-794 | Auto-name new threads from the first turn (fast-tier LLM, 3-6 words, fallback to user opening message); manual rename locks the title — SHIPPED Wave 194 (heuristic via suggest_title; LLM-backed variant deferred to AD-794a forward marker; first-turn trigger in agent_chat + routers/chat.py inline-callsign; PATCH /api/threads/{id} extended with title_locked: bool so operator renames are authoritative; maybe_auto_name(force=False\|True) centralizes the heuristic + shares one path with POST /{id}/auto-name) | #718 | 3 |
| AD-795 | Quick-action starter chips on empty thread (Code / Write / Plan / Brief…); chip click inserts text via store chatDrafts, never auto-sends; project-overridable — SHIPPED Wave 187 | #719 | 3 |
| AD-796 | Time-of-day greeting + status line on Compact empty state — v1 renders Good morning/afternoon/evening, <name>. + N unread WardRoom threads • N crew online (or All quiet.); captain-name wiring to AD-757 Captain Card endpoint is a forward marker — SHIPPED Wave 187 | #720 | 3 |
| AD-797 | Artifacts pane — agent-generated outputs (code blocks >40 lines, explicit <artifact> tags) extracted to AttachmentStore (AD-720 content-addressable) and rendered in a side drawer with per-name version history; project-pinned artifacts propagate to every thread in the project — SHIPPED Wave 197 (extractor + content endpoint + drawer + project-pin merge + BF-324; ASCII-hyphen stub format [Artifact: name vN - L lines, mime]; _pinned_from_project per-row flag; new ATTACHMENT_ORIGINS entry agent_artifact; render branches markdown/code/image/uri-list/plain with react-markdown; no Monaco/Prism in v1 — AD-797a/d forward markers) | #721 | 3 |
| AD-798 | ContainerSandbox — per-task docker isolation as a RuntimeSandbox backend (concrete AD-456b-1). Pluggable-backend Protocol; SecurityConfig.sandbox_backend flips between inprocess (default, trusted) and container (untrusted). probos-sandbox:latest image bundles Python + git + ffmpeg + standard CLI tools. Industry convergence: OpenClaw (Docker default) + Hermes (7 backends incl. Docker / Modal / Daytona / Vercel Sandbox) | #722 | 2 |
| AD-798a | Hypervisor sandbox backend (Windows Sandbox / Hyper-V / Apple Virtualization.framework) — Cowork-class isolation. (Commercial) | — | 3 |
| AD-798b | Cloud sandbox backends (Modal / Daytona / Vercel Sandbox / Fly Machines) — matches Hermes's seven-backend menu; serverless persistence | — | 3 |
| AD-798c | SSH sandbox backend — dispatch heavy code-exec to a remote machine over SSH (Hermes parity) | — | 3 |
| AD-799 | Per-thread workspace mount for ContainerSandbox (pairs AD-791 + AD-798) — thread metadata gains workspace_root; bind-mounted at /workspace inside the container; HXI "Files" affordance per thread | #723 | 2 |
| AD-800 | EgressPolicy enforcement at the container network layer — translates AD-456 allowlist to docker iptables / DNS-sidecar; closes the shell-subprocess HTTP-bypass hole that exists in both OpenClaw and Hermes; emits SANDBOX_EGRESS_BLOCKED events | #724 | 2 |
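
As an illustration of the AD-797 extraction threshold and stub format, the sketch below builds the `[Artifact: name vN - L lines, mime]` placeholder and applies the "more than 40 lines" rule for code blocks. Both helper names are hypothetical. Reading `vN` as the version number and `L` as the line count is an assumption drawn from the format string in the entry, not from the shipped extractor.

```python
ARTIFACT_MIN_LINES = 40  # from the entry: code blocks longer than 40 lines are extracted


def should_extract(code: str) -> bool:
    """True when a code block exceeds the 40-line threshold."""
    return len(code.splitlines()) > ARTIFACT_MIN_LINES


def artifact_stub(name: str, version: int, content: str, mime: str) -> str:
    """Build the ASCII-hyphen stub left in the message in place of the artifact body."""
    line_count = len(content.splitlines())
    return f"[Artifact: {name} v{version} - {line_count} lines, {mime}]"


if __name__ == "__main__":
    body = "\n".join(f"line {i}" for i in range(41))
    if should_extract(body):
        print(artifact_stub("report.md", 1, body, "text/markdown"))
```

The stub deliberately uses a plain ASCII hyphen, matching the entry's note about the stub format.
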
From the 2026-05-21 OpenClaw + Hermes Agent landscape survey (operator-surface gaps):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-801 | probos doctor — single-command operator health diagnostic CLI (parity with openclaw doctor / hermes doctor) | #725 | 2 |
| AD-802 | DM pairing + Visiting Officer trust integration — pairing-code gate for unknown senders on inbound channels; mints AD-701 DIDs with operator-chosen capability scope; hard prerequisite for AD-803..807 | #726 | 2 |
| AD-803 | Telegram channel adapter (python-telegram-bot, polling + webhook; pairing-gated; matches Hermes "$5 VPS + Telegram" deployment shape) | #727 | 2 |
| AD-804 | Slack channel adapter (Bolt SDK, socket-mode default, pairing-gated DMs, Slack mrkdwn rendering) | #728 | 2 |
| AD-805 | Microsoft Teams channel adapter (Bot Framework SDK + AD-749 M365 auth reuse + Adaptive Cards for A2UI surfaces) | #729 | 2 |
| AD-806 | Matrix channel adapter (matrix-nio + libolm E2EE; federation-aligned, self-hostable homeserver) | #730 | 3 |
| AD-807 | Discord adapter refresh onto AD-472 contract + AD-802 pairing + AD-791 threads + AD-797 artifact embeds | #731 | 3 |
| AD-808 | Migration tool — probos migrate openclaw / probos migrate hermes (cross-ecosystem import of memories, skills, API keys, command allowlists with AD-541b provenance tagging) | #732 | 2 |
| AD-809 | Per-thread personality override — /personality <name> slash command; concise/formal/socratic/expert/casual registry; composes with AD-793 project context — SHIPPED Wave 194 (registry at cognitive/personality_registry.py; parser+handler at cognitive/commands/personality_command.py; consumption rides AD-791a intent.thread_id into CognitiveAgent.decide() at ~L2117 between AD-596b skill-instr append and the three LLMRequest(system_prompt=composed, ...) sites; registry treated as identity overlay, never replacement, per Section 0 conceptual frame) | #733 | 3 |
| AD-810 | /insights [--days N] — operator-facing summary of recent agent activity (Hermes parity); pulls from dreaming consolidation + AD-794 thread names — forward marker | #734 | 3 |
| AD-811 | A2UI — agent-rendered interactive UI surfaces (form / list-pick / multi-select; OpenClaw "Canvas with A2UI" parity); pairs with AD-797 artifacts + AD-445 decision queue — forward marker | #735 | 3 |
| AD-812 | Natural-language scheduled automations — NL parser + /remind + /schedule + POST /api/schedule/nl consuming the existing AD-418 PersistentTaskStore; channel binding is best-effort name→default; no new scheduler — SHIPPED 2026-05-23 | #736 | 3 |
| AD-813 | Skills hub — community-curated skill catalog + probos skill install <url>; agentskills.io-compatible package format; commercial-overlay seam for curated / verified hub tier — forward marker | #737 | 3 |
| AD-825 | Drain-before-cancel shutdown semantics — insert drain phase before AD-824 cancel sweep; quiesce DreamScheduler before Phase 1 consolidation — SHIPPED Wave 192 | #760 | 2 |
| BF-296 | Shutdown Phase A — close IntentBus to new dispatches before consolidation; gates all four entry points (broadcast/send/dispatch_async/JetStream _on_dispatch term-not-nak) behind _closed flag; runs before AD-825 quiesce in startup/shutdown.py with 2s grace; honest-degrade via getattr/hasattr for transitional processes — SHIPPED 2026-05-23 | #771 | 2 |
| BF-300 | PTT mic gate + ttsActive guard — echo-loop fix; 5 PTT result callbacks now explicitly terminate capture before setTimeout(sendText); persistent onSpeechEvent subscription refuses new SR sessions during TTS playback; new MicIndicator muted violet-dashed-ring state — SHIPPED 2026-05-23 | #774 | 2 |
| BF-301 | Migrate browser STT from abandoned whisper.cpp WASM (upstream HF tag deleted, CDN dead, npm package incomplete) to @huggingface/transformers v3 + Web Worker + Xenova/whisper-tiny.en; new transformersStt.ts/transformersWorker.ts mirror AD-705a surface + new onTransformersProgress first-load channel; cognitive.primary_stt default flipped to transformers, whisper retained as deprecated alias; /api/voice/health adds model field; vite stt-vendor chunk keeps transformers.js out of main bundle — SHIPPED 2026-05-23 | #775 | 2 |
| AD-826 | Whisper-first STT priority — invert default PTT path so whisper.cpp WASM is primary; browser SpeechRecognition fallback after 2 empty whisper transcripts; new /api/voice/health filesystem probe — SHIPPED 2026-05-22 | #767 | 2 |
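
The AD-826 fallback rule (whisper first, browser SpeechRecognition after 2 empty whisper transcripts) can be sketched as a small counter. The shipped logic lives in the browser; this Python version only illustrates the decision. The class name, the engine labels, and the choice to count *consecutive* empty results and reset on a non-empty one are all assumptions, since the entry does not say how the count resets.

```python
class SttFallbackTracker:
    """Counts empty whisper transcripts and signals when to fall back."""

    def __init__(self, max_empty: int = 2) -> None:
        self.max_empty = max_empty  # from the entry: fall back after 2 empty transcripts
        self.empty_count = 0

    def record(self, transcript: str) -> str:
        """Record one whisper result and return the engine to use next."""
        if transcript.strip():
            self.empty_count = 0  # assumption: a non-empty result resets the count
        else:
            self.empty_count += 1
        if self.empty_count >= self.max_empty:
            return "speech_recognition"
        return "whisper"
```

A caller would feed each whisper result through `record` and switch engines only when it returns `"speech_recognition"`.
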
From the 2026-05-08 chat-experience enhancements (Captain's request):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-718 | Voice in 1:1 crew profile chat (parity with Ship's Computer chat) | #512 | 2 |
| AD-719 | Multi-agent chat surface (M365 Copilot pattern, @-mention + agent picker) | #513 | 2 |
| AD-720 | Chat attachments — file uploads + image paste + tool attach (v2) | #514 | 3 |
| AD-721 | 3D crew avatars on profile cards (popout, expressions, body language) | #515 | 3 |
From Wave 132 deferred forward markers (AD-706 Browser Tool follow-ups):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-706a | Captain-watch streaming bridge — live browser session in HXI — SHIPPED Wave 166 (new routers/browser_stream.py serves multipart/x-mixed-replace MJPEG from Playwright page.screenshot(type="jpeg"); BrowserSession.get_streaming_url() populated when BrowserToolConfig.streaming_enabled is True; 4 new config fields; public acquire_viewer_slot/release_viewer_slot/active_viewers API on BrowserTool (no private member access from router); 3 new EventType values; require_crew_scope extended with ?token= query-param fallback for <img> surfaces, empty-string token explicitly rejected; new BrowserStreamPanel.tsx HXI component; +11 pytest + 4 vitest) | #516 | 2 |
| AD-706b | Browser session video recording + retention policy — SHIPPED Wave 166 (Playwright record_video_dir opt-in via BrowserToolConfig.recording_enabled; v1 keeps .webm on disk under data/browser-sessions/<session_id>/; new RecordingReaper background sweeper enforces age-based retention + per-session size cap, dispatches blocking FS work through loop.run_in_executor, holds task reference + CancelledError re-raise per Async Discipline; 4 new EventType values; 5 new config fields; new admin endpoints in routers/browser_recordings.py for list/fetch/delete behind require_crew_scope with path-traversal rejection; AD-706b-1 forward marker for ffmpeg MP4 transcode; AD-706b-2 for AttachmentStore promotion; +9 pytest) | #517 | 3 |
| AD-706c | OmniParser-style vision extraction — SPLIT 2026-05-12 into AD-706c-1 + AD-706c-2 after AD-732 + BF-268..273 made prerequisites concrete | (closed-as-superseded #518) | — |
| AD-706c-1 | Visual verification of Browser Tool actions using existing local vision tier (qwen3.6:27b). Read-only "did the expected outcome appear?" flow. Builds on already-shipped AD-731/BF-268/AD-732 primitives. Demo value: agent narrates its own work | #642 | 2 |
| AD-706c-2 | Coordinate-aware compute_use tier for DOM-less surfaces (Anthropic computer-use / OpenAI Operator-style) — SHIPPED Wave 166 (new compute_use LLM tier as fifth peer of fast/standard/deep/vision in _LLM_TIERS; _TIER_ORDER promoted to module-level constant excluding vision + compute_use per BF-269; ModelRouter bypass extended per BF-273; is_vision_tier_configured extended to recognize "compute_use"; new tools/browser/compute_use.py with action_compute_use_click(intent) running 10-guard stack (8 inherited from AD-732 + Guard #9 coordinate-verify handshake reusing AD-706c-1 action_verify, Guard #10 per-session trust budget); late-bind handler registration after action_verify to avoid circular import; classify_action always tier-3 for compute_use_click; 4 new EventType values; 7 new CognitiveConfig fields + 2 new BrowserToolConfig fields; tier-2 honest-degrade throughout; AD-731 invariant preserved via SHA-256 refs through AttachmentStore; +16 pytest tests + 4 existing tests adapted) | #643 | 4 |
| AD-706d | LLM-driven tier classifier for Browser Tool actions — SHIPPED Wave 163 (new tools/browser/llm_classifier.py with classify_action_with_llm companion to the existing rule-based classify_action; safety floor preserved — LLM can only UPGRADE risk; REUSES VisionLLMRateLimit under scope browser_action_classifier; in-memory cache with configurable TTL; 4 new BrowserToolConfig fields default-OFF; sync entry-point via llm_client.complete_sync; +10 pytest tests) | #519 | 3 |
| AD-706d-2 | Counselor InterventionType pattern share with AD-561 (advances when AD-561 InterventionType API stabilises) | (forward marker, filed Wave 163) | 4 |
| AD-706d-3 | Persistent cache for AD-706d classifier (advances when in-memory cache hit rate is measurably low across operator sessions) | (forward marker, filed Wave 163) | 4 |
| AD-706e | Browser Tool action vocabulary v2 — SHIPPED Wave 166 (7 new verbs: drag tier-2 with tier-3 host escalation, key_combo tier-2 with destructive-combo tier-3 set, mouse_move tier-1 silent, mouse_button tier-2, upload_file always-tier-3 with forward-compatible AD-706f credential_ref hook, download tier-2 with executable-suffix tier-3 set, eval_js always-tier-3 with 4096-char cap; per-verb short-circuits in classify_action so AD-706f stacks additively; 3 new EventType values; +23 pytest) | #520 | 3 |
| AD-706f | Browser Tool credential vault integration — SHIPPED Wave 166 (new tools/browser/credentials.py with CredentialScope/CredentialMetadata, CredentialVault Protocol, EncryptedFileCredentialVault v1 backend; KEK from AuthConfig.crew_scope_token via stdlib hashlib.scrypt; values encrypted with cryptography.fernet.Fernet Apache-2.0/BSD; new nested CredentialVaultConfig under BrowserToolConfig; new fill_credential action tier-3-always with https + domain-scope checks; 5 new EventType values; AD-706e upload_file.credential_ref hook now consumed; +15 pytest. New pip dep: cryptography>=42) | #521 | 3 |
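
The AD-706d safety floor ("LLM can only UPGRADE risk") reduces to taking the higher of two tiers. The sketch below assumes tiers are integers where a larger number means a riskier action, consistent with the tier-1 to tier-3 labels used in this table. The function name and the `None` handling for an unavailable LLM verdict are hypothetical, and this is not the shipped `classify_action_with_llm`.

```python
from typing import Optional


def apply_safety_floor(rule_tier: int, llm_tier: Optional[int]) -> int:
    """Combine the rule-based tier with an optional LLM verdict.

    The rule-based tier is the floor: the LLM may raise the risk tier
    but can never lower it.
    """
    if llm_tier is None:
        return rule_tier  # assumption: no LLM verdict means the rule-based tier stands
    return max(rule_tier, llm_tier)
```

With this shape, an action the rules already place at tier 3 stays at tier 3 whatever the LLM returns, while a tier-1 action can still be escalated.
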
From Wave 133 deferred forward markers (AD-718 voice + AD-721 avatars follow-ups):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-718a | Agent-authored voice profile | #522 | 3 |
| AD-718b | Coqui/ElevenLabs/Bark TTS backend via AD-705 | #523 | 3 |
| AD-718c | Per-agent wake-word | #524 | 4 |
| AD-718d | Emotional voice modulation (synergy with AD-721) | #525 | 3 |
| AD-718e | Multi-language voice selection — SHIPPED Wave 166 (new language field on VoiceProfile Pydantic-style dataclass with __post_init__ normalisation: strip → empty maps to 'en' BEFORE regex validation; backward-compat from_dict defaults missing language to 'en'; voice.ts _resolveVoiceByLanguage helper prefers voices matching the profile's language family before degrading to findPreferredVoice; new <select> language filter dropdown above the voice picker in ProfileInfoTab.tsx; piper catalog expanded with es/fr/de/it/nl/pt entries from rhasspy/piper-voices Apache-2.0/MIT; +8 pytest + 5 vitest) | #526 | 4 |
| AD-718f / AD-735 | Per-agent volume control surface — SHIPPED Wave 156 (UI slider; backend chain shipped under AD-718) | #527 | 4 |
| AD-705d / AD-736 | Mic-permission UX polish (4-state machine + MicPermissionHint overlay) — SHIPPED Wave 156 | #558 | 4 |
| AD-722a-3 / AD-737 | Per-agent custom emotion taxonomy (v2 — beyond fixed 8) — SHIPPED Wave 156 | #612 | 3 |
| AD-721a | Captain's avatar editor UI | #528 | 3 |
| AD-721b | Phoneme-accurate lip-sync v1 (heuristic 5-vowel viseme driver, multi-mesh) — SHIPPED Wave 138 | #529 | 3 |
| AD-721b-1 | Server-side rhubarb-lip-sync backend (replaces heuristic phoneme schedule) — SHIPPED Wave 155 | #559 | 3 |
| AD-721b-1a | ffmpeg-backed audio format conversion for client-captured audio (audio/webm) — SHIPPED Wave 166 (new _resolve_ffmpeg_binary + _convert_to_wav in avatars/rhubarb_backend.py; BF-280 subprocess.Popen + loop.run_in_executor; BF-282 tempfile output via -y <path> not stdout; generate_visemes gains keyword-only ffmpeg_binary_path; honest-degrade to BF-292 path when ffmpeg missing or conversion fails; new LipSyncConfig.ffmpeg_binary_path default tools/ffmpeg/ffmpeg operator-provided (gitignored under /tools/); +9 pytest) | #663 | 3 |
| AD-721b-2 | Browser-side real-audio capture via MediaStreamDestination — SHIPPED Wave 155 | #560 | 3 |
| AD-721b-2.3 / AD-738 | Server-streamed TTS via Piper (closes the lip-sync loop — server is the source of audio bytes so rhubarb runs on real WAV) — SHIPPED Wave 157 | none (was forward marker) | 3 |
| AD-738f | Per-agent voice selection (CrewProfile.voice_model + UI selector with license display) — renumbered from AD-738a (Wave 158) | none | 4 |
| AD-738g | GPU-accelerated TTS backend eval (Kokoro Apache 2.0 / StyleTTS2 MIT slot into TTSBackend Protocol) — renumbered from AD-738b (Wave 158) | none | 4 |
| AD-738h | Server-side voice modulation (apply AD-735 pitch/rate at Piper synthesis, not <audio> post-processing) — renumbered from AD-738c (Wave 158) | none | 4 |
| AD-738i | TTS text caching layer (LRU keyed (agent_id, voice, sha256(text)) → attachment_id) — renumbered from AD-738d (Wave 158) | none | 4 |
| AD-721b-3 | whisper.cpp WASM tiny.en model bundle for offline phoneme alignment (~75 MB model) — SHIPPED Wave 179 (foundation prompt; operator-pull script scripts/whisper-tiny-en-fetch.ps1; browser whisperLoader.ts UMD-glue lazy-loader; Python voice/whisper_model.py resolver; CognitiveConfig.whisper_model_path field; one FieldDescriptor in LLM Tiers section; NO STT exposed in this AD; +6 pytest +5 vitest) | #561 | 4 |
| AD-705a | Offline STT via whisper.cpp WASM — SHIPPED Wave 179 (consumes AD-721b-3 loader; extends voiceActivity.ts with subscribePcm tap; new whisperStt.ts arms on VAD-bounded utterance; transcript dispatches through existing agent_chat path; STT badge in CameraLiveIndicator; CognitiveConfig.offline_stt_enabled default False; SpeechRecognition Tier-2 fallback preserved; privacy invariant — audio bytes NEVER leave browser; +5 pytest +8 vitest; +2 THIRD_PARTY_LICENSES entries — whisper.cpp MIT + Whisper weights MIT) | #555 | 3 |
| AD-705a-1 | transformers.js Whisper alternative backend | (forward marker, planned Wave 179) | 4 |
| AD-705a-2 | IndexedDB persistent model cache | (forward marker, planned Wave 179) | 4 |
| AD-705a-3 | Lazy CDN model fetch | (forward marker, planned Wave 179) | 4 |
| AD-705a-4 | Streaming / incremental decode | (forward marker, planned Wave 179) | 4 |
| AD-705a-5 | Moonshine alternative model backend | (forward marker, planned Wave 179) | 4 |
| AD-705a-6 | Multilingual / larger Whisper models | (forward marker, planned Wave 179) | 4 |
| AD-705a-7 | Fully-offline mode (disable SpeechRecognition when offline_stt_enabled=true) | (forward marker, planned Wave 179) | 4 |
| AD-705a-8 | Opt-in transcript audit log | (forward marker, planned Wave 179) | 4 |
| AD-705c | Custom wake-word training pipeline (openWakeWord) — SHIPPED Wave 179 (new probos wake-word collect/train/test/status CLI; WakeWordTrainer service module via loop.run_in_executor BF-280; three require_crew_scope API endpoints; WakeWordTrainerPanel HXI surface; wakeWord.ts loader prefers captain.onnx with stock fallback; WakeWordConfig Pydantic block default-off; NO openwakeword in pyproject — operator-installed; privacy invariant — training audio never leaves local runtime; +12 pytest +4 vitest; +1 THIRD_PARTY_LICENSES entry — openWakeWord Apache 2.0) | #557 | 3 |
| AD-705c-1 | Negatives-fetch script (Mozilla Common Voice CC0) | (forward marker, planned Wave 179) | 4 |
| AD-705c-2 | Howl transfer-learning trainer | (forward marker, planned Wave 179) | 4 |
| AD-705c-3 | EfficientWord-Net few-shot trainer | (forward marker, planned Wave 179) | 4 |
| AD-705c-4 | Multi-Captain wake words | (forward marker, planned Wave 179) | 4 |
| AD-705c-5 | Counselor-suggested retrain on FAR spike (agentic-first) | (forward marker, planned Wave 179) | 4 |
| AD-721c | VR / spatial-scene avatar mode | #530 | 4 |
| AD-721d | Agent-authored appearance pipeline | #531 | 3 |
| AD-721e | Skeletal animation library (Mixamo) | #532 | 4 |
| AD-721f | Cognitive-canvas avatar replacement | #533 | 4 |
| AD-721g | Per-tier baseline VRMs — SHIPPED Wave 167 (new BaselineVRMManifest Pydantic block on AvatarsConfig with four bare-filename slots ensign/lieutenant/commander/senior; new avatars/baseline_resolver.py maps rank → filename and verifies the file exists under <avatars_dir>/_baselines/; appearance read path inserts a baseline fallback between cache synthesis and parametric; no avatar bytes ship in the repo per AD-721i-1 license posture; +9 pytest; zero new deps) | #534 | 4 |
| AD-721h | Browser-based VRM upload UI — SHIPPED Wave 167 (new POST /api/agent/{agent_id}/appearance/vrm multipart endpoint validates glTF binary magic bytes, enforces cfg.avatars.max_vrm_size_bytes, dual-writes to AttachmentStore per AD-731 + named avatar cache via atomic os.replace, updates ProfileStore.vrm_url; HXI "Upload VRM" button + hidden file input on AgentProfilePanel; +8 pytest + 4 vitest; zero new deps) | #535 | 3 |
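The AD-721h row mentions validating glTF binary magic bytes and enforcing a size cap on uploads. A minimal sketch of that kind of check follows, assuming the standard 12-byte GLB header (magic `glTF`, version, total length, little-endian). The function name `check_glb_upload`, the rejection strings, and the version and length checks are illustrative assumptions, not the shipped endpoint code.

```python
import struct
from typing import Optional

GLB_MAGIC = b"glTF"      # first four bytes of a binary glTF container
GLB_HEADER_SIZE = 12     # magic + version + declared total length (4 bytes each)


def check_glb_upload(blob: bytes, max_size_bytes: int) -> Optional[str]:
    """Return None when the blob passes, otherwise a short rejection reason.

    Hypothetical helper: the roadmap only states that magic bytes and a
    size cap are checked; the version and length checks are extras.
    """
    if len(blob) > max_size_bytes:
        return "too_large"
    if len(blob) < GLB_HEADER_SIZE:
        return "truncated_header"
    # Header fields are little-endian: 4-byte magic, uint32 version, uint32 length.
    magic, version, declared_length = struct.unpack("<4sII", blob[:GLB_HEADER_SIZE])
    if magic != GLB_MAGIC:
        return "bad_magic"
    if version != 2:
        return "unsupported_version"
    if declared_length != len(blob):
        return "length_mismatch"
    return None
```

Checking the size cap first keeps the rejection cheap: an oversized body is refused before any header parsing happens.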
From 2026-05-09 agent-authored avatar pipeline (Captain decision; AD-721d refined; pair with AD-721i in Wave 134):
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-721d (refined) | Agent-side appearance reflection cycle → DSL proposal | #531 | 2 |
| AD-721d-1 | DSL draft preview + revision cycle (Captain "request revision" + iteration cap + parametric diff highlights) — SHIPPED Wave 145 (POST /appearance/propose extended with previous_dsl + iteration counter; new DELETE /appearance/proposal-history; CrewAvatarPopout request-revision affordance + amber-tint diff highlights; +13 Python tests, +7 Vitest tests, zero new deps) | #541 | 2 |
| AD-721d-2 | Counselor-mediated avatar revision (vs Captain-driven hint) | #621 | 4 |
| AD-721d-3 | Visual avatar preview before DSL persistence (requires AD-721i renderer) — SHIPPED Wave 167 (new POST /api/agent/{agent_id}/appearance/preview invokes the existing BlenderRenderer directly; returns SHA-256 AttachmentStore ref per AD-731; honest-degrades to 503/502 on renderer unavailability; HXI adds "Render preview" button + preview-VRM swap in CrewAvatarPopout; +8 pytest + 3 vitest; zero new deps) | #622 | 4 |
| AD-721d-4 | Persist avatar proposal history across runtime restarts — SHIPPED Wave 161 (proposal_history.configure(path) loads + binds on-disk sidecar; mutations append/clear/reset_all persist atomically under existing _lock; AvatarsConfig.proposal_history_path defaults to <data_dir>/proposal_history.json; 5 public signatures unchanged; AD-721d-1 module-level dict + RLock unchanged) | #620, #623 (dup) | 3 |
| AD-721d-4a | Migrate to ConnectionFactory-backed history store (advances when AD-697/698 lands a non-SQLite backend OR sidecar file > 1 MB OR a second module needs proposal-history-style restart-survival state) | (forward marker, filed Wave 161) | 4 |
| AD-721d-4b | Periodic compaction (purge entries older than 30 days with no terminal action) (advances when sidecar growth > 256 KB/week OR any single agent's history > 100 entries) | (forward marker, filed Wave 161) | 4 |
| AD-721i | DSL → Blender VRM renderer (headless backend) | #537 | 2 |
| AD-721i-1 | License-audited starter asset pack (manifest + audit, no asset bytes v1) — SHIPPED Wave 166 (new data/avatar-assets/MANIFEST.md audit ledger with RESEARCH/REJECTED dispositions for every candidate source - 4 RESEARCH (Quaternius CC0, KayKit CC0, Khronos glTF CC0, Poly Haven CC0 + 2 hair/outfit RESEARCH) + 4 REJECTED (MakeHuman AGPL-3.0, Mixamo Adobe TOS, Ready Player Me proprietary, VRoid Studio per-file metadata) - zero APPROVED in v1; new probos.avatars.asset_manifest with AssetManifestEntry NamedTuple + AssetManifest.load/approved/by_disposition parser + validate_license whitelist {CC0, CC0-1.0, MIT, Apache-2.0, Apache 2.0, BSD, BSD-2-Clause, BSD-3-Clause, CC-BY-4.0, CC-BY}; new scripts/avatar-assets-fetch.ps1 mirrors piper fetcher (parses manifest, downloads APPROVED rows to _<category>/<name>.<ext>, SHA-256 verifies, writes ATTRIBUTION.txt for CC-BY compliance); .gitignore rules for _base_meshes/, _hair/, _outfits/, _materials/, ATTRIBUTION.txt; new docs/development/avatar-assets.md operator doc; +6 pytest) | #542 | 3 |
| AD-721i-2 | VRoid Studio CLI alternative backend evaluation — REJECTED Wave 167 (research-only; new docs/research/vroid-cli-evaluation.md records the disposition: VRoid Studio is GUI-only proprietary EULA + Windows/macOS-only + no headless mode + non-deterministic export → three independent blocking constraints. Blender + saturday06 stays the v1 backend per AD-721i. Operator-elected VRoid output remains viable via AD-721g _baselines/ and AD-721h upload UI. Zero code, zero deps, zero tests) | #543 | 4 |
| AD-721j | Blender Connector — Computer Use control (Anthropic-style; commercial overlay extension exists in private repo) | #538 | 3 |
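Several rows above describe state that survives restarts through an on-disk JSON sidecar written atomically under a lock (AD-721d-4). A minimal sketch of that pattern follows, assuming a temp file in the same directory followed by `os.replace`. The names `persist_history` and `load_history` and the dict shape are hypothetical, not the signatures of the shipped `proposal_history` module.

```python
import json
import os
import tempfile
import threading
from pathlib import Path

_lock = threading.RLock()  # stands in for the module-level lock the row mentions


def persist_history(path: Path, history: dict) -> None:
    """Write the history as JSON so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with _lock:
        # The temp file must live on the same filesystem for os.replace to be atomic.
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(history, fh)
                fh.flush()
                os.fsync(fh.fileno())
            os.replace(tmp_name, path)  # atomic swap over any previous sidecar
        except BaseException:
            # Clean up the temp file on failure, then surface the error.
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise


def load_history(path: Path) -> dict:
    """Load the sidecar, treating a missing file as empty history."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
```

Because the swap is a single rename, a crash mid-write leaves either the old sidecar or the new one on disk, never a truncated mix.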
From 2026-05-09 Counselor feedback (avatar feedback loop) — novel territory:
Prior-art scan (issue #545) found no public OSS project where an AI agent monitors its own avatar's render state. Open-LLM-VTuber (7.6k★), kimjammer/Neuro (1.9k★), and super-agent-party (2.2k★) all run the same one-way LLM→avatar pattern. The standard framing is "does the human perceive the avatar as natural?" — AD-722 inverts it: "does the agent know what it looks like right now?" Functional self-presence awareness for embodied agents — pattern absorption only (VTube Studio plugin shape + A2F-3D blendshape stream), no code import.
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-722 | Agent-observable avatar telemetry feedback loop (UI → agent state channel) — SHIPPED Wave 140 (read-side v1; observe_self_avatar() + GET /api/agent/{id}/avatar-telemetry + | #545 | 3 |
| AD-722a | Intent-vs-presentation divergence detector (intended tone vs rendered weights → trust/Hebbian) — unprecedented in OSS LLM-avatar space | (forward marker, filed Wave 140) | 4 |
| AD-722a-5 | Divergence history surface in SelfImageTab — per-agent in-memory ring buffer + new /avatar-telemetry/divergence-history endpoint + PanelDivergenceHistory (aggregate %, event list). Server pre-renders OUTPUT-subject note so AD-727 #8 regex gates frontend too. SHIPPED Wave 147 | #614 | 3 |
| AD-722b | Push channel (WebSocket) replacing 2s poll — SHIPPED Wave 142 (WS /api/agent/{id}/avatar-telemetry-stream; popout flips sampling tier to HIGH via AvatarSamplingStateMachine.enter_popout/exit_popout; UI WS-first with 5 s open-timeout poll fallback; +28 Python tests, +4 Vitest tests, zero new deps) | #568 | 3 |
| AD-722b-3 | Fine-grained snapshot-diff for WS push — SHIPPED Wave 159 (compute_diff pure-function diffing with last_observed_at skipped, default-on, every-Nth-tick full reconcile, frame type:"snapshot"/"diff" versioning, frontend merge in SelfImageTab.tsx, forward markers AD-722b-3a RFC 6902 JSON-Patch + AD-722b-3b fan-out broker) | #600 | 3 |
| AD-722c | Avatar telemetry history for analytics — SHIPPED Wave 159 (append-only JSONL under data/avatar_telemetry/<agent_id>.jsonl; TelemetryHistoryWriter writes from _publish_loop initial + interval branches, log-and-degrade; new GET /api/agent/{id}/avatar-telemetry/history?limit=&since=; forward markers AD-722c-1 size-based rotation + AD-722c-2 TelemetryHistoryStore Protocol for commercial overlay) | #569 | 3 |
| AD-722d | Auto-write telemetry summaries to Ship's Records (RecordsStore) — SHIPPED Wave 159 (3 v1 events: emotion_divergence_high, working_state_transition_to_blocked, sustained_silence; per-agent throttle default 3600 s; Captain opt-in via records_auto_write_enabled; Tier-2 log-and-degrade; two-phase wiring in runtime.py; forward markers AD-722d-1 operator-defined classifiers + AD-722d-2 Records dedup/aggregation) | #570 | 3 |
| AD-722c-3 | Architect forward markers must use TECHNICAL triggers (NOT commercial-tier language) — SHIPPED Wave 160 (one bullet added to prompts/BUILDER-EXECUTION-PLAN.md Standing Rules; folded into AD-726 commit) | #654 | 3 |
| AD-723a-3 | SensoriumEntry gains injection_zone + wrapper metadata — SHIPPED Wave 160 (backward-compatible — both fields default None; dispatcher applies wrapper to string outputs only; _DM_SELF_WRAPPED_KEYS still v1 selector; forward markers AD-723a-3a per-entry migration + AD-723a-3b zone-driven ordering) | #626 | 3 |
| AD-723a-2 | WR branch consumer-side sensorium dispatch migration — SHIPPED Wave 161 (new _WR_SELF_WRAPPED_KEYS: ClassVar[tuple[str, ...]] = (); WR branch of _build_user_message invokes _dispatch_sensorium_async(SensoriumPath.WR_ONESHOT, ...) inside Tier-2 try/except; byte-parity preserved with empty selector; AD-723a-1 DM-branch tests still green) | #625 | 3 |
| AD-723a-2a | Populate _WR_SELF_WRAPPED_KEYS with first real consumer (advances when any new WR-only context fragment is proposed) | (forward marker, filed Wave 161) | 4 |
| AD-723a-3a | Per-entry migration off _DM_SELF_WRAPPED_KEYS (advances when 3+ entries gain wrapper set AND consumer code needs zone-driven iteration) | (forward marker, filed Wave 160) | 3 |
| AD-723a-3b | Zone-driven ordering (dispatcher iterates by injection_zone when consumer requests deterministic ordering across DM/WR paths) | (forward marker, filed Wave 160) | 3 |
| AD-722a-4 | Auto-correction loop on high-magnitude divergence — SHIPPED Wave 160 (default OFF; re-modulates prosody only; per-utterance budget; DivergenceResult gains corrected: bool; runtime.divergence_corrections cleared at reply-entry; apply_voice_modulation gains kw-only noise_scale_factor/length_scale_factor default-1.0 no-op) | #613 | 3 |
| AD-722a-4-1 | Per-emotion correction factors (advances when divergence-history analytics show emotions need different correction strengths) | (forward marker, filed Wave 160) | 4 |
| AD-722a-4-2 | Multi-utterance correction learning (advances when correction success rate stable above 60% for 100+ corrections OR adaptive baselines required) | (forward marker, filed Wave 160) | 4 |
| AD-730-2 | Multi-image DM policy — SHIPPED Wave 160 (hard cap 8 → HTTP 413; PIL downscale to 1024px box, AD-731 invariant preserved via NEW refs; per-Captain rolling 24h budget 50 → HTTP 429 with Retry-After; no new pip deps) | #632 | 2 |
| AD-730-2-1 | Persistent budget tracker — SHIPPED Wave 161 (JSON sidecar at <data_dir>/image_budget.json, configurable via AttachmentsConfig.image_budget_path, atomic temp-file + os.replace, persisted on append AND prune, Tier-2 throughout) | #656 | 3 |
| AD-730-2-1a | Throttle persistence writes (advances when observed write amplification on a heavy-image session exceeds 1 write per DM AND total file size > 64 KB) | (forward marker, filed Wave 161) | 4 |
| AD-730-2-1b | Migrate to ConnectionFactory-backed sidecar storage (advances when AD-697/698 Protocol lands a non-SQLite backend AND a second runtime-state-with-disk-sidecar AD also ships) | (forward marker, filed Wave 161) | 4 |
| AD-730-2-2 | Per-agent_type budget override (advances when analytics workloads need higher budgets than dialogue agents) | (forward marker, filed Wave 160) | 4 |
| AD-722b-4 | Fleet-level avatar telemetry stream (one WS, fan-out by agent_id) — SHIPPED Wave 160 (new endpoint /api/agent/avatar-telemetry/stream; per-agent endpoint preserved; every frame carries agent_id; fleet_stream_enabled default-ON; HXI hook stub useFleetAvatarTelemetry; per-agent store migration deferred to AD-722b-4a) | #601 | 3 |
| AD-722b-1 | Crew-scope auth substrate for telemetry surfaces — SHIPPED Wave 161 (new src/probos/routers/auth.py with require_crew_scope HTTP Depends and verify_ws_token pre-accept WS gate; hmac.compare_digest constant-time compare; new AuthConfig.crew_scope_token: str = "" default-OFF; applied to 4 endpoints — 2 HTTP + 2 WS; first auth substrate in the codebase) | #598 | 3 |
| AD-722b-1a | MagicMock(spec=SystemConfig) test fixture cleanup; remove routers/auth.py defensive guard — SHIPPED Wave 162 (7 sites migrated to real SystemConfig(); isinstance(token, str) guard removed from _configured_token; 3 additional test helpers got cfg.auth = AuthConfig() to preserve empty-token=auth-disabled contract; net test delta 0) | #657 | 3 |
| AD-729a | Peer-observation Standing Orders extension — SHIPPED Wave 162 (new config/standing_orders/peer_observation.md with 5 sections verbatim from Captain ruling; cross-references from ship.md and counselor.md; +7 pytest tests; unblocks AD-729 capability AD) | #588 | 3 |
| AD-720d-2.1 | Captain vision-capability approval flow — SHIPPED Wave 162 (3 new endpoints vision-capability/{propose,approve,history}; new CallsignRegistry.set_vision_capable; new avatars/vision_proposal_history.py sidecar; 2 new EventType values; +8 pytest tests; AD-731 invariant preserved) | #645 | 3 |
| AD-720d-2.1a | HXI UI surface for Captain pending-approval list (advances when AD-720d-2.1 ships AND Captain operates ProbOS for >7 days with multiple pending vision-capability proposals queued) | (forward marker, filed Wave 162) | 4 |
| AD-720d-2.1b | Auto-deny TTL when Captain unresponsive for >N hours (advances when ProbOS adopts an autonomous-Captain mode) | (forward marker, filed Wave 162) | 4 |
| AD-706c-1 | Browser Tool visual verify via vision tier — SHIPPED Wave 162 (new verify(expectation) action on BrowserTool, tier-1; screenshot via AttachmentStore.write SHA-256 ref; vision LLM call returns {ok, observation}; honest-degrade when vision tier unconfigured/unavailable; new EventType.BROWSER_VERIFY_OBSERVED; +10 pytest tests; AD-731 invariant preserved) | #642 | 3 |
| AD-706c-1a | Journal aggregation for verification pass/fail rates (advances when AD-674 graduated-initiative calibration needs the signal) | (forward marker, filed Wave 162) | 4 |
| AD-706c-3 | Cloud vision API integration — Anthropic computer-use beta (advances when an operator configures a cloud key AND opts in via explicit flag) | (forward marker, filed Wave 162) | 4 |
| AD-722a-1 | Vision-LLM intent-vs-render divergence detector — SHIPPED Wave 162 (new avatars/vision_intent_divergence.py with detector + VisionLLMRateLimit shared with AD-722e-2 + is_render_phrased AD-727 #8 enforcer; default-OFF flag; 3/hr/agent cap; runtime-constructed; DivergenceDetector callsite wiring deferred until AD-721i ships) | #610 | 3 |
| AD-722a-1a | HXI surface for vision-divergence events in SelfImageTab (advances when AD-721i backend renderer ref lookup is stable AND vision_intent_divergence_enabled flips True) | (forward marker, filed Wave 162) | 4 |
| AD-722e-2 | Vision-LLM self-render coherence verifier — SHIPPED Wave 162 (new cognitive/self_render_verify.py REUSING AD-722a-1's VisionLLMRateLimit + is_render_phrased; default-OFF; 3/hr/agent cap; AD-727 rule #1 read-only-on-trust verified by source-scan test; AD-731 invariant preserved; self_perception.py wire-up deferred until AD-721i ships) | #644 | 3 |
| AD-722e-2a | HXI SelfImageTab surface for render-coherence observations (advances when AD-721i ships AND self_render_verify_enabled flips True) | (forward marker, filed Wave 162) | 4 |
| AD-722a-2 | Chain-path divergence detection at compose-step emit — SHIPPED Wave 162 (new canonical CognitiveAgent.mark_chain_output_emitted hook + chain_divergence_buffer_for accessor; per-audience ring buffer maxlen=8; wired from chain compose consumer at cognitive_agent.py:2934; new EventType.DIVERGENCE_OBSERVED_CHAIN with path_tag="chain"; +10 pytest tests; AD-722a DM-path unchanged) | #611 | 3 |
| AD-722a-2a | Thread intent_self_tag and applied_modulation_rules through _execute_sub_task_chain (advances when chain phases reliably populate these signals) | (forward marker, filed Wave 162) | 4 |
| AD-721d-2 | Counselor-mediated avatar revision — SHIPPED Wave 162 (new mediate_appearance_revision intent on CounselorAgent + _mediate_appearance_revision handler; new POST /api/agent/{id}/appearance/mediate endpoint using intent_bus.send(IntentMessage(target_agent_id=...)); new EventType.APPEARANCE_REVISION_MEDIATED; +8 pytest tests) | #618 | 3 |
| AD-721d-2a | source field on ProposalEntry when AD-721d-1 doesn't carry one (advances when Captain audit signal is needed) | (forward marker, filed Wave 162) | 4 |
| AD-721d-2b | Per-domain mediator selection (Engineering officer mediates engineering avatars) (advances when >=2 domain agents need their own avatar palettes mediated) | (forward marker, filed Wave 162) | 4 |
| AD-721d-2c | HXI Counselor-mediation button in CrewAvatarPopout — SHIPPED Wave 163 (new onMediateRevision + counselorOnline props on CrewAvatarPopout.tsx; new inline-SVG mediate button rendered conditionally; refined-hint inline panel with iteration chip; error surface preserves Captain's original hint; +4 vitest tests; bundle index-a4x_HPw3.js → index-cAfin0aS.js) | #658 | 3 |
| AD-719b | Copilot-style left rail + Agents nav — SHIPPED Wave 163 (new components/leftrail/LeftRail.tsx self-contained presentational shell; default-OFF via localStorage hxi_left_rail_enabled; collapse toggle 240↔56px persisted to hxi_left_rail_collapsed; progressive disclosure max 5/3 first-time vs 12/8 veteran via hxi_visit_count; 3 stroke-based inline-SVG glyphs; +5 vitest tests; bundle unchanged - parent wire deferred) | #547 | 3 |
| AD-719b-parent-wire | Import LeftRail into App.tsx + wire zustand stores for online agents and recent threads (advances after AD-719b ships AND parent layout has the slot reserved) | (forward marker, filed Wave 163) | 4 |
| AD-719b-2 | Flip hxi_left_rail_enabled default to True (advances when Captain has used the left rail across ≥5 sessions) | (forward marker, filed Wave 163) | 4 |
| AD-719a | Persistent multi-agent chat threads under WardRoom — SHIPPED Wave 163, contract only (new ward_room/multi_agent.py with MULTI_AGENT_THREAD_MODE, create_multi_agent_thread, participants trailer + parsers + visibility helpers; architectural decisions: Captain-seeded, in-thread cross-agent observation YES, cross-thread NO; new MULTI badge in WardRoomThreadList.tsx; store type extended; +6 pytest + 4 vitest tests) | #546 | 3 |
| AD-719a-wire | Wire AD-719's transient fan-out into the AD-719a thread persistence layer (advances when AD-719a contract has been validated by ≥3 distinct multi-agent threads in operation) | (forward marker, filed Wave 163) | 4 |
| AD-719a-2 | Agent-to-agent multi-agent messages without Captain seed (advances when ≥10 multi-agent threads exist with cross-agent visibility working) | (forward marker, filed Wave 163) | 4 |
| AD-719a-3 | Cross-thread observation gated by AD-729 peer-perception (advances when AD-729 capability is default-ON for crew) | (forward marker, filed Wave 163) | 4 |
| AD-720a-1 | PDF / DOCX / XLSX document text extraction — SHIPPED Wave 162 (3 new permissive deps pypdf BSD-3 + python-docx MIT + openpyxl MIT; dispatch table in text_extractor.py; page/row/byte caps; Tier-2 parser-exception bubbling; vision_dispatch.py PDF gate extended to DOCX/XLSX; default-OFF flag AttachmentsConfig.pdf_extraction_enabled; +12 pytest tests; AD-731 invariant preserved) | #562 | 3 |
| AD-720a-1-1 | Flip pdf_extraction_enabled to True after operator feedback confirms extraction quality | (forward marker, filed Wave 162) | 4 |
| AD-720a-1-2 | OCR pipeline for scanned PDFs (image-bearing pages) | (forward marker, filed Wave 162) | 4 |
| AD-720b | Chat tool attach (capability grants for BrowserTool / MCP) — SHIPPED Wave 167 (new POST /api/chat/tool-grant reuses ToolPermissionStore.issue_grant so a grant issued via chat is indistinguishable on disk from one issued via /tool-access grant; ToolPermission imported from probos.tools.protocol; tool_registry.get() is the real API used; HXI /grant <agent_id> <tool_id> <permission> [hours] slash-command parser in IntentSurface; +11 pytest + 4 vitest; zero new deps) | #550 | 3 |
| AD-720c | Cloud file picker (OAuth + token vault) — SHIPPED Wave 168 (4 new endpoints under /api/cloud-pickers/{provider} — start/callback/files/attach; new src/probos/cloud_pickers/ module with Protocol + Google Drive v1 + OneDrive/Dropbox stubs; OAuth tokens persisted in AD-706f CredentialVault as JSON-serialized OAuthTokenBundle under ref cloud_provider:{provider}:{captain_id} with CredentialScope() empty-frozenset = captain-only; access_type=offline+prompt=consent on Google consent URL so refresh_token is issued; refresh-on-401 retry-once helper; CsrfStateStore in-memory single-consume TTL set; new CloudPickersConfig + per-provider CloudPickerProviderConfig default-OFF; AD-731 invariant preserved — downloaded bytes flow through _validate_and_store_attachment → AttachmentStore.write(sha, blob, mime), browser receives SHA ref only; new CloudPicker.tsx modal in HXI; +22 pytest +7 vitest; zero new pip/npm deps — httpx + cryptography already resident) | #551 | 3 |
| AD-720c-1 | OneDrive provider implementation (advances on operator demand once Google Drive v1 has been exercised end-to-end in production for at least one wave) | (forward marker, filed Wave 168) | 4 |
| AD-720c-2 | Dropbox provider implementation (advances on operator demand once Google Drive v1 has been exercised end-to-end in production for at least one wave) | (forward marker, filed Wave 168) | 4 |
| AD-728 | Vision-LLM render-coherence mirror function — SHIPPED Wave 163 (new avatars/render_verification.py with RenderCoherenceResult + module-level verify_render_coherence(runtime, agent_id, trigger, ...) REUSING AD-722a-1's VisionLLMRateLimit scope render_verification + is_render_phrased; three triggers — captain_command slash /verify-render, divergence_followup gated by new render_verification_followup_enabled flag, agent_initiated_stub hard-rejected; default-OFF render_verification_enabled; 3/hr/agent cap; phrasing re-prompt-then-drop; cost-discipline coherent observations not logged; new EventType.RENDER_DIVERGENCE_OBSERVED; AD-731 + AD-727#1 source-scan tests; +15 pytest tests) | #586 | 3 |
| AD-728a | Richer coherence scoring — replace string-compare baseline with embedding-distance scoring (advances when RENDER_DIVERGENCE_OBSERVED event volume exceeds 50 events/quarter) | (forward marker, filed Wave 163) | 4 |
| AD-728b | Auto-correction proposals for render divergence (advances when AD-728a embedding scoring is stable AND drift pattern catalog has ≥10 distinct categorized causes) | (forward marker, filed Wave 163) | 4 |
| AD-728c | Agent-initiated render self-check with contextual rate limits — SHIPPED Wave 164 (flips AD-728's agent_initiated_stub from hard-reject to gated two-budget rate-limited path; default-OFF render_self_check_enabled; per-active-conversation budget default 2 applies INSTEAD OF hourly budget default 3 when last_reply_emitted_at within render_self_check_active_window_seconds default 600s; new async CognitiveAgent.check_own_render(reason) folds every outcome into working memory via AgentWorkingMemory.record_observation; event-bus cost discipline preserved — coherent agent-initiated calls do NOT emit; _last_reply_emitted_at uses public runtime.registry.get(agent_id) API per BF-287; existing AD-728 hard-rejected test renamed to default-off-preserves-baseline; +12 pytest tests; AD-731 + AD-727#1 source-scans preserved) | #660 | 3 |
| AD-728c-1 | Per-conversation budget reset on Captain-acknowledged correction (advances when AD-572 produces per-conversation correction signals AND AD-728c telemetry shows agents exhausting per-conversation budget before resolving a divergence) | (forward marker, filed Wave 164) | 4 |
| AD-728c-2 | Counselor mediation of self-check requests (advances when AD-721d-2 generalized to render self-checks AND ≥3 distinct agents have requested self-checks in production traffic) | (forward marker, filed Wave 164) | 4 |
| AD-728c-3 | VisionLLMRateLimit._windows per-conversation bucket GC (advances when _windows size exceeds 1000 entries in production OR any other AD reuses the render_self_check_conv:<ts> scope pattern) | (forward marker, filed Wave 164) | 4 |
| AD-728d | Self-image-awareness skill (LLM-discoverable self-check capability) — SHIPPED Wave 165 (new config/skills/self-image-awareness/SKILL.md augmentation skill scoped to direct_message,ward_room_notification,proactive_think intents; new [SELF_CHECK reason] bracket marker with [a-z_-]{1,64} reason grammar — invalid reasons silently strip with no dispatch; new DmSanityGate._SELF_CHECK_RE/_SELF_CHECK_STRIP_RE/extract_self_check/strip_self_check adjacent to existing CHALLENGE/MOVE pair; new DmReplyPipeline.step_4_self_check_parse inserted between move-parse and episodic-store with renumber of trailing steps to 5..9; fire-and-forget asyncio.Task reference held on DmReplyContext._self_check_task per fire-and-forget GC rule; first marker dispatches agent.check_own_render(reason=...), additional markers stripped with single WARNING; closes AD-728c discoverability gap surfaced by Counselor 2026-05-16; +7 pytest tests; AD-731 invariant preserved — text-only marker, no inline blob introduction) | #661 | 3 |
| AD-729 | Peer avatar perception governance contract — SHIPPED Wave 163 (new avatars/peer_perception.py exports ObservationRegister/PeerObservation/async observe_peer/async request_permission/composite_impressions_for; four mechanical floors — reputation/routing read-only, observed opt-out via CrewProfile.peer_perception.enabled, backend-render-only, cross-federation federation_review_required honest-degrade; new PeerPerceptionProfile dataclass on CrewProfile; 5 new EventType values; 3 new AvatarsConfig fields default-OFF; persistence via records_store.write_entry artifact peer_observation; permission grants single-use 5-min TTL deny-silent default; +18 pytest tests with real AgentRegistry-shape fixture per BF-287; AD-731 invariant preserved) | #587 | 3 |
| AD-729-impressions-hookup | Wire composite_impressions_for into project_self_perception (advances when AD-729a Standing Orders ship AND ≥1 officer certified per AD-729b) | (forward marker, filed Wave 163) | 4 |
| AD-729-capability-flip | Flip peer_perception_enabled default to True for crew agents (advances when AD-729a ships AND ≥3 officers passed AD-729b certification) | (forward marker, filed Wave 163) | 4 |
| AD-722a-6 | Cross-agent intent-vs-presentation divergence observation — SHIPPED Wave 163 (new async observe_peer_divergence in avatars/peer_perception.py consuming AD-722a-1 runtime.divergence_history and delegating to AD-729 observe_peer for governance; pure-template _format_divergence_summary; three pre-delegation gates + AD-729's eight; dual default-OFF flags cross_agent_divergence_observation_enabled AND peer_perception_enabled; new EventType.CROSS_AGENT_DIVERGENCE_OBSERVED; +10 pytest tests; AD-731 invariant preserved) | #615 | 3 |
| AD-722a-6-flip | Flip cross_agent_divergence_observation_enabled default to True for OPERATIONAL register (advances when AD-729a Standing Orders ship AND AD-729 capability is default-ON for crew) | (forward marker, filed Wave 163) | 4 |
| AD-729b | Peer-observation conduct training module — SHIPPED Wave 163 (new cognitive/peer_observation_training.py with load_module/grade_module/peer_observation_graduation_gate/set_peer_observation_certified; new config/manuals/peer_observation_conduct.yaml scaffold with six AD-729b sections + weighted rubric; 2 new EventType values; 2 new QualificationConfig fields default-OFF; mirrors AD-720d-2.1 set_vision_capable mutation API; +8 pytest tests) | #589 | 3 |
| AD-729b-2 | LLM-graded training module (advances when AD-729b deterministic rubric has graded ≥10 officers AND consistency verified) | (forward marker, filed Wave 163) | 4 |
| AD-729b-flip | Flip peer_observation_certification_required default to True (advances when AD-729a Standing Orders ship AND AD-729b module content is complete with Counselor sign-off) | (forward marker, filed Wave 163) | 4 |
| AD-729c | Counselor pattern-monitoring for peer-observation conduct — SHIPPED Wave 163 (new cognitive/peer_observation_monitor.py with PatternFinding + Protocol + seven concrete detectors + aggregate_health_metrics + PeerObservationMonitor orchestrator; fixed 60s cadence via module constant; three-tier escalation with sidecar JSON state persistence; 4 new EventType values; Counselor-own-conduct + trust-read-only source-scan tests; +23 pytest tests) | #590 | 3 |
| AD-729c-1 | Sampling interval as ClinicalTelemetryConfig field (advances on first operator request to tune the 60s cadence OR first production data showing the rate is wrong) | (forward marker, filed Wave 163) | 4 |
| AD-729c-2 | LLM-assisted phrasing-drift detection (advances when regex RegisterDriftDetector produces ≥20 findings with ≥80% manual-review precision) | (forward marker, filed Wave 163) | 4 |
| AD-729c-3 | Cross-mesh pattern detection (advances when federation peer-observation arrives via AD-480 review path) | (forward marker, filed Wave 163) | 4 |
| AD-729c-tier1-wire | Wire Tier-1 events to the Counselor 1:1 message channel (advances when AD-635 message-channel API stabilises for in-test wiring) | (forward marker, filed Wave 163) | 4 |
| AD-729c-tier3-wire | Wire Tier-3 events to AD-635 bridge alert API (advances when AD-635 alert-creation signature is grep-verifiable on the runtime) | (forward marker, filed Wave 163) | 4 |
| AD-729d | Peer-observation reinforcement loop — FORWARD MARKER (Wave 163 doc-only). Advances to a build prompt when ALL of: (A) PEER_OBSERVATION_RECORDED count ≥100 across ≥3 observer/observed pairs over a continuous 2-quarter window; (B) zero PEER_OBSERVATION_INTERVENTION_TIER_3 events across the same window AND no Tier-2 escalated to Tier-3 retry; (C) AD-729a Standing Orders shipped AND content includes a reinforcement-specific section reviewed by Counselor; (D) AD-729b module YAML includes reinforcement content AND ≥3 officers passed the extended module; (E) Captain explicit ruling at design stage documented in DECISIONS.md | #591 | 4 |
| AD-722b-5 | Federation cross-mesh telemetry push — SHIPPED Wave 162 (LOCAL-MESH PORTION ONLY) (new federation/telemetry_relay.py with FederationTelemetryRelay + PeerTelemetrySubscription; subscription registration + per-peer outbound rate-limit + agent_id filter + pluggable emit callback; +8 pytest tests; federation hop forward-marked AD-722b-5a) | #602 | 3 |
| AD-722b-5a | Wire FederationTelemetryRelay.set_emit_callback to FederationBridge.forward_telemetry (advances when AD-480e/g matures the bridge with a streaming/relay primitive — the bridge today exposes only forward_intent single-shot RPC) | (forward marker, filed Wave 162) | 4 |
| AD-722b-5b | HXI surface to render remote agents with origin_mesh_id badge (advances when AD-722b-5a ships AND multi-mesh deployments are in production) | (forward marker, filed Wave 162) | 4 |
| AD-722b-1b | Apply require_crew_scope to remaining read endpoints (chat history, agent profile, etc.) (advances when AD-722b-1 ships AND any auth-required-endpoint feature request lands) | (forward marker, filed Wave 161) | 4 |
| AD-722b-1c | Federation-bridge JWT verification (advances when AD-480 federation framework adds cross-mesh agent reads) | (forward marker, filed Wave 161) | 4 |
| AD-722b-1d | Token rotation + TTL (advances when any single deployment runs > 90 days with a static secret OR security scanner flags long-lived shared-secret use) | (forward marker, filed Wave 161) | 4 |
| AD-722b-4a | HXI fleet-hook integration — SHIPPED Wave 161 (useFleetAvatarTelemetry wired into CognitiveCanvas.tsx; new useStore.avatarTelemetry Map + setAvatarTelemetryFrame action; snapshot/diff/ping/error frame handling; per-agent SelfImageTab WS unchanged; bundle hash changed index-BDgoocuQ.js → index-D0tUvFeA.js proving new code ships) | #655 | 4 |
| AD-722b-4b | Migrate SelfImageTab.tsx per-agent WS consumer to read from useStore.avatarTelemetry (advances when avatarTelemetry map reaches 2+ canvas consumers AND fleet endpoint snapshot+diff parity with per-agent endpoint is verified by integration test) | (forward marker, filed Wave 161) | 4 |
| AD-722b-4c | Canvas-side selectors useAgentEmotion(agent_id) + useAgentWorkingState(agent_id) (advances when more than one canvas component reads avatarTelemetry directly AND re-render cost becomes measurable) | (forward marker, filed Wave 161) | 4 |
| AD-722b-4-1 | Dynamic crew membership during fleet-stream lifetime (advances when crew spawn/despawn during stream is observed in production) | (forward marker, filed Wave 160) | 4 |
| AD-726 | DM post-LLM cleanup chain extracted into DmReplyPipeline (8 ordered steps; agent_chat shrinks 574→~305 lines) — SHIPPED Wave 160 (partial close of #584) (pre-LLM DmContextPrep → AD-726a, DmPromptAssembler → AD-726b, frozen cross-phase shapes + snapshot fixture suite → AD-726c) | #584 | 3 |
| AD-726a | Pre-LLM DmContextPrep extraction (AD-725 targeted-recall, AD-730 vision-message build, AD-720d text augmentation, AD-722 self-observation refresh) | (forward marker, filed Wave 160) | 3 |
| AD-726b | DmPromptAssembler extraction from cognitive_agent.py:_build_user_message DM branch | (forward marker, filed Wave 160) | 3 |
| AD-726c | Frozen cross-phase shapes (DmObservation, DmReply) + byte-identical snapshot fixture suite | (forward marker, filed Wave 160; advances when AD-726a + AD-726b have both landed) | 3 |
| AD-722e | Visual self-perception via image rendering (agent compares rendered avatar vs intent) | (forward marker, filed Wave 140) | 4 |
| AD-722-1 | Modulation rule table → YAML/JSON manifest (TS + Python single source of truth; closes byte-parity duplication) | (forward marker, filed Wave 140) | 4 |
| AD-723 | Ship as persistent shared virtual space — co-presence protocol, agent-position state, multi-agent extension of AD-722 telemetry (extension point) | (filed after AD-722) | 4 |
| AD-723-C (Commercial) | Polished multi-user 3D ship-interior UI, world layout, room semantics, meeting surfaces, operator controls — builds on AD-723 protocol | (commercial overlay) | — |
| AD-724 | Away-mission protocol — agent embodiment in external virtual worlds (VRChat / open metaverse), episodic memory of places | (filed after AD-723) | 4 |
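
As a rough illustration of how the AD-729d advancement criteria (A) and (B) from the table above might be checked mechanically, here is a minimal sketch. The `PeerObservation` record and the `gate_a_b_satisfied` function are hypothetical names, and the sketch assumes the events have already been filtered to the continuous 2-quarter window. It does not model the "no Tier-2 escalated to Tier-3 retry" clause, and criteria (C) through (E) depend on human review, so they are out of scope here. Nothing below reflects shipped ProbOS code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PeerObservation:
    """Hypothetical record for one PEER_OBSERVATION_RECORDED event."""
    observer: str
    observed: str


def gate_a_b_satisfied(
    observations: Sequence[PeerObservation],
    tier3_events: int,
    min_count: int = 100,  # criterion (A): at least 100 recorded observations
    min_pairs: int = 3,    # criterion (A): across at least 3 observer/observed pairs
) -> bool:
    """Return True when criteria (A) and (B) both hold for the supplied window.

    Assumes `observations` and `tier3_events` were already restricted to the
    same continuous window; the windowing itself is not implemented here.
    """
    distinct_pairs = {(o.observer, o.observed) for o in observations}
    enough_volume = len(observations) >= min_count
    enough_spread = len(distinct_pairs) >= min_pairs
    no_tier3 = tier3_events == 0  # criterion (B): zero Tier-3 interventions
    return enough_volume and enough_spread and no_tier3
```

Keeping the thresholds as keyword arguments means the numbers in the roadmap row stay the single place to change them if the Captain revises the gate.
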
Wave 151 / Wave 152 — vision DM payload chain:
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-730 | Vision pipe-through for per-agent DMs — SHIPPED Wave 151, partial regression resolved by AD-731 (Wave 152) | #630 | 2 |
| AD-731 | Content-addressable vision payloads (refs not bytes on the bus; receiver dereferences from AttachmentStore just before HTTP POST) — SHIPPED Wave 152 (12 new tests + BF-265/BF-266/AD-730 fixture assertions inverted; +13 net) | #637 | 1 |
| AD-637z2 | Remove BF-265 transport strip after AD-731 lands — CLOSED-AS-PART-OF-AD-731 (Wave 152) | #639 | 1 |
| AD-731a | Cross-host attachment distribution (parent forward marker; single-host store assumption deferred from AD-731) | #638 | 3 |
| AD-731a-1 | HTTP fetch for cross-host single-tenant attachment retrieval | (sub-marker under AD-731a) | 3 |
| AD-731a-2 | NATS Object Store integration for cross-mesh attachment distribution; retires federation/bridge.py vision_messages strip | (sub-marker under AD-731a) | 3 |
| AD-731a-3 | Mime-only fast path in sender (skip blob read for image attachments) | (sub-marker under AD-731a, optional) | 4 |
| AD-732 | Dedicated vision LLM tier + honest degrade (vision is the fourth peer of fast/standard/deep; AttachmentsConfig.vision_tier default flips to "vision"; unconfigured OR unhealthy returns VISION_UNCONFIGURED_MESSAGE/VISION_UNHEALTHY_MESSAGE) — SHIPPED Wave 153 (15 new tests; +15 net) | #640 | 1 |
| AD-732a | Per-agent vision tier override (agent.vision_tier config — different model for an Imaging Officer than the rest of the crew) | (forward marker) | 4 |
| AD-732b | Vision tier autodetect on startup (probe localhost:11434 and auto-uncomment qwen3.6:27b if available — zero-config OSS magic) | (forward marker) | 4 |
| AD-732c | Vision tier hot-reload on config change (operator edits system.yaml; vision tier reloads without restart) | (forward marker) | 4 |
| AD-733 | Live camera stream perception (umbrella). HXI samples webcam frames → AttachmentStore → vision_observation intent on the bus → ObserverAgent maintains visual working memory + emits configured visual events. Same wire format / vendor adaptation as AD-731/BF-268; the new layer is cadence and proactive policy. Demo-grade capability — paired with AD-721 avatar this is what makes the mesh feel alive. | #641 | 4 |
| AD-733a | Real-time vision tier (llm_model_vision_fast for sub-1s per-frame inference + llm_model_vision_deep for occasional narrative summaries) + agent visual working memory (last-N-frames hot buffer). Identity matching against the Captain Card avatar (AD-739) — "person in frame" → "the Captain in frame" when avatar_ref matches. | (sub-marker under AD-733) | 4 |
| AD-733b | ObserverAgent + proactive event surfacing (graduated initiative for visual events; AD-674/AD-675 calibration extended from text to perceptual triggers) | (sub-marker under AD-733) | 4 |
| AD-739 | Captain Card — operator self-card always-in-context across all CognitiveAgent prompts — SHIPPED Wave 163, contract only (new captain_card/card.py with CaptainCard Pydantic model + CorrectionRef + default_captain_card bootstrap + load_card/save_card atomic JSON sidecar + render_card_for_prompt pure-template renderer with truncation; AD-588/589/592 confabulation-guard via _CAPABILITY_GAP_RE; AD-731 SHA-256 validator on avatar_ref; 4 new CognitiveConfig fields default-ON; +10 pytest tests; prompt-builder injection + Dreaming loop wire-up deferred to forward markers AD-739-prompt-wire / AD-739-dreaming-wire) | #649 | 3 |
| AD-739-prompt-wire | Inject render_card_for_prompt output into prompt_builder.build_system_prompt and the per-CognitiveAgent system-prompt assembly (advances when Captain validates the rendered Card content) | (forward marker, filed Wave 163) | 4 |
| AD-739-dreaming-wire | Wire the Captain Card refresh into the Dreaming consolidation loop using captain_card_refresh_min_interval_seconds throttle (advances when AD-739-prompt-wire ships AND ≥10 high-importance correction episodes accumulate) | (forward marker, filed Wave 163) | 4 |
| AD-740 | Affect-vs-intent drift trend summary (window/threshold over the AD-722a-5 ring buffer) — SHIPPED Wave 169 (new avatars/affect_drift.py pure read-only summariser; AvatarsConfig.affect_drift_default_window=8 + affect_drift_threshold=0.7; check_own_render folds drift summary into working memory; +8 pytest; forward markers AD-740-1/-2/-3) | #664 | 3 |
| AD-740-1 | Auto-correction of sustained drift (longest_streak ≥ 4 → corrective intervention pathway) | (forward marker, filed Wave 169) | 4 |
| AD-740-2 | Cross-agent drift comparison surface (counselor-mediated clinical pattern detection across ≥2 agents) | (forward marker, filed Wave 169) | 4 |
| AD-740-3 | Persistence beyond in-memory ring (SQLite sidecar for longitudinal drift study across process restarts) | (forward marker, filed Wave 169) | 4 |
| AD-730-3 | Agent image generation in DM replies via [GEN_IMAGE prompt] bracket marker — SHIPPED Wave 169 (new cognitive/image_gen_dispatch.py OpenAI Images API v1 client + new sixth peer image_gen LLM tier + AD-731-compliant AttachmentStore.write SHA refs on the response payload + step_4c_image_gen_parse letter-suffix pipeline step + AD-727 first-invocation wellness review WARNING + AD-541b anchored episode + full eight-guard audit (_LLM_TIERS extended, _TIER_ORDER excluded, ModelRouter bypassed, LLMResponseCache bypassed, all 8 tier_config maps extended, health probe scaffolding extended); 5 AvatarsConfig knobs default-OFF; +20 pytest; zero new deps; forward markers AD-730-3-1/-2/-3/-4/-5) | #633 | 2 |
| AD-730-3-1 | Per-conversation + per-day cost gating budget for image_gen (Captain ACK on overrun) | (forward marker, filed Wave 169) | 4 |
| AD-730-3-2 | Image moderation classifier (NSFW / safety / policy review on generated images) | (forward marker, filed Wave 169) | 4 |
| AD-730-3-3 | Provenance watermarking + C2PA-shape metadata embedding on generated images | (forward marker, filed Wave 169) | 4 |
| AD-730-3-4 | HXI rendering of agent-generated attachment_ids on the DM reply surface | (forward marker, filed Wave 169) | 4 |
| AD-730-3-5 | Ward Room wiring of [GEN_IMAGE ...] bracket marker in the WR reply pipeline | (forward marker, filed Wave 169) | 4 |
| AD-741 | Settings / Control Panel HXI shell — SHIPPED Wave 170 (new /api/config GET/GET-yaml/POST endpoints + single-consume CSRF token; new src/probos/settings/section_registry.py SECTIONS tuple of 10 wired sections across 4 domains; runtime.config_path attribute added; secret-field rule with three-layer defense (GET redaction + YAML scrub + POST rejection); new HXI overlay SettingsPanel wired into TopNav; +19 pytest; +10 vitest; zero new deps; forward markers AD-741-1/-2/-3/-4/-5/-6/-7) | (new AD; no GH issue) | 2 |
| AD-733 | Camera streaming v1 — frame ingestion + vision_observation IntentDescriptor + Perception Settings section — SHIPPED Wave 170 (new PerceptionConfig/CameraStreamConfig default-OFF both layers; new src/probos/perception/ module registers VISION_OBSERVATION_DESCRIPTOR + inserts Perception section into AD-741 registry; new /api/perception/camera/frame multipart endpoint reuses _validate_and_store_attachment chain — AD-731 invariant enforced; per-session token-bucket rate limiter; AD-541b anchored episode on first frame; new useCameraStream hook + persistent CameraLiveIndicator + Settings PerceptionLivePanel; v1 has NO LLM consumer by design — AD-733a covers it; +13 pytest; +5 vitest; zero new deps — all browser-native APIs; forward markers AD-733a/-b/-1/-2 filed as new issues #665/#666/#667/#668) | #641 | 2 |
| AD-733a | Vision consumer + supervisor + working memory + DM context injection — SHIPPED Wave 171 (three new src/probos/perception/ modules: supervisor.py pluggable Strategy Protocol + aHash+throttle default; working_memory.py per-agent ring buffer with BF-294 confabulation-guard empty-buffer sentinel; consumer.py runtime-owned VisionConsumer reuses AD-732 vision tier via build_multimodal_messages; 5 new PerceptionConfig fields; finalize wiring iterates runtime.registry.all() per BF-287; routers/agents.py prepends scene block ahead of bus send, gated on perception.enabled; NeuralCompanion three-tier pattern absorbed (MIT, architecture only); AD-731 invariant enforced via source-scan test; +19 pytest + 1 Captain acceptance test) | #665 | 2 |
| BF-298 | Settings parent/child toggle UX + perception status badge — SHIPPED Wave 171 (UI-only; FieldRow gains disabled/disabledReason threaded through all five input branches with aria-disabled + title tooltip + opacity 0.4 + not-allowed cursor; section render loop computes per-field disabled for perception by reading master perception.enabled from draft fallback snapshot — master stays clickable; PerceptionLivePanel gains 3-state perception-status-badge row — OFF/0-modalities/camera-live; HXI Design Principle #3 honored — no emoji, uppercase mono + stroke-only colors; +6 vitest) | (BF — no GH issue) | 2 |
| AD-733b | ProactiveVisionObserver + identity hook — SHIPPED Wave 171 (new src/probos/perception/observer.py with two triggers — scene_introduction (once per camera session) + high_novelty (bounded by max_emissions/dwell/threshold); observer sends synthesized [SYSTEM-INITIATED: ...] user-turn so agent's own LLM composes the visible reply; identity hook in VisionConsumer asks vision LLM one word — captain/other/unknown — against captain_avatar_ref; five new PerceptionConfig fields; finalize wires observer onto consumer when both enabled; +10 pytest; six AD-742 forward markers cited) | #666 | 2 |
| AD-733-1 | AttachmentStore retention policy + reaper for camera frames (vision_observation tag) | #667 | 4 |
| AD-733-2 | Passive screen sensing via getDisplayMedia — vision_observation source="screen", per-source HXI toggle + rate limit, SCREEN LIVE indicator alongside CAMERA LIVE, new PerceptionConfig.screen sub-block (multi-camera narrowed out — already shipped via AD-742c per-agent bindings). Wave 178 draft. | #668 | 2 |
| AD-733-2-1 | VisionConsumer per-source novelty thresholds (vision_screen_novelty_threshold separate from camera) | (forward marker, planned Wave 178) | 4 |
| AD-733-2-2 | Real-time WebRTC screen track ingestion (replaces multipart frames) | (forward marker, planned Wave 178) | 4 |
| AD-744 | Interactive share-to-agent — one-shot DM-attached screen frame; "Share screen to {agent}" button on DM composer; one getDisplayMedia capture per click; reuses existing /camera/frame endpoint with source=screen + force=true + agent_ids=<id> (bypasses supervisor novelty gate per BF-302); inline attachment to next DM turn via existing attachment_ids. Wave 178 draft. | (new top-level; GH issue filed at wave close per AD-722c-3) | 2 |
| AD-744-1 | Cross-agent share fan-out (share-to-many) | (forward marker, planned Wave 178) | 4 |
| AD-744-2 | Region masking / redaction primitive before share | (forward marker, planned Wave 178) | 4 |
| AD-744-3 | In-HXI preview modal with redact-region affordance | (forward marker, planned Wave 178) | 4 |
| AD-745 | Conversation -> action handoff (browser scope v1; Playwright dispatch) — [ACTION: <json>] bracket marker on DM replies; new step_5_action_dispatch pipeline stage; dispatch via existing AD-706 BrowserTool _HANDLERS; tier 1/2/3 ladder via existing classify_action; new BrowserToolConfig.destructive_url_patterns allow-list (financial/auth/admin) that forces matched URLs to tier-3; new endpoints POST /api/browser/actions/{id}/ack and /abort, plus GET /api/browser/actions/{thread_id}; new AgentActionLog HXI component (collapsed default per HXI #5) + tier-2 ACK card + tier-3 destructive-confirmation modal; per-action stop affordance. Agent acts in BrowserSession's isolated Chromium context (NOT Captain's logged-in profile — safety invariant). AD-541b anchor with new channel="action" + trigger_type="agent_action_executed" + before/after frame refs. Default-OFF Wave 10 #14. Wave 178 draft. | (new top-level; GH issue filed at wave close per AD-722c-3) | 2 |
| AD-745-1 | DesktopActionTool OS-pointer scope (broader desktop, beyond browser) | (forward marker, planned Wave 178) | 4 |
| AD-745-2 | Multi-agent quorum on destructive-pattern URL matches (requires_consensus=True) | (forward marker, planned Wave 178) | 4 |
| AD-745-3 | OmniParser SOM grounding for VLM-coordinate accuracy | (forward marker, planned Wave 178) | 4 |
| AD-745-4 | Pluggable grounding strategy (DOM / AX-tree / SOM / hybrid) mirroring AD-742d pattern | (forward marker, planned Wave 178) | 4 |
| AD-745-5 | Consensual profile-clone (agent acts in clone of Captain's logged-in profile for task duration) | (forward marker, planned Wave 178) | 4 |
| AD-745-6 | Multi-step action plans per DM turn (chained actions with declared plan + step-by-step ACK) | (forward marker, planned Wave 178) | 4 |
| AD-745-7 | Cross-thread action audit surface + SQLite persistence of pending actions across restart | (forward marker, planned Wave 178) | 4 |
| BF-317 | Share-screen button under-discoverable in DM thread — AD-744 button at WardRoomThreadDetail.tsx:~397 is a 14×14 stroke-SVG monitor glyph adjacent to the paperclip attach button with no visible text label; Captain didn't find it on first inspection. v1 = label + size + position differentiation; HXI Principle #1 ("system understands the human"); BF-317-1 forward marker for composer tool-palette. Backend untouched. Filed 2026-05-19 live-use triage. | #681 | 2 |
| AD-746 | Camera + screen source policy — fuse + per-agent binding. Layer 1: new VisionAggregator debounces incoming frames (default 800 ms window) and composes ONE multimodal LLM call when both camera + screen admit within the window (Pipecat VisionAggregator pattern, BSD-2-Clause). Layer 2: extend CrewProfile.perception.bound_sources: list[Literal["camera","screen"]] (default both, back-compat) — VisionConsumer._handle early branch filters frames against active agent's bound_sources before WM fan-out / episodic anchor. Counselor binds camera, Operations binds screen → per-agent WMs become source-coherent. New POST /api/perception/sources/binding + GET /api/perception/sources (mirror AD-742c camera-binding endpoints). HXI extends CAMERA BINDINGS section into SOURCE BINDINGS in PerceptionLivePanel.tsx. AD-731 preserved (refs only); AD-541b anchor extended with sources field. New PerceptionConfig.source_fusion_enabled=True + fusion_window_ms=800 (hot-reload). Forward markers: AD-746-1 (raw priority knob preempting fusion), AD-746-2 (cross-modal salience scorer), AD-746-3 (audio as third source), AD-746-4 (browser-side fusion preview). +14 pytest, +4 vitest. Filed 2026-05-19 live-use triage. | #682 | 2 |
| AD-746-1 | Raw priority knob — preempt fusion when one source dominates (Vapi-style interruption-sensitivity) | (forward marker, filed 2026-05-19) | 4 |
| AD-746-2 | Cross-modal salience scorer (CLIP-style: which source carries more novelty this tick) | (forward marker, filed 2026-05-19) | 4 |
| AD-746-3 | Audio as a third fused source (mic context join — pairs with AD-747) | (forward marker, filed 2026-05-19) | 4 |
| AD-746-4 | Browser-side fusion preview (Captain sees composed multimodal prompt before send) | (forward marker, filed 2026-05-19) | 4 |
| BF-318 | Mic singleton conflict — three independent acquisition paths fight over the browser audio device with no arbitration: press-to-talk (IntentSurface.tsx:2281 → speechInput.ts:startListening), wake-word continuous SR (wakeWord.ts:_startContinuousRecognition → same singleton), and VAD (voiceActivity.ts dedicated getUserMedia stream). Module-level activeRecognition in speechInput.ts means startListening() calls .abort() on whichever session is already active; race-window-dependent transcripts. Proposed fix: new ui/src/audio/speechRecognitionArbiter.ts with priority-queue leases (press_to_talk=100, conversation=75, wake_word=50); onAcquired/onPreempted callbacks let wake-word pause cleanly when press-to-talk preempts and resume on release. +4 vitest. Zero backend changes. Hard prerequisite for AD-747. Filed 2026-05-19 live-use triage. | #683 | 2 |
| AD-747 | Always-on natural conversation mode — ConversationController (LiveKit Agents VoicePipelineAgent pattern, Apache 2.0) owns the duplex when a DM is active. Subscribes to AD-733c-5 active-agent state; when armed, acquires the BF-318 conversation mic lease (priority 75); VAD-gated STT (AD-705a / AD-733c-7) transcripts auto-submit to the active agent's DM thread via existing agent_chat keyboard path; barge-in interrupts agent TTS via voice.ts:stopSpeaking() on user speech_start during playback (ChatGPT advanced voice mode UX); end-of-conversation via 30 s silence timer releases the lease (wake-word resumes). New CognitiveConfig.conversation_mode_enabled=False (transitional gate) + conversation_silence_timeout_ms=30000 + conversation_barge_in_enabled=True (all hot-reload). HXI active-conversation LIVE indicator next to per-agent perception badge; press-to-talk button preserved but visually demoted when conversation_mode_enabled=True. Forward markers: AD-747-1 (Hume EVI prosody-gated barge-in), AD-747-2 (cross-agent handoff mid-conversation), AD-747-3 (Vapi-style interruption sensitivity knob), AD-747-4 (goodbye-phrase classifier), AD-747-5 (server-side streaming STT — pairs with AD-705a-4), AD-747-6 (multi-Captain voice profile binding), AD-747-7 (audio fused into AD-746 vision context). Depends on BF-318. +14 vitest, +6 pytest. Zero new deps. Filed 2026-05-19 live-use triage. | #684 | 2 |
| AD-747-1 | Prosody-gated barge-in (Hume EVI pattern — reduces false interrupts from background noise) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-2 | Cross-agent conversation handoff (Captain pivots Counselor→Operations mid-session) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-3 | Operator-tunable interruption sensitivity (Vapi pattern: low/medium/high) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-4 | Goodbye-phrase classifier as natural end-of-session signal (telephony precedent) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-5 | Server-side streaming STT (pairs with AD-705a-4 streaming decode) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-6 | Multi-Captain voice profile binding (mic-side identity claim) | (forward marker, filed 2026-05-19) | 4 |
| AD-747-7 | Audio fused into AD-746 vision context (per-frame describes include transcript window) | (forward marker, filed 2026-05-19) | 4 |
| AD-748 | Wave-close CI hygiene checklist — scripts/wave-close-precheck.ps1 runs 5 checks before push: (1) npm install --package-lock-only --dry-run lock-file sync, (2) vi.mock parity for heavily-mocked TS modules (audio/voice.ts, audio/wakeWord.ts, audio/speechInput.ts, store/useStore.ts, api.ts), (3) full pytest wall <22min (10min headroom under 30min CI ceiling), (4) @pytest.mark.timeout(N) audit for N < global, (5) vitest run exit code 0 (catches unhandled-promise leaks invisible to pass/fail count). Builder protocol section added to BUILDER-EXECUTION-PLAN.md. Filed 2026-05-20 after 5-BF CI-rescue arc (BF-319→BF-323 in one night). +6 pytest. Zero new deps. Zero production code. | (filed Wave 180 close) | 2 |
| AD-748-1 | Pre-commit hook integration (.git/hooks/pre-commit runs precheck on trigger-file commits) | (forward marker, filed 2026-05-20) | 4 |
| AD-748-2 | GHA workflow runs same script as separate fast job (advances after v1 catches real issues for 2 waves) | (forward marker, filed 2026-05-20) | 4 |
| AD-748-3 | Auto-discover heavily-mocked modules (replace static list with dynamic vi.mock-reference scan) | (forward marker, filed 2026-05-20) | 4 |
| AD-742a | vision_fast LLM tier (per-frame supervisor describes) — SHIPPED Wave 174 (splits AD-732 vision tier; 5 new CognitiveConfig fields default None / honest-degrade; full eight-guard catalog audit (15 sites) — _LLM_TIERS extended, _TIER_ORDER unchanged (BF-269), 8 tier_config maps, is_vision_tier_configured branch, ModelRouter bypass (BF-273), vision_fast -> vision-only fallback (BF-269), health probe short-circuit, LLMResponseCache shape-based (BF-272 unchanged); 5 hardcoded tier tuples in __main__.py + commands_llm.py refactored to import _LLM_TIERS per AD-732 lesson #1; _cmd_doctor skips unconfigured optional tiers; new PerceptionConfig.vision_fast_tier; VisionConsumer._describe routes to vision_fast when configured, else vision; default model: moondream (Apache 2.0, operator-pullable); +13 pytest; 0 new deps; forward marker AD-742a-1 A/B moondream vs qwen2-vl:2b) | #669 | 2 |
| AD-742a-1 | A/B comparison study: moondream vs qwen2-vl:2b on Captain's actual feed | (forward marker, filed Wave 174) | 4 |
| AD-742b | Face-embedding identity recognition (replace v1 LLM 'is this the Captain?' prompt) — SHIPPED Wave 174 (new src/probos/perception/identity.py with IdentityResolver + _cosine_distance; facenet-pytorch MIT pip dep, MTCNN + InceptionResnetV1 vggface2 512-d embedding; 2 new PerceptionConfig fields identity_match_threshold=0.6 + identity_resolver_enabled=True; captain_avatar_ref deprecated; VisionConsumer._resolve_subject_identity uses resolver when enrolled, else falls back to AD-733b LLM-prompt path; startup/finalize.py constructs resolver; 3 new API endpoints under require_crew_scope; reference photo discarded after enrollment; data/captain_identity.json gitignored — AD-731 invariant preserved (local artifact, not on bus); +19 pytest with _compute_embedding mocked; AD-742b-1 + AD-742b-2 forward markers filed) | #670 | 2 |
| AD-742b-1 | Hot-reload identity_match_threshold via BF-308 setter | (forward marker, filed Wave 174) | 4 |
| AD-742b-2 | Multi-operator enrollment / UI surface for face enrollment | (forward marker, filed Wave 174) | 4 |
| AD-742e | Vision LLM call budget telemetry surfaced in HXI status badge — SHIPPED Wave 174 (VisionConsumer in-memory per-tier counters (vision / vision_fast) with session + UTC-date reset; _record_vision_call invoked from _describe after successful LLM complete; new get_budget_snapshot() method; new GET /api/perception/budget under require_crew_scope; new <VisionBudgetBadge /> HXI component polls 5s, hidden when total_session==0 (HXI #5 progressive disclosure), amber/dim-red/bright-red color scale; mounted after Entropy span in DecisionSurface.tsx; no emoji (HXI #3); +8 pytest, +6 vitest; UI bundle index-CfjDvOSd.js; 0 new deps; AD-742e-1 SQLite-persistence + AD-742e-2 operator-configurable ceiling forward markers filed) | #673 | 2 |
| AD-742e-1 | SQLite persistence for vision_call_log + daily roll-up across restart | (forward marker, filed Wave 174) | 4 |
| AD-742e-2 | Operator-configurable session ceiling (vs proactive_max_emissions*40 heuristic) | (forward marker, filed Wave 174) | 4 |
| AD-742f | Per-agent vision working memory persistence across restart — SHIPPED Wave 175 (new src/probos/perception/wm_store.py with WorkingMemoryStore stdlib-sqlite3 + module-level write lock; vision_observations schema, no BLOB columns per AD-731 invariant; VisionWorkingMemory.__init__ accepts optional store+agent_id kwargs and hydrates from disk + mirrors append writes; new PerceptionConfig.wm_persistence_enabled (default True, hot-reload) + matching FieldDescriptor; consumer.py factory set_working_memory_store setter + _WM_STORE global threaded through get_or_create_working_memory; startup/finalize.py constructs store BEFORE VisionConsumer so observer registration sees a wired store; honest-degrade to in-memory ring on store-unavailable; AD-541b chroma episodic_memory untouched; +10 pytest; AD-742f-1 (federation WM sync) + AD-742f-2 (TTL pruning) forward markers filed) | #674 | 2 |
| AD-742f-1 | Cross-host federation of WM rows | (forward marker, filed Wave 175) | 4 |
| AD-742f-2 | TTL-based pruning of old WM rows | (forward marker, filed Wave 175) | 4 |
| AD-742d | Pluggable VisionSupervisor strategies — SHIPPED Wave 175 (four new SupervisorStrategy implementations: MotionStrategy per-pixel L1 diff, SceneChangeStrategy HSV histogram chi-square, NeverDescribeStrategy, AlwaysAdmitStrategy; all implement BF-308 setters uniformly; new STRATEGY_REGISTRY + build_strategy() with honest-degrade-to-aHash on unknown; new PerceptionConfig.vision_supervisor_strategy field default "ahash" + Pydantic validator; non-hot-reload FieldDescriptor since strategy swap orphans baseline state; VisionConsumer.__init__ accepts supervisor_strategy_name; startup/finalize.py threaded through; shared _load_pil_image helper; +12 pytest with real PIL JPEG fixtures; AD-742d-1 (CLIP-embedding semantic strategy) + AD-742d-2 (per-session override) forward markers filed) | #672 | 2 |
| AD-742d-1 | CLIP-embedding semantic supervisor strategy | (forward marker, filed Wave 175) | 4 |
| AD-742d-2 | Per-session supervisor strategy override | (forward marker, filed Wave 175) | 4 |
| AD-733c-6 | Engaged-mode vision LLM call budget enforcement — SHIPPED Wave 175 (extends AD-742e telemetry into enforcement; new _maybe_enforce_budget synchronously transitions ENGAGED→AMBIENT with trigger="budget_exhausted" when total session or day calls >= cap; three new PerceptionConfig fields engaged_budget_enforcement=True + engaged_call_cap_per_session=200 + engaged_call_cap_per_day=2000, all hot-reload via BF-308; once-per-session WARNING (cleared on session change); get_budget_snapshot extended with cap_per_session+cap_per_day+enforcement_enabled+cap_reached_session+cap_reached_day (AD-742e session_ceiling_estimate retained for backcompat, now maps to cap_per_session); VisionBudgetBadge.tsx color scheme shifted to green/orange/red + dim override when enforcement_enabled=false; anti-deadlock confirmed (transition_to synchronous, safe inside _describe_lock); mode_controller.Transition.trigger docstring extended with budget_exhausted; +9 pytest +3 vitest (6 existing vitest updated); UI bundle index-DPxUl-OZ.js; AD-733c-6-1 daily-counter persistence + AD-733c-6-2 WardRoom notification forward markers filed) | #677 | 2 |
| AD-733c-6-1 | SQLite persistence of daily aggregate across restart for engaged budget | (forward marker, filed Wave 175) | 4 |
| AD-733c-6-2 | WardRoom notification on engaged-cap-hit | (forward marker, filed Wave 175) | 4 |
| AD-743 | Adaptive conversational pacing in active 1:1 DMs — SHIPPED Wave 176 (new [FOLLOW_UP delay reason] bracket marker, ConversationPacingScheduler runtime service, two-budget rate limit (AD-728c precedent), Captain-interruption cancels pending follow-up; default-off transitional gate AvatarsConfig.pacing_enabled=False; synthesized user-turn carries from: "pacing_scheduler" for AD-541b anchor distinguishability; new pipeline step named step_4d_follow_up_parse (between AD-730-3 image-gen and BF-296 DM-outbound steps) — name adjusted from prompt's step_5_* because step_5_episodic_store already occupied that slot; v1 ships agent-initiated multi-beat only (Captain-silence deferred to AD-743-1); +12 pytest) | #662 | 2 |
| AD-743-1 | Captain-silence "Still there?" trigger (idle-watcher loop) | (forward marker, planned Wave 176) | 4 |
| AD-743-2 | Same-tick multi-message split (delay=0 chunked rendering) | (forward marker, planned Wave 176) | 4 |
| AD-743-3 | Correction-driven per-conversation budget reset | (forward marker, planned Wave 176) | 4 |
| AD-733c-5 | Per-agent perception engagement — SHIPPED Wave 176 (promotes PerceptionModeController from singleton to per-agent instances via PerceptionEngagementRegistry; new PerceptionProfile block on CrewProfile shared with AD-742c; per-agent finalize loop runs AFTER singleton wiring; runtime.perception_mode_controller repointed to primary controller (Counselor e1 preference, else first registered); agent_chat note_dm_activity routes through registry; GET /api/perception/mode extended with per_agent dict; POST /api/perception/engage accepts optional agent (agent_id or callsign) with 404 honest-degrade; UI per-agent badges deferred to AD-733c-5-4; per-agent budget enforcement deferred to AD-733c-5-5; +11 pytest) | #676 | 2 |
| AD-733c-5-4 | HXI per-agent perception badges (CameraLiveIndicator + usePerceptionEngagementStore + PerceptionLivePanel table) — SHIPPED Wave 177 (extends usePerceptionModeStore with perAgent: Record<string, PerceptionMode> from GET /api/perception/mode per_agent field shipped by AD-733c-5; CameraLiveIndicator per-agent compact AGENT:MOD badges when ≥ 2 entries, single-badge fallback otherwise; PerceptionLivePanel per-agent MODE table beneath preset buttons; +3 vitest) | (forward marker, filed Wave 176) | 4 |
| AD-733c-5-5 | Per-agent vision LLM budget enforcement (thread agent_id through VisionConsumer._maybe_enforce_budget + ProactiveVisionObserver registry lookup) | (forward marker, filed Wave 176) | 4 |
| AD-733c-5-1 | HXI editor for PerceptionProfile | (forward marker, planned Wave 176) | 4 |
| AD-733c-5-2 | Hot-reload of engagement_enabled toggle | (forward marker, planned Wave 176) | 4 |
| AD-733c-5-3 | Federation cross-host engagement sync | (forward marker, planned Wave 176) | 4 |
| AD-733c-7 | Silero VAD secondary engagement trigger — SHIPPED Wave 176 (backend only: new note_voice_activity() hook on PerceptionModeController; VOICE_ACTIVITY_COOLDOWN_S=3.0 between PROGRAMMATIC and WAKE_WORD; new POST /api/perception/voice-activity endpoint routing through AD-733c-5 registry; default-off vad_engagement_enabled=False; vad_min_speech_duration_ms=400; new scripts/silero-vad-fetch.ps1 operator-pullable model download; +1 THIRD_PARTY_LICENSES.md entry (MIT); browser-side silero-vad.ts / voiceActivity.ts + HXI integration deferred to AD-733c-7-5; +10 pytest) | #678 | 2 |
| AD-733c-7-5 | HXI VAD integration (browser-side silero-vad.ts + voiceActivity.ts + CameraLiveIndicator SPEECH indicator) — SHIPPED Wave 177 (lazy-loads onnxruntime-web via indirect-string-variable pattern from wakeWord.ts:268-289; new getUserMedia({audio:true}) stream — no shared mic-tap exists; 30ms / 16 kHz chunked detection with vad_min_speech_duration_ms debounce; SPEECH badge flashes 1.5s on event then dims; App.tsx lifecycle hook arms/disarms in sync with snapshot.config.perception.vad_engagement_enabled; privacy invariant: audio bytes NEVER leave browser, POST body only {agent?, source}; +5 vitest) | (forward marker, filed Wave 176) | 4 |
| AD-733c-7-1 | Browser pause getUserMedia in DORMANT (BroadcastChannel signal) | (forward marker, planned Wave 176) | 4 |
| AD-733c-7-2 | Multi-mic disambiguation | (forward marker, planned Wave 176) | 4 |
| AD-733c-7-3 | Speaker diarization | (forward marker, planned Wave 176) | 4 |
| AD-733c-7-4 | VAD-driven wake-word mute (CPU savings) | (forward marker, planned Wave 176) | 4 |
| AD-742c | Per-agent camera selection — SHIPPED Wave 176 (backend: wires CrewProfile.perception.camera_device_id (from AD-733c-5) into capture path; new agent_ids form field on /camera/frame threads IntentMessage.params["bound_agent_ids"]; VisionConsumer._handle early branch restricts WM fan-out + AD-541b BF-311 anchor agent_ids_json to bound set; new GET /api/perception/cameras + POST /api/perception/cameras/binding; AD-731 invariant preserved with regression source-scan; HXI multiplexer deferred to AD-742c-6; +10 pytest) | #671 | 2 |
| AD-742c-6 | HXI camera multiplexer integration (CameraMultiplexer Zustand + useCameraStream multi-deviceId + PerceptionLivePanel CAMERA BINDINGS table) — SHIPPED Wave 177 (NEW sibling Zustand slice useCameraMultiplexerStore reading GET /api/perception/cameras + enumerateDevices; useCameraStream EXTENDED with optional deviceId kwarg + Map<deviceId, MediaStream> alongside legacy _stream + _activeDeviceId-aware _computeAgentIds(); agent_ids form field threaded when bindings present; CAMERA BINDINGS section collapsible per HXI #5; CAMS:N label on indicator when ≥ 2 devices bound; AD-731 invariant preserved — agent_ids is string list never bytes; +6 vitest) | (forward marker, filed Wave 176) | 4 |
| AD-742c-1 | Screen capture binding per agent | (forward marker, planned Wave 176) | 4 |
| AD-742c-2 | Federation cross-host camera sync | (forward marker, planned Wave 176) | 4 |
| AD-742c-3 | IP camera RTSP ingestion | (forward marker, planned Wave 176) | 4 |
| AD-742c-4 | Audio device per-agent binding | (forward marker, planned Wave 176) | 4 |
| AD-742c-5 | Per-agent camera permissions | (forward marker, planned Wave 176) | 4 |
| AD-741-1 | Per-field hot-reload paths for safe fields (e.g. system.log_level via logging.getLogger().setLevel) — restart-required removed per field | (forward marker, filed Wave 170) | 4 |
| AD-741-2 | Structured editors for collection-shaped fields (mcp.servers, federation.peers, etc.) — add/remove from the panel | (forward marker, filed Wave 170) | 4 |
| AD-741-3 | YAML diff preview before APPLY — operator sees exactly what will change | (forward marker, filed Wave 170) | 4 |
| AD-741-4 | Restart-in-place modal — guided "saved + restarting now" flow that reconnects when the runtime comes back up | (forward marker, filed Wave 170) | 4 |
| AD-741-5 | Multi-Captain auth + audit log of who-changed-what (config change ledger) | (forward marker, filed Wave 170) | 4 |
| AD-741-6 | Raw YAML editor mode — editable textarea + Pydantic validate-on-save (POST /api/config/yaml) | (forward marker, filed Wave 170) | 4 |
| AD-741-7 | Per-agent settings deep-link from Settings → Crew Roster → Agent Profile via location.hash glue | (forward marker, filed Wave 170) | 4 |
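AD-741-1 in the table above names logging.getLogger().setLevel as the hot-reload path for system.log_level. The sketch below illustrates that idea only, assuming the new value arrives as a level-name string. The helper apply_log_level and the _LEVELS table are hypothetical names used for illustration and are not ProbOS APIs.

```python
import logging

# Hypothetical lookup table: level names an operator might submit.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def apply_log_level(level_name: str) -> int:
    """Set the root logger level at runtime and return the numeric level.

    Hypothetical helper: rejects unknown names instead of silently
    keeping the old level, so a bad config value fails fast.
    """
    key = level_name.strip().upper()
    if key not in _LEVELS:
        raise ValueError(f"unknown log level: {level_name!r}")
    level = _LEVELS[key]
    # The stdlib call named in AD-741-1; takes effect without a restart.
    logging.getLogger().setLevel(level)
    return level
```

Because the change is applied to the root logger, child loggers that have no explicit level of their own pick it up on their next log call, which is what makes this field a plausible candidate for restart-free editing.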
Wave 154 — DM hardening + multimodal small wins + HXI polish:
| AD | Title | Issue | Priority |
|---|---|---|---|
| AD-719c | @-picker keyboard navigation (↑/↓ cycle, Tab confirms, scroll-into-view) — SHIPPED Wave 154 (+4 Vitest tests) | #548 | 3 |
| AD-718d-1 | Voice modulation activity indicator (ModulationIndicator SVG dim-pulse next to per-agent Speak toggle) — SHIPPED Wave 154 (+2 Vitest tests) | #553 | 3 |
| AD-730-1-1 | WardRoomThreadDetail drag/drop + paste image — SHIPPED Wave 154 (+3 Vitest tests; #647 closed as duplicate pre-flight) | #646 | 3 |
| AD-720d-1 | Multi-image batch send + per-attachment timing in episode outcomes + multi_image_warn_threshold soft warning — SHIPPED Wave 154 (+5 pytest tests; 3 production destructure sites + 4 test destructure sites updated) | #563 | 3 |
| AD-720e | Audio attachment playback (mpeg/mp4/ogg + existing webm/wav) — SHIPPED Wave 159 (+5 pytest + 3 Vitest tests; allow-list + magic-byte signatures + <audio controls> render in IntentSurface; WardRoom paste accepts audio chip-only; AD-731 SHA-ref invariant preserved; AD-705a forward marker for transcription; AD-720e-1/-2/-3 forward markers filed) | #566 | 3 |
| AD-738e-2 | Refs-trailer standing rule for orphan sub-ADs — SHIPPED Wave 159 (BUILDER-EXECUTION-PLAN standing rule; DECISIONS AD-738e-1's prosody forward marker renumbered to AD-738e-2-prosody to free the AD-738e-2 slot for #653) | #653 | 3 |
| AD-725 | Targeted sub-intent dispatch on DM one-shot path — SHIPPED Wave 159 (LookupDispatcher + SubintentClassifier Protocol + regex v1 ladder; DmTargetedLookupConfig default OFF; 4 firewall contracts: one-lookup-per-turn, read-only, hard timeout, no intent_bus; prepends recall block into message_text in agent_chat; AD-725-1/-2/-3/-4/-5/-6 forward markers filed) | #583 | 3 |
| AD-724-1 | DM sanity gate one-shot retry on rejection — SHIPPED Wave 154 (+ shared with -2/-5 boundary tests) | #627 | 3 |
| AD-724-2 | DM repetition similarity beyond exact-prefix (stdlib difflib.SequenceMatcher) — SHIPPED Wave 154 | #628 | 3 |
| AD-724-5 | DM sanity gate lifted into WR/chain reply paths via shared apply_dm_sanity helper — SHIPPED Wave 154 (+12 pytest tests across -1/-2/-5) | #629 | 3 |
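AD-724-2 in the table above describes repetition detection that goes beyond exact-prefix matching by using the stdlib difflib.SequenceMatcher. The sketch below shows one way such a check could look. The function name is_repetition, the whitespace and case normalization, and the 0.9 threshold are all assumptions made for illustration; the roadmap does not state the values or signatures ProbOS actually uses.

```python
from difflib import SequenceMatcher

# Hypothetical threshold; the roadmap does not say what ProbOS uses.
SIMILARITY_THRESHOLD = 0.9


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits do not hide a repeat."""
    return " ".join(text.lower().split())


def is_repetition(
    candidate: str,
    recent_replies: list[str],
    threshold: float = SIMILARITY_THRESHOLD,
) -> bool:
    """Return True if candidate is near-identical to any recent reply.

    Uses SequenceMatcher.ratio(), which scores 2*M/T where M is the number
    of matching characters and T is the combined length of both strings.
    """
    normalized = _normalize(candidate)
    for previous in recent_replies:
        ratio = SequenceMatcher(None, normalized, _normalize(previous)).ratio()
        if ratio >= threshold:
            return True
    return False
```

A reply that differs from an earlier one only in its final punctuation scores about 0.96 and is flagged, while an unrelated reply stays well under the threshold. An exact-prefix comparison would miss an edit made at the start of a reply, which a ratio-based score still catches when the rest of the text matches closely enough.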
Forcing-function deferrals¶
(Forward dependencies satisfied; build when need surfaces.)
| AD | Forcing function |
|---|---|
| AD-574c-ii | DM conversation convergence (full ProfileChatTab refactor — substrate ready) |
| AD-641g-1-1 | Flip SubTaskExecutor to await ANALYZE results from NATS subjects |
Commercial-tagged items¶
(Live in private commercial repo; OSS surface is the extension point only.)
| AD | Title |
|---|---|
| AD-693 | Federation Knowledge Sync (cross-instance edge synchronization) |
| AD-694 | Kùzu Migration (graph-DB upgrade path for knowledge_edges) |
Bug Tracker¶
OSS bugs are tracked as GitHub issues with the bug label:
https://github.com/seangalliher/ProbOS/issues?q=is%3Aopen+label%3Abug.
Closed bug-fix history (BF-001 through BF-247) is preserved in roadmap-era-5-completed.md.
Era History¶
- Era I — Genesis: bootstrap, agent registry, intent bus, episodic memory (progress-era-1-genesis.md)
- Era II — Emergence: Hebbian routing, dreaming, self-modification, federation transport (progress-era-2-emergence.md)
- Era III — Product: HXI, Ward Room, Captain identity, alert conditions (progress-era-3-product.md)
- Era IV — Evolution: ontology, billet registry, qualification programs, naval discipline (progress-era-4-evolution.md)
- Era V — Unification: Oracle absorption, knowledge graph, brain-enhancement (AD-641 family), commercial-overlay seam (progress-era-5-unification.md)