Contributing¶
ProbOS is open source under the Apache License 2.0. Contributions are welcome!
Getting Started¶
# Clone and install
git clone https://github.com/seangalliher/ProbOS.git
cd ProbOS
uv sync
# Run the test suite
uv run pytest tests/ -v
Development Workflow¶
- Fork the repository
- Create a feature branch (git checkout -b feature/my-feature)
- Make your changes
- Run the test suite — all tests must pass
- Submit a pull request
Code Style¶
- Python 3.12+ with type annotations
- Async-first: most interfaces are async
- Pydantic for configuration and data models
- pytest + pytest-asyncio for testing
- encoding="utf-8" on all open() calls
- asyncio.create_task() over asyncio.ensure_future()
Engineering Principles¶
All contributions must adhere to the ProbOS Principles Stack. Pull requests that introduce violations will be flagged during review.
Structure — SOLID¶
| Principle | Rule | ProbOS Example |
|---|---|---|
| Single Responsibility | One reason to change per class. No god objects. | New services get their own module — don't add methods to runtime.py. |
| Open/Closed | Extend via public APIs, not private member patching. | Never obj._private_attr = value. Define a public setter or constructor parameter. |
| Liskov Substitution | Subtypes must honor base contracts. | Any CognitiveAgent subclass must work wherever the base is expected. |
| Interface Segregation | Depend on narrow typing.Protocol interfaces, not entire classes. | An agent needing episodic memory depends on EpisodicMemoryProtocol, not all of ProbOSRuntime. |
| Dependency Inversion | Depend on abstractions, inject via constructor. | Services receive dependencies at construction — never reach into a runtime god object. |
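The Interface Segregation and Dependency Inversion rows can be sketched together. This is a minimal illustration, not the real ProbOS API: the recall() method, InMemoryEpisodicMemory, and RecallAgent are hypothetical names chosen for the example; only EpisodicMemoryProtocol is named in the table above, and its actual shape may differ.

```python
from typing import Protocol


class EpisodicMemoryProtocol(Protocol):
    """Narrow interface: only what the agent needs (recall() is a hypothetical method)."""

    def recall(self, query: str) -> list[str]: ...


class InMemoryEpisodicMemory:
    """Toy backend that satisfies the protocol structurally, without inheriting from it."""

    def __init__(self) -> None:
        self._episodes: list[str] = []

    def store(self, episode: str) -> None:
        self._episodes.append(episode)

    def recall(self, query: str) -> list[str]:
        return [episode for episode in self._episodes if query in episode]


class RecallAgent:
    """Receives its dependency at construction instead of reaching into a runtime object."""

    def __init__(self, memory: EpisodicMemoryProtocol) -> None:
        self._memory = memory

    def answer(self, query: str) -> str:
        hits = self._memory.recall(query)
        return hits[0] if hits else "no matching episode"
```

Because the agent depends only on the protocol, a test can pass in a small _Fake* stub and production can pass the real memory service without changing RecallAgent.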
Communication — Law of Demeter¶
A method should only call methods on: (a) itself, (b) its parameters, (c) objects it creates, (d) its direct dependencies. Never chain through objects: self.thing._internal_thing.do_stuff().
If two services need to be wired together, define a public API on the target service.
Reliability — Fail Fast¶
Errors should be detected and reported as close to their origin as possible. Three tiers:
| Tier | When | Pattern |
|---|---|---|
| Swallow | Truly non-critical: shutdown cleanup, telemetry, rebuildable indexes | except Exception: pass |
| Log-and-degrade | System continues but capability is reduced | except Exception: logger.warning("...", exc_info=True) |
| Propagate | Caller must know: security boundaries, data integrity | raise or re-raise |
Default to log-and-degrade. Every except Exception: pass must be justified.
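The three tiers side by side, as a rough sketch. The function names and the scenarios (telemetry flush, index build, signature check) are assumptions made up for the example, not ProbOS code.

```python
import logging
from collections.abc import Callable

logger = logging.getLogger("probos.example")


def flush_telemetry(send: Callable[[], object]) -> None:
    """Swallow tier: telemetry is non-critical."""
    try:
        send()
    except Exception:
        pass  # Justified: losing a telemetry batch must never affect the caller.


def load_index(build: Callable[[], list[str]], fallback: list[str]) -> list[str]:
    """Log-and-degrade tier: keep running with reduced capability."""
    try:
        return build()
    except Exception:
        logger.warning(
            "Index build failed; falling back to a stale index, results may be incomplete",
            exc_info=True,
        )
        return fallback


def verify_signature(check: Callable[[], object]) -> None:
    """Propagate tier: a security boundary, so the caller must know."""
    try:
        check()
    except ValueError:
        logger.error("Signature check failed; rejecting request", exc_info=True)
        raise
```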
Security — Defense in Depth¶
- Validate at every boundary, not just the edge
- Input sanitization at the API layer and the service layer
- Database constraints enforced by the engine (PRAGMA foreign_keys = ON), not just application code
- File path operations must sanitize against traversal (no ../ escape from data directories)
- Never assume the caller already checked
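One way to enforce the traversal rule is to resolve the requested path and confirm it still sits under the data directory. resolve_inside() is a hypothetical helper name; the sketch assumes Python 3.9+ for Path.is_relative_to().

```python
from pathlib import Path


def resolve_inside(data_dir: Path, user_path: str) -> Path:
    """Return the resolved path, or raise if it escapes data_dir."""
    base = data_dir.resolve()
    # Joining an absolute user_path discards base, so the check below rejects it too.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"Path {user_path!r} escapes data directory {base}")
    return candidate
```

Calling this in the service layer as well as the API layer follows the "never assume the caller already checked" rule.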
Efficiency — DRY¶
- Search for existing implementations before writing new ones
- If the same logic exists in 2+ places, extract to a shared utility (src/probos/utils/)
- This applies to patterns too: if 6 SQLite modules do the same migration dance, that's a shared helper
Cloud-Ready Storage¶
New database modules must use an abstract connection interface rather than calling aiosqlite.connect() directly. This enables the commercial overlay to swap storage backends (SQLite → Postgres) without modifying business logic.
- OSS: SQLite implementation (embedded, zero config, single-ship)
- Commercial: Managed database services for multi-tenant cloud deployment
This principle ensures the OSS core remains deployable as a standalone application while supporting cloud-native scaling in the commercial product.
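A rough sketch of what such an abstraction could look like. DatabaseConnection, SqliteConnection, and record_and_count() are hypothetical names; the actual ProbOS interface may differ. To keep the example dependency-free, the backend here uses the stdlib sqlite3 module as a stand-in, where a real OSS backend would wrap aiosqlite so calls do not block the event loop.

```python
import asyncio
import sqlite3
from collections.abc import Sequence
from typing import Any, Protocol


class DatabaseConnection(Protocol):
    """Hypothetical abstraction: business logic depends only on this."""

    async def execute(self, sql: str, params: Sequence[Any] = ()) -> None: ...

    async def fetch_all(self, sql: str, params: Sequence[Any] = ()) -> list[tuple[Any, ...]]: ...


class SqliteConnection:
    """Stand-in backend (stdlib sqlite3, blocking); swap for aiosqlite or Postgres."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)

    async def execute(self, sql: str, params: Sequence[Any] = ()) -> None:
        self._db.execute(sql, params)
        self._db.commit()

    async def fetch_all(self, sql: str, params: Sequence[Any] = ()) -> list[tuple[Any, ...]]:
        return self._db.execute(sql, params).fetchall()


async def record_and_count(conn: DatabaseConnection, name: str) -> int:
    """Business logic: no mention of any concrete driver."""
    await conn.execute("CREATE TABLE IF NOT EXISTS agents (name TEXT)")
    await conn.execute("INSERT INTO agents (name) VALUES (?)", (name,))
    rows = await conn.fetch_all("SELECT COUNT(*) FROM agents")
    return rows[0][0]
```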
Testing Standards¶
- Framework: pytest + pytest-asyncio. Prefer _Fake* stub classes over complex mock chains.
- Coverage: All new public methods and branches must have tests. Target 100% coverage on new code.
- Test structure: Arrange-Act-Assert. One behavior per test. Name tests: test_{method}_{scenario}_{expected}.
- Boundary testing: Every public method needs at minimum: happy path, error/edge case, and empty/None input where applicable.
- Isolation: Tests must not depend on execution order. No shared mutable state. Each test creates its own fixtures.
- Cleanup: Tests must clean up resources (temp files, tasks, DB entries). Use tmp_path, try/finally, or context managers.
- API endpoints: Minimum 3 tests per endpoint: happy path, error case, input validation.
- UI changes: Every TypeScript/React change requires a Vitest component test.
- Mock discipline: Always use MagicMock(spec=RealClass) or AsyncMock(spec=RealClass) when mocking system objects (runtime, services, registries). Unspec'd mocks silently invent attributes on access, so a test passes even when the code references attributes that don't exist on the real object. This is the #1 cause of refactoring bugs surviving the test suite. Shared mock factories (e.g., _make_mock_runtime()) must use spec= on the top-level object.
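The difference between spec'd and unspec'd mocks is easy to demonstrate. Runtime and its agent_count() method are hypothetical stand-ins for a real system object; the misspelled agnet_count() call is deliberate.

```python
from unittest.mock import MagicMock


class Runtime:
    """Hypothetical stand-in for a real system object."""

    def agent_count(self) -> int:
        return 0


def typo_is_caught(mock: MagicMock) -> bool:
    """Call a misspelled method and report whether the mock rejected it."""
    try:
        mock.agnet_count()  # deliberate typo
    except AttributeError:
        return True
    return False


loose = MagicMock()               # invents agnet_count on access
strict = MagicMock(spec=Runtime)  # only exposes attributes Runtime really has
strict.agent_count.return_value = 3
```

With the unspec'd mock the typo goes unnoticed; with spec=Runtime it raises AttributeError, which is what lets the test suite catch a renamed or removed attribute.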
Type Annotation Standards¶
- All public methods and properties must have full type annotations (parameters + return type).
- When a class implements a typing.Protocol, its method signatures must match exactly.
- Use modern syntax: X | None over Optional[X], list[str] over List[str] (Python 3.10+).
- Internal _private methods: annotations recommended but not required.
Logging Standards¶
Every log message must include what failed, why it matters, and what happens next.
# Bad
logger.warning("error")
# Good
logger.warning("Template %s not found in spawner; falling back to default", template_name)
| Level | When |
|---|---|
| debug | Internal state useful during development only |
| info | System lifecycle events (startup, shutdown, agent spawned) |
| warning | Degraded operation: failed but compensated |
| error | Operation failed, user impact, requires investigation |
| exception | Same as error but with traceback; use inside except blocks |
No bare print() for operational output — use logger. No sensitive data (API keys, tokens) in logs.
Exception handler logging must be warning or higher — never debug. A caught exception that degrades functionality is operationally significant. Using debug makes failures invisible in production and can hide bugs for months (see BF-078).
Async Discipline¶
- Use asyncio.get_running_loop(), never get_event_loop().
- Use asyncio.create_task(), never asyncio.ensure_future().
- Always hold a reference to created tasks. The event loop keeps only weak references, so a fire-and-forget task can be garbage-collected mid-flight and its exception goes unnoticed.
- Long-running async methods must catch asyncio.CancelledError, clean up, and re-raise.
- Use async with for resources that need async teardown. start() implies stop().
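A minimal sketch that combines these rules in one place. Heartbeat is a hypothetical service invented for the example; the intervals are arbitrary.

```python
import asyncio


class Heartbeat:
    """Hypothetical service: start() implies stop(), usable via async with."""

    def __init__(self, interval: float = 0.01) -> None:
        self._interval = interval
        self._task: asyncio.Task[None] | None = None
        self.beats = 0
        self.cleaned_up = False

    async def start(self) -> None:
        # Hold a reference so the task cannot be garbage-collected mid-flight.
        self._task = asyncio.create_task(self._run())

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass  # expected: we requested the cancellation ourselves
        self._task = None

    async def _run(self) -> None:
        try:
            while True:
                self.beats += 1
                await asyncio.sleep(self._interval)
        except asyncio.CancelledError:
            self.cleaned_up = True  # release resources here
            raise  # always re-raise so the task is marked cancelled

    async def __aenter__(self) -> "Heartbeat":
        await self.start()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.stop()


async def demo() -> Heartbeat:
    async with Heartbeat() as hb:
        await asyncio.sleep(0.05)
    return hb
```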
Import & Module Standards¶
- Lower layers must not import from higher layers.
- Use TYPE_CHECKING guard for type-only imports that would create cycles.
- Import order: stdlib → third-party → local, separated by blank lines.
- Never use from module import *.
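A TYPE_CHECKING guard looks roughly like this. The module path probos.runtime is an assumption for illustration; the point is that the import runs only under a type checker, so it cannot create a runtime import cycle.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Hypothetical module path; evaluated by type checkers only, never at runtime.
    from probos.runtime import ProbOSRuntime


def describe(runtime: "ProbOSRuntime") -> str:
    """The quoted annotation is not evaluated at runtime, so no import is needed."""
    return f"runtime: {type(runtime).__name__}"
```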
Configuration Standards¶
- New config must use Pydantic models in config.py. No raw dicts or ad-hoc env var parsing.
- Every config field must have a sensible default (zero-config startup).
- Use Pydantic validators — invalid config should fail at startup, not at runtime.
Refactoring Safety¶
When renaming, moving, or removing attributes during decomposition:
- Search all access patterns, not just self._attr. When moving self._agents from runtime to registry, search \._agents\b to catch rt._agents, runtime._agents, self._runtime._agents, etc. A self._attr grep misses external callers using different variable names.
- Run affected tests with spec= mocks before and after the refactoring. If a test uses unspec'd MagicMock(), it will pass regardless of whether the attribute exists.
- Check exception handlers along the affected code paths. Swallowed exceptions can hide broken references indefinitely. Grep for except.*Exception near the changed code.
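The difference between the two search patterns can be checked on a few sample lines. The lines below are made up for the example; only the \._agents\b pattern comes from the guidance above.

```python
import re

BROAD = re.compile(r"\._agents\b")        # any object's ._agents
NARROW = re.compile(r"self\._agents\b")   # only self._agents

sample_lines = [
    "self._agents[agent_id] = agent",
    "count = len(rt._agents)",
    "for a in self._runtime._agents.values():",
    "self._agents_by_role = {}",  # different attribute; \b keeps it out of both searches
]

broad_hits = [line for line in sample_lines if BROAD.search(line)]
narrow_hits = [line for line in sample_lines if NARROW.search(line)]
```

The narrow pattern finds one of the three real accesses; the broad one finds all three while still skipping _agents_by_role.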
Architecture Guidelines¶
- Three capability tiers: Agents, Tools, Skills. Agents are the unit of behavior (crew members who think and decide). Tools are the unit of action (instruments like tricorders — typed callables shared across agents). Skills are the unit of knowledge (data access attached to agents). Rule of thumb: if someone would ask for it, it's an agent. If it performs a specific action any agent might need, it's a tool. If an agent needs reference data to do its job, it's a skill.
- Self-describing agents. Every agent declares IntentDescriptor metadata so the system discovers it automatically.
- Consensus for side effects. Any operation that modifies external state must go through the consensus layer.
- Test everything. Each layer has comprehensive tests. New code should maintain coverage.
Dependencies¶
| Package | Purpose |
|---|---|
| pydantic >=2.0 | Configuration validation |
| pyyaml >=6.0 | YAML config loading |
| aiosqlite >=0.19 | Async SQLite |
| rich >=13.0 | Terminal UI |
| httpx >=0.27 | HTTP client |
| pyzmq >=27.1 | ZeroMQ transport |
| chromadb >=1.0 | Vector database |
| fastapi >=0.115 | API server |
| uvicorn >=0.34 | ASGI server |
Dev: pytest >=8.0, pytest-asyncio >=0.23, vitest (UI)
Windows Development¶
ProbOS is developed primarily on Windows. A few environment setup steps are needed:
Prerequisites¶
- Python 3.12+ via uv (recommended) or standalone install
- Git for Windows: ensure git is on your system PATH (where git should resolve)
- Node.js 18+: for the HXI/UI layer (ui/)
Shell Setup¶
uv installs to ~/.local/bin which isn't on the default PATH for either shell. Add it to your profile:
PowerShell (recommended) — create/edit $PROFILE:
Git Bash — create/edit ~/.bashrc:
Open a new terminal after saving for changes to take effect.
Known Platform Notes¶
| Issue | Cause | Workaround |
|---|---|---|
| uv: command not found in bash | uv installs to ~/.local/bin which isn't on Git Bash's default PATH | Add to ~/.bashrc (see above) |
| asyncio.to_thread() hangs during shutdown | Windows SelectorEventLoop (required for pyzmq) has limited threading support | BF-011/BF-012: async polling replaces to_thread in shutdown paths |
| echo not found in subprocess tests | echo is a CMD builtin, not an executable on Windows | Tests mock subprocess calls (AD-404) |
| pip not found | uv manages its own Python; system pip may not exist | Use uv run pip or .venv/Scripts/pip3.exe |
Event Loop¶
ProbOS uses WindowsSelectorEventLoopPolicy on Windows because pyzmq's add_reader() requires it (ProactorEventLoop doesn't support this). This means asyncio.subprocess and asyncio.to_thread() have limitations. Production code should use threading.Thread + asyncio.sleep() polling for subprocess operations that may block.
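The thread-plus-polling pattern described above could be sketched as follows. run_blocking() is a hypothetical helper, not an existing ProbOS function, and the poll interval is an arbitrary choice.

```python
import asyncio
import subprocess
import sys
import threading


async def run_blocking(
    cmd: list[str], poll_interval: float = 0.05
) -> subprocess.CompletedProcess[str]:
    """Run a blocking subprocess in a plain thread and poll for completion."""
    results: list[subprocess.CompletedProcess[str]] = []
    errors: list[BaseException] = []

    def worker() -> None:
        try:
            results.append(
                subprocess.run(cmd, capture_output=True, text=True, encoding="utf-8")
            )
        except Exception as exc:  # hand the failure back to the event-loop side
            errors.append(exc)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        await asyncio.sleep(poll_interval)  # yields to the loop instead of blocking it
    if errors:
        raise errors[0]
    return results[0]
```

Usage: `await run_blocking([sys.executable, "-c", "print('ok')"])` returns the completed process once the worker thread finishes, without relying on asyncio.subprocess or asyncio.to_thread().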