Policy OS Applied to Development Infrastructure
Applying the Three-Tier Framework and Policy OS to the development workflow itself, demonstrating that agent governance emerges as a structural property at every scale.
Abstract
Policy OS — CREATE SOMETHING's governed execution platform — was designed for client MCP deployments. This paper documents applying the same product to our own development workflow via the Pi coding agent harness, demonstrating that agent governance is not an add-on but a structural property that emerges from the Three-Tier Framework at every scale.
1. Introduction
Agent coding harnesses — Pi, Claude Code, Codex, Cursor — share a fundamental problem: they are general-purpose tools operating in domain-specific environments. An agent writing SvelteKit components needs to know about Canon design tokens. An agent deploying MCP servers needs to know about the fleet registry. An agent closing a Linear issue needs to know about the evidence contract.
This knowledge traditionally lives in documentation that agents may or may not read. The Policy OS approach makes it structural: quality gates enforce compliance automatically, custom tools make verification easy, and domain skills make knowledge loadable on demand.
2. The Three-Tier Mapping
The development harness maps cleanly to the Three-Tier Framework:
| Tier | Framework Role | Development Implementation |
|---|---|---|
| Database | What exists | Git state, package exports, fleet registry, Canon tokens |
| Automation | What happens | Quality gates, custom tools, interactive commands |
| Judgment | What should happen | Bash guard, pre-completion checks, evidence requirements |
Control Models Verified
- Application-controlled (Database): Session context is injected via `before_agent_start`; the extension decides what state the agent sees.
- Model-controlled (Automation): Custom tools like `context7_query` are available, but the agent decides when to call them.
- User-controlled (Judgment): Skills are loaded via `/skill:name`, an explicit selection of guidance.
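As a rough illustration of the mapping above, the three control models can be written down as plain data. This is only a sketch: the `Surface` type and `controllerFor` helper are hypothetical names introduced here, not part of the Pi extension API; only the three entry names come from the list above.

```typescript
// Hypothetical types for illustration; not the Pi extension API.
type Tier = "Database" | "Automation" | "Judgment";
type Controller = "application" | "model" | "user";

interface Surface {
  name: string;           // hook, tool, or command named in the text
  tier: Tier;             // Three-Tier Framework role
  controller: Controller; // who decides when it is used
}

const surfaces: Surface[] = [
  { name: "before_agent_start", tier: "Database", controller: "application" },
  { name: "context7_query", tier: "Automation", controller: "model" },
  { name: "/skill:name", tier: "Judgment", controller: "user" },
];

// Look up which party controls a given tier in this harness.
function controllerFor(tier: Tier): Controller | undefined {
  for (const surface of surfaces) {
    if (surface.tier === tier) return surface.controller;
  }
  return undefined;
}
```

Keeping the mapping as data makes it easy to check that every tier has exactly one deciding party.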
3. Implementation
Scale
| Component | Count | Role |
|---|---|---|
| Event handlers | 8 | Quality gates, context injection, bash guard, lifecycle |
| Custom tools | 3 | Context7 bridge (×2), package export verifier |
| Commands | 8 | Linear workflow, testing, fleet ops, Canon audit, pre-commit |
| Prompt templates | 8 | Deploy, audit, review, research, experiment, paper, MCP scaffold |
| Skills | 21 | 3 native + 18 cross-loaded — domain knowledge on demand |
| Theme | 51 colors | Glass Design System alignment |
| Total extension | 1,181 lines | Single coherent extension file |
Quality Gate Architecture
Write/Edit
│
├─► tool_result handler
│ ├── Canon token compliance (6 pattern checks)
│ ├── Import verification (@create-something/* packages)
│ ├── Paper structure (SEO, container, classes)
│ └── Experiment structure (SEO, Canon tokens, hex colors)
│
└── Violations? → Append to tool result → Agent self-corrects
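The Write/Edit gate above can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `ToolResult` shape, the `onToolResult` function, and the single hex-color rule are all hypothetical stand-ins; the real extension's handler signature and its six Canon pattern checks are not shown in this paper.

```typescript
// Hypothetical shapes for illustration; not the Pi extension API.
interface ToolResult {
  toolName: string; // assumed values: "write" or "edit"
  filePath: string;
  content: string;  // file content after the write/edit
  output: string;   // text returned to the agent
}

interface Rule {
  id: string;
  pattern: RegExp;
  message: string;
}

// One assumed rule: hard-coded 3- or 6-digit hex colors instead of tokens.
const rules: Rule[] = [
  {
    id: "no-hex-color",
    pattern: /#(?:[0-9a-fA-F]{3}){1,2}\b/,
    message: "Hard-coded hex color; use a Canon token instead",
  },
];

// Scan content line by line and collect one message per rule hit.
function findViolations(content: string): string[] {
  const violations: string[] = [];
  content.split("\n").forEach((line, index) => {
    for (const rule of rules) {
      if (rule.pattern.test(line)) {
        violations.push(`line ${index + 1}: ${rule.message} [${rule.id}]`);
      }
    }
  });
  return violations;
}

// Append violations to the tool result so the agent sees them and can self-correct.
function onToolResult(result: ToolResult): ToolResult {
  if (result.toolName !== "write" && result.toolName !== "edit") return result;
  const violations = findViolations(result.content);
  if (violations.length === 0) return result;
  const report = `Canon violations:\n${violations.join("\n")}`;
  return { ...result, output: `${result.output}\n\n${report}` };
}
```

The gate never blocks the write; it only annotates the result, which leaves the correction to the agent.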
Bash execution
│
└─► tool_call handler
├── Block legacy loom commands → redirect to Linear
└── Enforce [CRE-NNN] in commit messages
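The bash guard above could look something like the sketch below. The `guardBash` function and `GuardDecision` type are hypothetical, and the sketch assumes that legacy commands are invoked as `loom ...`, that `NNN` means one or more digits, and that commit messages are passed inline with `-m`; the actual handler may differ on all three points.

```typescript
// Hypothetical decision type for illustration; not the Pi extension API.
interface GuardDecision {
  allow: boolean;
  reason?: string; // shown to the agent when the command is blocked
}

// Assumption: NNN in [CRE-NNN] stands for one or more digits.
const ISSUE_TAG = /\[CRE-\d+\]/;

function guardBash(command: string): GuardDecision {
  const trimmed = command.trim();

  // Assumption: legacy commands start with the word "loom".
  if (/^loom(\s|$)/.test(trimmed)) {
    return { allow: false, reason: "loom is retired; track this work in Linear instead" };
  }

  // Only inline messages (git commit -m "...") are inspected in this sketch.
  const commit = /^git\s+commit\b.*?\s-m\s+(["'])(.*?)\1/.exec(trimmed);
  if (commit && !ISSUE_TAG.test(commit[2] ?? "")) {
    return { allow: false, reason: "Commit message must reference an issue, e.g. [CRE-123]" };
  }

  return { allow: true };
}
```

For example, `guardBash('git commit -m "fix typo"')` is blocked, while the same command with a `[CRE-42]` prefix in the message is allowed. Combined flags such as `-am` are not covered by this sketch.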
Agent completion
│
└─► agent_end handler
├── TypeScript type check (modified packages)
├── ESLint lint check (modified packages)
├── Uncommitted changes reminder
        └── Issues? → sendUserMessage(followUp) → Agent fixes
4. The Recursive Property in Practice
The extension exhibits the Three-Tier Framework's recursive property:
- `tool_result` (Automation) checks Canon compliance (Judgment) and feeds violations back
- The agent (Automation) reads the violations and self-corrects (more Automation)
- `agent_end` (Automation) runs typecheck (Database verification) and reports
- If issues exist, `sendUserMessage` re-enters the agent loop: Automation invoking more Automation with embedded Judgment
This is the sampling feedback loop described in the framework paper, realized in a development harness.
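The loop can be sketched as a small driver function. Everything here is a stand-in under stated assumptions: `runWithGates`, `Check`, and `AgentTurn` are hypothetical names, the follow-up prompt plays the role of `sendUserMessage`, and the `maxRounds` cap is an assumption added so the sketch terminates, not something the extension is documented to do.

```typescript
// Hypothetical stand-ins for illustration; not the Pi extension API.
type Check = () => string[];               // returns issues; empty means pass
type AgentTurn = (prompt: string) => void; // one agent run

interface Outcome {
  rounds: number;
  clean: boolean;
}

function runWithGates(
  runAgent: AgentTurn,
  checks: Check[],
  task: string,
  maxRounds = 3, // assumed cap so the loop cannot run forever
): Outcome {
  let prompt = task;
  for (let round = 1; round <= maxRounds; round++) {
    runAgent(prompt); // Automation

    // Completion checks, e.g. typecheck and lint on modified packages.
    const issues: string[] = [];
    for (const check of checks) {
      issues.push(...check());
    }
    if (issues.length === 0) return { rounds: round, clean: true };

    // Stand-in for sendUserMessage: issues become the next prompt.
    prompt = `Fix before finishing:\n${issues.join("\n")}`;
  }
  return { rounds: maxRounds, clean: false };
}
```

The agent is never told the rules up front in this sketch; it only learns about them when a check fails, which is what makes the loop a feedback mechanism rather than a set of instructions.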
5. The Product Insight
What we built for the development workflow is structurally identical to what we sell as Policy OS:
| Policy OS Deliverable | Development Implementation |
|---|---|
| `mcp_contract.yaml` | .pi/settings.json + cross-loaded skills |
| `agent_contract.yaml` | Bash guard rules + quality gate event handlers |
| `outcome_contract.md` | APPEND_SYSTEM.md + prompt templates |
| `golden_tasks.yaml` | Pre-commit checks + typecheck/lint on completion |
| `runbook.md` | Interactive commands (/linear, /fleet, /deploy) |
The harness IS the policy. The configuration IS the contract. The development workflow IS the first client.
6. Distribution as Discovery
The Pi package ecosystem enables a new funnel:
pi install npm:@create-something/pi-three-tier-framework
→ Developer learns Database/Automation/Judgment
→ Classifies their own systems
pi install npm:@create-something/pi-policy-os
→ Developer runs /policy-check
→ Sees governance score and gaps
→ Contacts createsomething.agency
This mirrors the MCP-First Thesis: the entry point is connectivity (installable agent configuration), not intelligence (full consulting engagement).
7. Conclusion
Policy OS is not a product category — it is a consequence of the Three-Tier Framework applied to any agent-governed workflow. When you configure quality gates, you are building Database checks. When you register custom tools, you are building Automation. When you write domain skills and prompt templates, you are building Judgment artifacts.
The creation moat — understanding what to build, not just how to install it — applies to agent harness configuration just as it applies to MCP server creation. Both require domain expertise combined with protocol knowledge. Both are hard to commoditize.