PAPER-2026-012

Policy OS Applied to Development Infrastructure

Applying the Three-Tier Framework and Policy OS to the development workflow itself, demonstrating that agent governance emerges as a structural property at every scale.

May 2026 · CREATE SOMETHING · createsomething.io

Abstract

Policy OS — CREATE SOMETHING's governed execution platform — was designed for client MCP deployments. This paper documents applying the same product to our own development workflow via the Pi coding agent harness, demonstrating that agent governance is not an add-on but a structural property that emerges from the Three-Tier Framework at every scale.

1. Introduction

Agent coding harnesses — Pi, Claude Code, Codex, Cursor — share a fundamental problem: they are general-purpose tools operating in domain-specific environments. An agent writing SvelteKit components needs to know about Canon design tokens. An agent deploying MCP servers needs to know about the fleet registry. An agent closing a Linear issue needs to know about the evidence contract.

This knowledge traditionally lives in documentation that agents may or may not read. The Policy OS approach makes it structural: quality gates enforce compliance automatically, custom tools make verification easy, and domain skills make knowledge loadable on demand.

2. The Three-Tier Mapping

The development harness maps cleanly to the Three-Tier Framework:

Tier       | Framework Role     | Development Implementation
Database   | What exists        | Git state, package exports, fleet registry, Canon tokens
Automation | What happens       | Quality gates, custom tools, interactive commands
Judgment   | What should happen | Bash guard, pre-completion checks, evidence requirements

Control Models Verified

  • Application-controlled (Database): Session context injected via before_agent_start — the extension decides what state the agent sees.
  • Model-controlled (Automation): Custom tools like context7_query are available but the agent decides when to call them.
  • User-controlled (Judgment): Skills are loaded via /skill:name, so the user explicitly selects which guidance applies.
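
The three control models can be pictured as three separate registration points on a harness. The sketch below is a minimal illustration under stated assumptions: the `Harness` class and all of its methods are hypothetical stand-ins, not the Pi extension API. Only the before_agent_start event, the context7_query tool name, and the /skill:name command form come from this paper; the injected state, the stubbed tool body, and the `canon` skill are invented for the example.

```typescript
// Hypothetical stand-in for an agent harness; NOT the real Pi extension API.
type Tool = { name: string; run: (input: string) => string };

class Harness {
  private contextProviders: Array<() => string> = [];
  private tools = new Map<string, Tool>();
  private skills = new Map<string, string>();

  // Application-controlled: the extension decides what state the agent sees.
  onBeforeAgentStart(provider: () => string): void {
    this.contextProviders.push(provider);
  }

  // Model-controlled: the tool is registered, but the model decides when to call it.
  registerTool(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  // User-controlled: guidance is loaded only when the user asks for it.
  registerSkill(name: string, body: string): void {
    this.skills.set(name, body);
  }

  buildSystemContext(): string {
    return this.contextProviders.map((provide) => provide()).join("\n");
  }

  callTool(name: string, input: string): string {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.run(input);
  }

  // Returns the skill body for "/skill:<name>", or undefined for anything else.
  handleUserCommand(command: string): string | undefined {
    const match = /^\/skill:([\w-]+)$/.exec(command.trim());
    const name = match?.[1];
    return name ? this.skills.get(name) : undefined;
  }
}

const harness = new Harness();
harness.onBeforeAgentStart(() => "branch: main; dirty files: 2"); // illustrative state
harness.registerTool({ name: "context7_query", run: (q) => `docs for ${q}` }); // stubbed body
harness.registerSkill("canon", "Use Canon design tokens, not raw hex colors."); // hypothetical skill
```

Keeping the three entry points separate makes it explicit who decides in each case: the extension for injected context, the model for tool calls, and the user for skills.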

3. Implementation

Scale

Component        | Count       | Role
Event handlers   | 8           | Quality gates, context injection, bash guard, lifecycle
Custom tools     | 3           | Context7 bridge (×2), package export verifier
Commands         | 8           | Linear workflow, testing, fleet ops, Canon audit, pre-commit
Prompt templates | 8           | Deploy, audit, review, research, experiment, paper, MCP scaffold
Skills           | 21          | 3 native + 18 cross-loaded; domain knowledge on demand
Theme            | 51 colors   | Glass Design System alignment
Total extension  | 1,181 lines | Single coherent extension file

Quality Gate Architecture

Write/Edit
    │
    ├─► tool_result handler
    │     ├── Canon token compliance (6 pattern checks)
    │     ├── Import verification (@create-something/* packages)
    │     ├── Paper structure (SEO, container, classes)
    │     └── Experiment structure (SEO, Canon tokens, hex colors)
    │
    └── Violations? → Append to tool result → Agent self-corrects
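
As an illustration of the write/edit gate above, the sketch below implements a single pattern check (raw hex colors) and appends any violations to the tool result. It is a minimal sketch, not the extension's code: the `ToolResult` shape, the regular expression, and the message wording are assumptions. The paper states only that six pattern checks exist and that violations are appended to the tool result.

```typescript
// Hypothetical shape of a write/edit tool result; not the Pi API.
interface ToolResult {
  path: string;
  content: string; // text that was written to the file
  output: string;  // text the agent reads back after the tool call
}

// One illustrative pattern check: 3- or 6-digit hex colors where a token is expected.
const HEX_COLOR = /#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g;

function findHexColors(content: string): string[] {
  const violations: string[] = [];
  content.split("\n").forEach((line, index) => {
    const colors = line.match(HEX_COLOR) || [];
    for (const color of colors) {
      violations.push(`line ${index + 1}: raw color ${color}, use a Canon token`);
    }
  });
  return violations;
}

// Append violations to the tool result so the agent sees them on its next turn.
function applyQualityGate(result: ToolResult): ToolResult {
  const violations = findHexColors(result.content);
  if (violations.length === 0) return result;
  const report = ["Quality gate violations:", ...violations].join("\n");
  return { ...result, output: `${result.output}\n${report}` };
}
```

The gate never rejects the write. It only enriches what the agent reads back, which is what lets the agent self-correct without a separate review step.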

Bash execution
    │
    └─► tool_call handler
          ├── Block legacy loom commands → redirect to Linear
          └── Enforce [CRE-NNN] in commit messages
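
The bash guard can be sketched as a pure function from a command string to an allow/block decision. This is a minimal sketch under assumptions: the detection rule for legacy commands (a command starting with `loom`), the handling of inline `-m` messages only, the reading of NNN as one or more digits, and the `GuardDecision` type are all hypothetical.

```typescript
type GuardDecision = { allow: true } | { allow: false; reason: string };

// Assumed reading of [CRE-NNN]: "CRE-" followed by one or more digits, in brackets.
const ISSUE_TAG = /\[CRE-\d+\]/;

function guardBashCommand(command: string): GuardDecision {
  const trimmed = command.trim();

  // Assumed detection rule: the legacy CLI is invoked as `loom ...`.
  if (/^loom(\s|$)/.test(trimmed)) {
    return { allow: false, reason: "legacy loom command blocked; use Linear instead" };
  }

  // Only commits with an inline message (-m "..." or -am '...') are inspected.
  if (/^git\s+commit\b/.test(trimmed)) {
    const message = /\s-[a-z]*m\s+(["'])(.*?)\1/.exec(trimmed);
    if (message && !ISSUE_TAG.test(message[2] ?? "")) {
      return { allow: false, reason: "commit message must reference an issue, e.g. [CRE-123]" };
    }
  }

  return { allow: true };
}
```

Commits that open an editor carry no inline message, so this sketch lets them through; covering them would need a separate commit-msg hook or similar.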

Agent completion
    │
    └─► agent_end handler
          ├── TypeScript type check (modified packages)
          ├── ESLint lint check (modified packages)
          ├── Uncommitted changes reminder
          └── Issues? → sendUserMessage(followUp) → Agent fixes
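
The completion gate can be sketched as a function that runs a list of checks and turns the failures into one follow-up message. The checks are injected as plain functions so the example runs without a compiler or linter installed; every name here (`Check`, `buildFollowUp`, the stubbed results) is an assumption made for illustration. A real harness would shell out to the type checker, the linter, and `git status`, then pass the text to sendUserMessage.

```typescript
interface CheckResult {
  name: string;
  ok: boolean;
  detail?: string;
}

type Check = () => CheckResult;

// Returns the follow-up text to send back to the agent, or null when every check passes.
function buildFollowUp(checks: Check[]): string | null {
  const failures = checks.map((run) => run()).filter((result) => !result.ok);
  if (failures.length === 0) return null;
  const lines = failures.map((f) => `- ${f.name}: ${f.detail ?? "failed"}`);
  return ["Before finishing, resolve the following:", ...lines].join("\n");
}

// Stubbed checks with invented results, standing in for real tool invocations.
const stubChecks: Check[] = [
  () => ({ name: "typecheck", ok: false, detail: "2 errors in packages/ui" }),
  () => ({ name: "lint", ok: true }),
  () => ({ name: "uncommitted changes", ok: false, detail: "3 files modified" }),
];

const followUp = buildFollowUp(stubChecks);
```

Returning null when everything passes keeps the decision to re-enter the agent loop in one place: the caller sends a message only when there is something to fix.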

4. The Recursive Property in Practice

The extension exhibits the Three-Tier Framework's recursive property:

  1. tool_result (Automation) checks Canon compliance (Judgment) and feeds violations back
  2. The agent (Automation) reads the violations and self-corrects (more Automation)
  3. agent_end (Automation) runs typecheck (Database verification) and reports
  4. If issues exist, sendUserMessage re-enters the agent loop — Automation invoking more Automation with embedded Judgment

This is the sampling feedback loop described in the framework paper, realized in a development harness.
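
The loop in steps 1 to 4 can be simulated with a gate function and an agent function that alternate until the gate reports nothing. This is a sketch under assumptions, not the harness's control flow: the `Gate` and `Agent` types are hypothetical, and the `maxRounds` cap is an added safeguard, since the paper does not describe a termination rule.

```typescript
type Gate = (draft: string) => string[];                      // returns violations
type Agent = (draft: string, violations: string[]) => string; // returns a revised draft

interface LoopOutcome {
  draft: string;
  rounds: number; // number of agent revisions performed
  clean: boolean; // true when the gate reported no violations
}

function runGovernedLoop(initial: string, gate: Gate, agent: Agent, maxRounds = 3): LoopOutcome {
  let draft = initial;
  for (let round = 0; round <= maxRounds; round++) {
    const violations = gate(draft);            // Automation applying embedded Judgment
    if (violations.length === 0) return { draft, rounds: round, clean: true };
    if (round === maxRounds) break;            // assumed cap so the loop always terminates
    draft = agent(draft, violations);          // Automation invoking more Automation
  }
  return { draft, rounds: maxRounds, clean: false };
}

// Toy gate and agent: the gate flags TODO markers, the agent fixes one per round.
const todoGate: Gate = (draft) => (draft.indexOf("TODO") !== -1 ? ["remove TODO markers"] : []);
const stubAgent: Agent = (draft) => draft.replace("TODO", "done");

const outcome = runGovernedLoop("TODO a; TODO b", todoGate, stubAgent);
```

With the toy gate and agent, the draft needs two revisions before the gate passes. Lowering `maxRounds` to 1 shows the other exit: the loop stops with `clean` set to false and the last draft intact.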

5. The Product Insight

What we built for the development workflow is structurally identical to what we sell as Policy OS:

Policy OS Deliverable | Development Implementation
mcp_contract.yaml     | .pi/settings.json + cross-loaded skills
agent_contract.yaml   | Bash guard rules + quality gate event handlers
outcome_contract.md   | APPEND_SYSTEM.md + prompt templates
golden_tasks.yaml     | Pre-commit checks + typecheck/lint on completion
runbook.md            | Interactive commands (/linear, /fleet, /deploy)

The harness IS the policy. The configuration IS the contract. The development workflow IS the first client.

6. Distribution as Discovery

The Pi package ecosystem enables a new funnel:

pi install npm:@create-something/pi-three-tier-framework
    → Developer learns Database/Automation/Judgment
    → Classifies their own systems

pi install npm:@create-something/pi-policy-os
    → Developer runs /policy-check
    → Sees governance score and gaps
    → Contacts createsomething.agency

This mirrors the MCP-First Thesis: the entry point is connectivity (installable agent configuration), not intelligence (full consulting engagement).

7. Conclusion

Policy OS is not a product category; it is a consequence of the Three-Tier Framework applied to any agent-governed workflow. When you configure quality gates, you are building checks against Database state. When you register custom tools, you are building Automation. When you write domain skills and prompt templates, you are building Judgment artifacts.

The creation moat — understanding what to build, not just how to install it — applies to agent harness configuration just as it applies to MCP server creation. Both require domain expertise combined with protocol knowledge. Both are hard to commoditize.