ADR: Agent Factory Orchestration¶
Status¶
Accepted
Date¶
2026-05-07
Context¶
ADR 2026-03-04 (Agentic Development Pipeline) framed the problem and surveyed the landscape, but proposed a custom dispatcher built on Cloud Workflows + a single shared GCE spot VM running a job-switch shell script. Since then:
- OpenAI released Symphony (April 2026), an open spec that solves the orchestration problem we were going to hand-roll: poll an issue tracker, isolate per-issue workspaces, spawn coding agents, retry with backoff. OpenAI reports a 6× increase in merged PRs from internal teams using it.
- The reviewer-experience gap became clearer. The original ADR focused on agent → PR. It didn't address how a human reviewer evaluates the PR. Without a stable preview URL pointing at an isolated stack (api + backoffice + widget + cactus + DB), reviewing an agent-authored PR means checking out the branch and running everything locally — friction that defeats the velocity win.
- Supabase Branching 2.0 went GA, making per-PR Postgres isolation native via the GitHub integration with zero glue code on our side.
- Cost-and-isolation tradeoffs around Neo4j / Mongo / LightRAG crystallized: per-env isolation (separate `test`/`staging`/`production` instances) already exists, and per-PR provisioning of these stores would multiply cost and warmup latency without a clear safety win.
These four shifts are large enough that the original ADR's specific architecture (custom dispatcher + ephemeral DB only + no preview URL) is no longer the right shape. The high-level intent — "Linear ticket → autonomous agent → reviewable PR" — stands and is carried forward.
Decision¶
Two concerns, deliberately decoupled:
The factory is where the agent runs. The preview is what the agent's PR deploys to. They have separate lifecycles, separate failure modes, and separate roadmaps. Either can be built or replaced without touching the other.
1. Agent runtime (Symphony in a local git worktree)¶
Use the OpenAI Symphony spec directly, Elixir reference implementation, run locally on a dev machine or single VM. No container, no Cloud Workflows, no GCE pool — those are future extensions, not Phase 1.
- Symphony as orchestrator. Symphony claims Linear issues in active states and dispatches one Claude Code subprocess per issue. Concurrency cap: 5 parallel issues in flight on the dev VM.
- Workspace = git worktree. The `after_create` hook runs `git worktree add ../<issue-id>` off `develop`, then `./bootstrap.py --skip-env`. Worktrees give us path-level isolation between concurrent agents without any container machinery.
- Eager Supabase branch. The same `after_create` hook calls the Supabase Mgmt API to create a branch named `ix-<issue-id>`, runs `scripts/seed_local_db.py --branch ix-<issue-id> --max-domains 4 --since 2026-04-01` to populate it, and writes the branch-scoped Supabase keys into the worktree's `.env`. The agent has DB access from the start and can run migrations / inspect data while developing. Note that Supabase branches are schema-only: a fresh branch is an isolated Postgres instance with our schema, edge functions, and extensions but zero rows (unlike Neon / Xata copy-on-write data branches). The `seed_local_db.py` step is what makes the branch useful, not optional polish. The marginal cost is small because the script already exists for local dev (`just seed`); we just call it with `--branch`.
- Symphony owns the branch lifecycle. We disable Supabase's GitHub-integration auto-branch on PR open; instead, when the PR opens we point Cloud Build at the existing `ix-<issue-id>` branch. One branch per Linear issue, single owner, deleted by `before_remove`. This avoids the dual-lifecycle reconciliation problem.
- `before_remove` cleans up. `git worktree remove` + Supabase Mgmt API branch delete. Fast, deterministic, no orphans.
- Coding agent is Claude Code CLI with a Max subscription token. There is no metered LLM cost, so dollar-based circuit breakers are irrelevant; we keep duration and consecutive-error breakers only.
- Auth via `ANTHROPIC_AUTH_TOKEN` from GCP Secret Manager (`scripts/secrets/refresh-claude-token.sh`). One secret, refreshed on token expiry.
- Tool safety via `.claude/settings.json` in the worktree: `--allowedTools` whitelist (e.g. `"Read,Write,Edit,Glob,Grep,Bash(git *),Bash(just *)"`) and a PreToolUse hook blocking `rm -rf /`, docker-socket, and curl-pipe-sh patterns. Hooks fire even with `--dangerously-skip-permissions`, which is our headless requirement.
- Long-running session discipline: let Claude Code's compaction work; checkpoint-commit per sub-task; `--resume` for continuation; `CLAUDE.md` survives compaction.
- Rollback discipline: the agent only pushes to its own feature branch and opens a PR with `gh pr create --base develop`. It never pushes to `develop` or `main`. The PR is the only merge path.
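The `after_create` steps described above can be sketched as a pure function that returns the commands in order. This is a minimal illustration, not the actual hook: the function names are hypothetical, the explicit `-b <issue-id>` branch name is an assumption that mirrors git's default of naming the new branch after the worktree directory, and the Supabase Mgmt API call is deliberately left out.

```python
from typing import List


def branch_name(issue_id: str) -> str:
    """Supabase branch name for a Linear issue (the `ix-<issue-id>` convention)."""
    return f"ix-{issue_id}"


def after_create_commands(
    issue_id: str, since: str = "2026-04-01", max_domains: int = 4
) -> List[List[str]]:
    """Return the commands the hook would run, in order, as argv lists.

    Assumption: the Supabase Mgmt API call that creates the branch happens
    between the bootstrap and seed steps and is not shown here.
    """
    worktree = f"../{issue_id}"
    return [
        # New worktree on its own branch, based on develop.
        ["git", "worktree", "add", "-b", issue_id, worktree, "develop"],
        # Run from inside the new worktree.
        ["./bootstrap.py", "--skip-env"],
        # Populate the (otherwise empty) Supabase branch.
        [
            "scripts/seed_local_db.py",
            "--branch", branch_name(issue_id),
            "--max-domains", str(max_domains),
            "--since", since,
        ],
    ]
```

Returning argv lists rather than executing them keeps the plan easy to log and test; a real hook would pass each list to `subprocess.run(..., check=True)` and abort on the first failure.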
Registering as a first-class Linear Agent¶
We register Symphony as a Linear Agent (not just an API consumer), so it shows up as a workspace participant — assignable, @-mentionable, and visible in Linear's UI with structured status updates rather than as comments from a service user.
Concretely this requires:
- A Linear OAuth Application with webhooks enabled, including the Agent session events category.
- OAuth install with `actor=app`, which creates a dedicated agent user (no billable seat) bound to the app, distinct from any human user. Reads and writes made with `LINEAR_API_KEY` are attributed to this user.
- Scopes: `app:assignable` (Symphony can be set as the issue assignee) and `app:mentionable` (Symphony can be @-mentioned in issues, documents, comments), plus the standard `read`/`write` scopes for the data we already use (issues, projects, comments).
- `AgentSession` webhook receiver: a small Cloud Function endpoint that handles three event types (assigned, mentioned, follow-up prompt). It lives alongside the existing Linear webhook Cloud Function (reuses HMAC verification, secrets wiring, and deploy pipeline).
- Pub/Sub bridge to local Symphony. The Cloud Function does not call Symphony directly. It publishes the verified event to a Pub/Sub topic (`linear-agent-events`); Symphony, running locally on a dev VM, maintains a long-lived gRPC pull subscription. Outbound connection only, no inbound port to expose, NAT-friendly. Messages queue in Pub/Sub while the dev VM is offline; Symphony catches up on startup. This complements (not replaces) Symphony's poll loop, which remains the source of truth for state reconciliation. The same Pub/Sub bridge works unchanged when Symphony moves to a cloud host in a future phase.
- Agent Activities API emission. Symphony's `before_run` and `after_run` hooks post structured activities back to Linear:
  - `thought` activity within 10 seconds of receiving a delegation (required by spec; Linear shows "agent is thinking…" in the UI)
  - status updates as the agent progresses (started, opened PR, blocked)
  - final activity on completion or failure
- Identity hygiene. All git commits, PRs, and Linear comments are authored by Symphony's agent user. Reviewers can filter by agent author in Linear and GitHub.
Net-new GCP infra for this layer: one HTTP Cloud Function (~50 lines) + one Pub/Sub topic + subscription. No new repo — both ride the existing Linear-webhooks deployment.
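The receiver's core job, verifying the webhook and wrapping it for Pub/Sub, might look roughly like the sketch below. It assumes the signature is a hex-encoded HMAC-SHA256 of the raw request body (which is how Linear webhooks are commonly verified); the function names, the envelope shape, and the use of an `action` field to label the event are hypothetical. Publishing to `linear-agent-events` is not shown.

```python
import hashlib
import hmac
import json
from typing import Optional


def verify_signature(raw_body: bytes, signature_hex: str, secret: str) -> bool:
    """Constant-time check of a hex HMAC-SHA256 signature over the raw body."""
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def to_pubsub_message(
    raw_body: bytes, signature_hex: str, secret: str
) -> Optional[bytes]:
    """Return the bytes to publish, or None if the signature does not verify.

    Assumption: the payload is JSON and carries an `action` field; the real
    AgentSession payload may label events differently.
    """
    if not verify_signature(raw_body, signature_hex, secret):
        return None
    event = json.loads(raw_body)
    envelope = {"kind": event.get("action", "unknown"), "payload": event}
    return json.dumps(envelope, sort_keys=True).encode()
```

Only verified events are forwarded, so the local Symphony subscriber can trust anything it pulls from the topic without holding the webhook secret itself.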
2. Deployment target (per-PR preview environment)¶
Every PR — agent-authored or human-authored — gets a self-contained preview, deployed on push and torn down on close. Reviewer-facing URL: pr-<N>.preview.userose.ai.
| Layer | Strategy | Per-PR isolation |
|---|---|---|
| Postgres (Supabase) | Symphony-managed eager branch `ix-<issue-id>`, populated via `scripts/seed_local_db.py --branch …` (existing tooling, ~4 prod domains, FK-aware) | Full |
| Cloud Run (api, backoffice) | Per-PR no-traffic tagged revision (`pr-<N>` tag), `--cpu-boost`, `--max-revisions=20` | Full |
| Frontend widget bundle | GCS prefix `gs://rose-pr-previews/<N>/widget/` | Full |
| Frontend preprod-ui (playground) | GCS prefix `gs://rose-pr-previews/<N>/preprod-ui/` | Full |
| Frontend client-backoffice | GCS prefix `gs://rose-pr-previews/<N>/backoffice/` (or served directly from Cloud Run revision, see note) | Full |
| Cactus / static pages | Cloudflare Pages alias `pr-<N>` | Full |
| Routing | Cloudflare Worker route `pr-<N>.preview.userose.ai` mapping `/api/*`, `/backoffice/*`, `/playground/*`, `/widget/*`, `/*` to the appropriate target | Full |
| Neo4j | Shared test instance, read-only when `PR_TENANT_ID` set | Read-only |
| MongoDB | Shared test instance, read-only when `PR_TENANT_ID` set | Read-only |
| LightRAG working dir | Shared test GCS path, read-only | Read-only |
| Redis | Shared test, key prefix `pr-<N>:` if writes needed | Namespace |
Triggered by Cloud Build PR triggers (open + sync + close). Native Cloud Run deployment previews pattern, no third-party platform.
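The Routing row in the table above amounts to a small prefix table keyed on the preview hostname. The sketch below illustrates that logic in Python for consistency with the rest of our tooling (the real Worker would be written for the Cloudflare runtime). The target labels are hypothetical, and sending the catch-all `/*` to the Cactus / static pages is an assumption, since the table only says "the appropriate target".

```python
import re
from typing import Optional, Tuple

# Matches pr-<N>.preview.userose.ai and captures N.
HOST_RE = re.compile(r"^pr-(\d+)\.preview\.userose\.ai$")

# Checked in order; labels are illustrative, not real service names.
ROUTES = [
    ("/api/", "api"),
    ("/backoffice/", "backoffice"),
    ("/playground/", "preprod-ui"),
    ("/widget/", "widget"),
]
DEFAULT_TARGET = "cactus"  # assumption: catch-all goes to the static pages


def resolve(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Map (host, path) to (PR number, target label); None for foreign hosts."""
    match = HOST_RE.match(host)
    if match is None:
        return None
    pr_number = int(match.group(1))
    for prefix, target in ROUTES:
        if path.startswith(prefix):
            return pr_number, target
    return pr_number, DEFAULT_TARGET
```

For example, `resolve("pr-42.preview.userose.ai", "/api/health")` yields the PR number 42 and the `api` label, which the router would then translate into the `pr-42` tagged revision.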
Stale-tag GC. Two layers protect us from accumulating stale `pr-<N>` tags if a teardown trigger ever fails:
- `--max-revisions=20` per Cloud Run service auto-prunes oldest non-traffic revisions.
- A daily Cloud Scheduler job lists all `pr-*` tags, queries GitHub for PR state, and removes tags whose PR is closed/merged.
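The decision the daily GC job makes can be isolated from the Cloud Run and GitHub calls. In this sketch the tag list and PR states are passed in as plain data; the function name and the state strings (`open`, `closed`, `merged`) are this example's own labels, and fetching them from the respective APIs is left out.

```python
import re
from typing import Dict, Iterable, List

TAG_RE = re.compile(r"^pr-(\d+)$")


def stale_tags(tags: Iterable[str], pr_states: Dict[int, str]) -> List[str]:
    """Return the pr-<N> tags whose PR is closed or merged.

    Tags that do not match pr-<N>, and PRs with unknown state, are kept:
    when in doubt the job leaves the tag alone.
    """
    stale = []
    for tag in tags:
        match = TAG_RE.match(tag)
        if match is None:
            continue
        state = pr_states.get(int(match.group(1)))
        if state in {"closed", "merged"}:
            stale.append(tag)
    return stale
```

Keeping tags with unknown state errs toward leaving a preview up for an extra day rather than deleting one that is still under review.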
3. Safety boundaries¶
- Production Supabase keys never reach agent shells or PR-preview Cloud Run revisions. Symphony's `after_create` hook creates the branch via Mgmt API and writes only the branch-scoped keys into `.env`. Per `MEMORY.md`, all `IX_ENVIRONMENT` values share one Supabase project, so branching is the only true isolation. This is the critical safety boundary, not the seed-data shape.
- Neo4j / Mongo / LightRAG are read-only for agent runs and PR previews. The agent can chat, retrieve, classify; it cannot ingest. This eliminates cross-PR collisions and protects shared `test` data.
- Branch seeding uses real production data scoped to ~4 test domains. `scripts/seed_local_db.py` already exists and handles FK ordering, domain filtering, and `--since` trimming. Same risk model as the existing local-dev seeding (developers already pull this data via `just seed`). Agents need realistic shapes to validate retrieval / classification end-to-end; synthetic fixtures aren't enough for our stack. Data is not masked; the boundary is branch isolation, not content masking.
- `tracking_status` and other curated client config stay human-approval-gated per existing memory rules.
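One way the read-only rule for the shared stores could be enforced in application code is a guard that every write path calls first. This is a sketch under assumptions: it treats a non-empty `PR_TENANT_ID` as the read-only signal (as the deployment table suggests), and the exception and function names are hypothetical.

```python
import os
from typing import Mapping, Optional


class ReadOnlyStoreError(RuntimeError):
    """Raised when a write is attempted against a shared store in read-only mode."""


def is_read_only(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when PR_TENANT_ID is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("PR_TENANT_ID"))


def guard_write(store: str, env: Optional[Mapping[str, str]] = None) -> None:
    """Call before any ingest/write to Neo4j, MongoDB, or LightRAG."""
    if is_read_only(env):
        raise ReadOnlyStoreError(
            f"{store} is read-only for agent runs and PR previews"
        )
```

An application-level guard like this complements, rather than replaces, read-only credentials on the stores themselves.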
4. Out of scope (future extensions)¶
These are explicitly not part of this ADR. They were proposed in the prior ADR; carrying them in here would conflate scaling decisions with the core "Symphony + previews" decision. Each gets its own ADR if and when needed.
- Containerized factory (Docker hardening, non-root user, network allow-list, resource limits, `read_only: true` + `tmpfs`). Only relevant when we promote Symphony off the dev machine.
- GCE spot VM pool / Cloud Workflows orchestration for parallel concurrent agents at scale.
- Multi-job dispatch beyond `rose-solve-linear-ticket`. `rose-onboard-client` (manual + KB-webhook) and `rose-solve-chatbot-issues` (daily cron) keep their existing direct triggers; they are not Symphony-driven.
- Self-review loop (Sonnet implements / Opus scores). Useful but additive, since the PR review process already gates merges. Adopt later if agent quality demands it.
- Slack heartbeat / observability beyond Langfuse + Cloud Logging. Symphony's stdout + existing Langfuse traces cover Phase 1.
- Worktree promotion to ephemeral cloud workspaces (Cloud Workstations, E2B, etc.).
Consequences¶
Positive¶
- Two small things instead of one big thing. Factory and preview deploy are independent — either can be replaced without disturbing the other. Symphony could be swapped for a different orchestrator; previews could move to Northflank; neither change cascades.
- Phase 1 is tiny. Symphony local + git worktree + four hook scripts + Cloud Build PR triggers + Cloudflare Worker route. No container, no VM pool, no Cloud Workflows.
- Reviewer experience is first-class. A stable preview URL per PR lets humans evaluate agent output in seconds, not minutes.
- No platform migration. Stays on GCP. Existing per-env isolation for Neo4j/Mongo/LightRAG/Redis, Secret Manager, and Cloud Run are all reused as-is.
- Subscription auth removes cost-management complexity. No `MAX_COST_USD` breaker, no per-run cost tracking, no token-budget tuning.
- Safety by construction. Symphony's workspace path validation + Supabase branching + read-only RAG stores layer correctly. Hardest failure mode (agent writes to prod customer data) is structurally prevented.
Negative¶
- Read-only RAG stores limit agent scope. Agents can't end-to-end test ingestion-touching features without falling back to staging or a manual run.
- Symphony is Elixir. New runtime, but Phase 1 keeps it as a single local process; not yet a deployment concern.
- Cloud Build YAML + Worker route are net-new infra surface. Small but real to maintain.
- No container isolation in Phase 1. Worktrees give path isolation, not process or filesystem isolation. A misbehaving agent could in principle affect the host. Mitigations: PreToolUse hook blocking destructive Bash patterns, `--allowedTools` whitelist, run on a dedicated dev VM rather than a developer laptop.
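The PreToolUse mitigation can be illustrated with a small pattern matcher. The sketch assumes the hook receives a JSON event with `tool_name` and `tool_input.command`, and that returning exit code 2 blocks the call, which matches our understanding of Claude Code's hook contract; the regexes and function names are illustrative and far from exhaustive.

```python
import re
from typing import Optional

BLOCKED_PATTERNS = [
    # rm with both -r and -f flags aimed at the filesystem root.
    (re.compile(r"\brm\s+-(?=[a-zA-Z]*r)(?=[a-zA-Z]*f)[a-zA-Z]+\s+/(?:\s|$|\*)"),
     "recursive delete of /"),
    # Any reference to the Docker socket.
    (re.compile(r"/var/run/docker\.sock"), "docker socket access"),
    # curl output piped straight into a shell.
    (re.compile(r"\bcurl\b[^|;&]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"),
     "curl piped to shell"),
]


def blocked_reason(command: str) -> Optional[str]:
    """Return why a Bash command is blocked, or None if it is allowed."""
    for pattern, reason in BLOCKED_PATTERNS:
        if pattern.search(command):
            return reason
    return None


def decide(event: dict) -> int:
    """Exit code for a PreToolUse event: 2 blocks the tool call, 0 allows it."""
    if event.get("tool_name") != "Bash":
        return 0
    command = event.get("tool_input", {}).get("command", "")
    return 2 if blocked_reason(command) else 0
```

A real hook script would read the event with `json.load(sys.stdin)`, print the reason to stderr, and call `sys.exit(decide(event))`. Pattern blocking is a backstop, not a sandbox: it catches obvious mistakes, while the dedicated dev VM limits the blast radius of anything it misses.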
Neutral¶
- Local-first means concurrency is bounded by the host. Fine for single-digit parallel issues; revisit when Symphony pulls more.
- Cron and manual triggers stay where they are. The chatbot-triage and onboarding jobs do not move.
Alternatives Considered¶
1. Keep the original ADR's custom dispatcher¶
Rejected. Symphony covers the dispatcher problem with a maintained spec and reference impl. Building our own gives us more code to maintain with no behavioral advantage. The original ADR's value (job patterns, harness hardening, observability) is preserved by carrying those concerns forward into Symphony's hook scripts and the agent's CLI invocation.
2. Migrate to Railway¶
Rejected. Railway gives us push-to-redeploy + DB branching out of the box, but we lose our existing per-env Neo4j/Mongo/LightRAG/Redis isolation, Secret Manager-native auth flows, and the Cloud Run + Cloudflare Worker routing already in place. Marginal ergonomic gain; large migration cost; ongoing dependency on a smaller provider. The roughly equivalent GCP setup is ~50 lines of Cloud Build YAML + 1 Worker route.
3. Per-PR Neo4j / Mongo / LightRAG instances¶
Rejected for Phase 1. Provisioning cost (Neo4j AuraDB tier minimums, Mongo Atlas warmup) and operational complexity outweigh the benefit when most agent tasks don't touch ingestion. Phase 2 may revisit with tenantId-namespaced writes against the shared test stores.
4. Synthetic-only seed.sql fixtures¶
Rejected. We considered hand-rolling synthetic fixtures (base.sql, auth-users.sql, content-populated.sql) instead of cloning prod. Two problems: (a) maintenance: every schema change drifts the fixtures; (b) realism: agents validating retrieval / classification need real-shaped data with realistic FK distributions, and synthetic data produces false positives. Since scripts/seed_local_db.py --branch already exists with domain filtering and FK ordering, the real-data path is cheaper to operate and more useful.
5. Masked production data clone (Snaplet / Postgres.ai)¶
Deferred. Adds a paid dependency, masking-correctness audit burden, and weekly pipeline maintenance. The branch isolation we already have is a strong-enough boundary for Phase 1. Revisit if we ever expose preview URLs beyond internal reviewers.
6. Linear API-key-only (no Agent registration)¶
Rejected. Using a plain LINEAR_API_KEY against a service-user account works for polling and writing comments, but Symphony's actions appear as comments from "some user" rather than as a recognized agent. We lose the assignable/mentionable affordances, the agent-session UI, and the structured activity feed (thought, status updates) that reviewers see directly in Linear. The OAuth actor=app install + Agents SDK costs roughly 50 more lines of webhook receiver and gives us the right identity model for free.
Build-time strategy¶
Cloud Build + Dockerfile is slow by default (~15–30 min per PR). Three decisions keep PR previews under 4 min without migrating off GCP:
- Pre-baked dependency base image. `rose-backend-base:<lockhash>` rebuilt only when `poetry.lock` changes. PR builds become COPY-only (≈30 s).
- Tests don't block preview deploys. Two parallel Cloud Build triggers per PR: `pr-preview-deploy` (no tests, ~2 min) and `pr-checks` (mypy + tests + lint, async). Reviewer gets the URL fast; CI status arrives separately.
- BuildKit cache + `--cache-from` on Cloud Build, reusing intermediate layers across PRs.
Realistic target: 2–4 min git push → live preview URL. Railway-class without the migration. Northflank is the fallback if the gap proves unacceptable.
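The `<lockhash>` tag for the pre-baked base image can be derived directly from the lockfile's contents, so the image is rebuilt only when dependencies actually change. A minimal sketch, assuming a truncated SHA-256 is acceptable as the tag; the 12-character length and the function names are arbitrary choices for illustration.

```python
import hashlib


def lock_hash(lock_contents: bytes, length: int = 12) -> str:
    """Short, stable fingerprint of poetry.lock contents."""
    return hashlib.sha256(lock_contents).hexdigest()[:length]


def base_image_tag(lock_contents: bytes, repo: str = "rose-backend-base") -> str:
    """Image reference of the form rose-backend-base:<lockhash>."""
    return f"{repo}:{lock_hash(lock_contents)}"
```

A build step would read `poetry.lock` as bytes, compute the tag, and skip the base-image build whenever an image with that tag already exists in the registry.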
References¶
- OpenAI Symphony SPEC.md
- OpenAI Symphony announcement
- Linear Agents — Getting Started
- Linear Agent Interaction SDK
- Linear Agent demo (linear/linear-agent-demo)
- Cloud Run deployment previews tutorial
- GoogleCloudPlatform/devrel-demos: cloud-run-deployment-previews
- Supabase Branching docs
- Supabase GitHub integration
- Supabase Branching 2.0
- ADR 2026-03-04: Agentic Development Pipeline (superseded)