ADR: Linear-Agents relay tracker for the factory orchestrator

Status

Accepted

Date

2026-06-11

Context

The declarative factory (factory/) discovers work by polling Linear every 30s, using one personal API key per bot identity (benoit-claude, benoit-codex, thomas-codex, …) filtered by label + assignee: me. This works but has real friction:

  • One Linear seat per bot identity.
  • Personal API keys instead of OAuth — hard to rotate, no per-agent scopes.
  • A polling loop with no native "thinking / working / done" status in the Linear UI.
  • Factory-authored comments are not natively attributed.

Linear ships a first-class Agents primitive: an OAuth app installed with actor=app becomes a real workspace user. Assigning or @-mentioning it fires AgentSessionEvent webhooks; the agent reports progress via agentActivityCreate activities and Linear renders the lifecycle natively.

Webhooks are push, but the factory orchestrator is a pull loop that often runs on a laptop with no public ingress. We need to bridge the two without exposing the laptop and without a per-bot Linear seat.

Decision

Add a second, opt-in tracker selectable per agent in agents.yaml (tracker: api_key — default, unchanged — vs tracker: agent_session), backed by a small hosted relay:

Linear ──webhook──▶ RELAY (Cloud Run, --min-instances=1)
                    verify Linear-Signature + webhookTimestamp; dedupe on event
                    id; hydrate the issue; post the <10s ack "thought"; enqueue
                    to a Cloud Storage bucket (no Firestore).
        ▲ POST /activity                          ▲ agentActivityCreate
        │                                         │
   LAPTOP worker ── GET /events?after=<cursor> ───┘  (long-poll, outbound only)
  • The relay owns the public surface (webhook + OAuth callback) and holds each agent's OAuth token in Secret Manager. The laptop holds no Linear credentials — only the relay URL and a worker bearer token.
  • The laptop tracker (factory/tracker/linear_agents.py) implements the existing IssueClient protocol (factory/orchestrator/loop.py) against the relay, so the orchestrator dispatch loop is unchanged: fetch_labelled_active_issues long-polls /events; create_comment posts a Linear response activity via /activity; update_issue_state is a no-op (status transitions are tracked separately).
  • Dispatch is native Linear assignment / @-mention of the agent's app user — no labels, no assignee: me.
  • Multiple agents, one relay, N apps: one OAuth app per identity (one appUserId each). The relay maps each inbound event's appUserId to an agent id and a per-agent queue in the Cloud Storage bucket. Workers scope themselves with GET /events?agents=<ids>, gated by their worker token.
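
The relay's first two steps in the diagram (verify, then dedupe) can be sketched as below. This is a minimal illustration, not the relay's actual code: it assumes the Linear-Signature header carries a hex HMAC-SHA256 of the raw request body and that webhookTimestamp is in Unix milliseconds. The names verify_webhook, Deduper, and the 60-second MAX_SKEW_MS window are hypothetical, and the dedupe set lives in memory only.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

# Assumed replay window; the ADR does not state one.
MAX_SKEW_MS = 60_000


def verify_webhook(raw_body: bytes, signature_header: str, secret: str,
                   now_ms: Optional[int] = None) -> Optional[dict]:
    """Return the parsed payload if signature and timestamp check out, else None."""
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison so the check does not leak how many bytes matched.
    if not hmac.compare_digest(expected.encode(), signature_header.encode()):
        return None
    payload = json.loads(raw_body)
    current = int(time.time() * 1000) if now_ms is None else now_ms
    ts = payload.get("webhookTimestamp")
    # Reject missing or stale timestamps to limit replay of captured requests.
    if not isinstance(ts, int) or abs(current - ts) > MAX_SKEW_MS:
        return None
    return payload


class Deduper:
    """Remembers event ids already seen (in memory here, for illustration only)."""

    def __init__(self) -> None:
        self._seen: set = set()

    def first_time(self, event_id: str) -> bool:
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True
```

A hosted relay would need the seen-id set to survive restarts and to be shared across instances; the in-memory set above only shows the shape of the check.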

The legacy api_key polling path keeps working for every agent not opted in.
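
As a rough sketch of the opt-in rule, the selection could look like the following once agents.yaml has been parsed into dictionaries. The layout of agents.yaml is not given in this ADR, so the per-agent dict shape and the tracker_for helper are assumptions; only the tracker key and its two values come from the text above.

```python
from typing import Any, Dict

DEFAULT_TRACKER = "api_key"  # legacy polling path, used when an agent does not opt in
KNOWN_TRACKERS = {"api_key", "agent_session"}


def tracker_for(agent_config: Dict[str, Any]) -> str:
    """Return the tracker name for one agent entry, defaulting to the legacy path."""
    tracker = agent_config.get("tracker", DEFAULT_TRACKER)
    if tracker not in KNOWN_TRACKERS:
        raise ValueError(f"unknown tracker {tracker!r}")
    return tracker


# Hypothetical parsed agents.yaml: one agent opted in, one left on the default.
agents = {
    "benoit-claude": {"tracker": "agent_session"},
    "benoit-codex": {},
}
selected = {name: tracker_for(cfg) for name, cfg in agents.items()}
```

Failing loudly on an unknown value keeps a typo in agents.yaml from silently falling back to the polling path.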

Consequences

Positive

  • No per-bot Linear seat; OAuth tokens are rotatable and scoped.
  • Native "Working… → Done" lifecycle and attribution in the Linear UI.
  • Laptop needs no inbound exposure (no tunnel); stable webhook + OAuth URLs.
  • Offline-safe: the relay posts the sub-10s ack itself and durably queues events, so Linear never hard-fails when a laptop is off — work drains from the cursor on next boot.
  • Dual-tracker rollout: opt in one agent at a time, zero risk to the rest.

Negative

  • A new always-on hosted component (Cloud Run + a GCS bucket) to operate.
  • One OAuth app per identity to create/install (admin step).
  • A session goes stale if no laptop processes it within ~30 min; this can be mitigated by re-assigning the issue, or later by a relay keep-alive thought.

Neutral

  • A Cloud Storage bucket is the durable queue: one object per event, with the lexicographic order of object names serving as the cursor. Firestore and Pub/Sub are deferred to avoid a Firebase footprint. The queue is abstracted behind EventQueue, with an in-memory implementation for tests and local runs.
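
To illustrate how a lexicographic cursor can work, here is a minimal sketch of an EventQueue interface with an in-memory implementation. The ADR names EventQueue but not its methods, so put, after, and the event_key naming scheme (zero-padded millisecond timestamp plus event id) are assumptions made for this example.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class EventQueue(Protocol):
    """Assumed interface: append events, read those after a cursor."""

    def put(self, key: str, payload: bytes) -> None: ...

    def after(self, cursor: Optional[str], limit: int = 100) -> List[Tuple[str, bytes]]: ...


def event_key(timestamp_ms: int, event_id: str) -> str:
    # Zero-padding makes string order match numeric timestamp order.
    return f"{timestamp_ms:020d}-{event_id}"


class InMemoryEventQueue:
    """Dict-backed stand-in for the bucket: one entry per event."""

    def __init__(self) -> None:
        self._events: Dict[str, bytes] = {}

    def put(self, key: str, payload: bytes) -> None:
        # Keep the first write so a redelivered event does not overwrite it.
        self._events.setdefault(key, payload)

    def after(self, cursor: Optional[str], limit: int = 100) -> List[Tuple[str, bytes]]:
        # The cursor is the last key the worker consumed; return strictly later keys.
        keys = sorted(k for k in self._events if cursor is None or k > cursor)
        return [(k, self._events[k]) for k in keys[:limit]]


queue = InMemoryEventQueue()
queue.put(event_key(2000, "b"), b"second")
queue.put(event_key(1000, "a"), b"first")
batch = queue.after(None)       # full backlog, oldest first
cursor = batch[-1][0]           # a worker would persist this between runs
```

Because the cursor is just the last consumed key, a worker that was offline resumes by passing its stored cursor to after() and receives only the events it missed.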

Alternatives Considered

  • Local cloudflared tunnel to a laptop receiver. Rejected: unstable webhook/OAuth URLs, puts Linear credentials and an inbound port on the laptop, and the 10s ack depends on laptop/engine availability.
  • Cloudflare Worker relay. Rejected for v1: splits secrets out of GCP Secret Manager and needs Durable Objects/KV for the durable queue + long-poll.
  • One OAuth app representing several agents. Not possible — Linear ties one appUserId to one installed app.