Ark Viz · architecture sketch

Deterministic control plane, agent judgment plane

This is the hybrid model for a Ralph-style process on OpenClaw. The orchestration core should be deterministic code. Linear stays human-friendly via MCP, OpenClaw acts as the operator shell, and LLM workers are only invoked where judgment is actually needed.

  • Deterministic script = orchestrator
  • Linear = projection cockpit
  • OpenClaw = operator shell
  • LLMs = scoped worker judgment
  • GitHub/CI = execution substrate
Control plane
Code
Dependency resolution, retries, projections, and state transitions stay deterministic.
Judgment plane
LLM
Only used for scoping, coding, reviewing, repairing, and summarizing ambiguity.
Projection rule
Internal state is summarized outward, never inferred back from comments alone.
Main win
Audit
You can reconstruct exactly what happened on every task run.

High-level architecture

The key move is to avoid making either Linear or prompt-driven skills do the job of a workflow engine. A deterministic orchestrator script owns reconciliation and state. OpenClaw wraps that control plane with visibility, messaging, session spawning, and intervention.

Human and planning surface
Linear workspace
MCP-facing
Teams, projects, issues, cycles, and human comments live here.
  • Parent task and subtask hierarchy
  • Priority, owner, milestone, labels
  • Human approvals and escalations
  • Readable progress summaries
Linear MCP adapter
Bridge
Read and write only the fields humans care about.
  • Sync issue status and assignee changes
  • Create comments for milestones only
  • Listen for operator inputs as signals
Deterministic control plane
Scripted reconciler + policy engine
Deterministic
Runs on a schedule or in response to events, and computes the next truth from all observed inputs.
  • Build normalized task graph from Linear
  • Compute ready, blocked, awaiting_human
  • Apply concurrency, retry, stale-task, and projection rules
  • No LLM required for the same-input same-output path
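The readiness computation in the list above can be sketched as a pure function over the normalized task graph. This is a minimal illustration, not the actual reconciler: the `Task` fields (`done`, `needs_human`, `depends_on`) and the function name `classify` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    key: str
    done: bool = False
    needs_human: bool = False   # assumed flag: a human approval is pending
    depends_on: tuple = ()      # keys of the tasks that block this one


def classify(tasks):
    """Pure function: the same task graph always yields the same classification."""
    result = {}
    for key in sorted(tasks):   # sorted so the result never depends on input order
        task = tasks[key]
        if task.done:
            result[key] = "completed"
        elif task.needs_human:
            result[key] = "awaiting_human"
        elif all(dep in tasks and tasks[dep].done for dep in task.depends_on):
            result[key] = "ready"
        else:
            result[key] = "blocked"   # also covers dependencies missing from the graph
    return result


# Hypothetical three-task chain: 14 waits on 13, which waits on 12.
graph = {
    "12": Task("12", done=True),
    "13": Task("13", depends_on=("12",)),
    "14": Task("14", depends_on=("13",)),
}
print(classify(graph))
```

Because nothing here consults a model or a clock, rerunning it on the same graph gives the same answer, which is the same-input same-output property the control plane relies on.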
dispatches
Worker adapters + runtime
Judgment plane
Deterministic wrappers launch OpenClaw agents or ACP sessions only when judgment is needed.
  • Scoper, coder, reviewer, repairer
  • One isolated worktree per active coding attempt
  • Can target GitHub, CI, and local repo tools
  • Invoked explicitly, not used to decide routine orchestration
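A deterministic wrapper for the "one isolated worktree per active coding attempt" rule might look like the sketch below. It only builds the `git worktree add` command and does not run it. The naming scheme (`wt-<task>-a<attempt>`), the `attempt/` branch prefix, and the sibling `worktrees/` directory are assumptions chosen for illustration.

```python
from pathlib import Path


def worktree_name(task_number, attempt):
    # Assumed naming scheme: task number plus attempt number, e.g. wt-13-a1.
    return f"wt-{task_number}-a{attempt}"


def worktree_command(repo_root, task_number, attempt):
    """Build the git command that creates an isolated worktree for one attempt."""
    name = worktree_name(task_number, attempt)
    path = Path(repo_root).parent / "worktrees" / name   # hypothetical layout
    # `git worktree add -b <branch> <path>` checks out a new branch in a new directory.
    return [
        "git", "-C", str(repo_root),
        "worktree", "add",
        "-b", f"attempt/{name}",
        str(path),
    ]


cmd = worktree_command("/srv/repo", 13, 1)
print(" ".join(cmd))
```

Keeping the command construction separate from its execution lets the orchestrator log the exact command as an artifact before a worker session is spawned.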
emits events
Runtime state store
Source of truth
This is the part Linear should not replace.
  • tasks: normalized from planning tools
  • task_runs: attempt-by-attempt execution ledger
  • task_events: append-only timeline
  • artifacts: prompts, PRs, CI, logs, diffs
Operator shell and observability edges
GitHub + CI substrate
External
Repo state, PRs, checks, and review comments are ingested as facts.
  • PR open, review changes requested, CI failed
  • These become internal events, not just comments
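Turning external facts into internal events can be sketched as a small lookup plus a typed record. The input here is a deliberately simplified, hypothetical fact tuple, not a real GitHub or CI webhook payload, and the `TaskEvent` fields and `to_event` helper are assumptions that loosely follow the event fields described elsewhere in this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from (source, simplified fact) to an internal event kind.
EVENT_KINDS = {
    ("github", "pr_opened"): "pr_opened",
    ("github", "changes_requested"): "review_changes_requested",
    ("ci", "failed"): "ci_failed",
}


@dataclass(frozen=True)
class TaskEvent:
    task_id: str
    kind: str
    actor_type: str
    data: dict


def to_event(task_id, source, fact, url) -> Optional[TaskEvent]:
    """Convert one observed fact into a canonical event, or None if unrecognized."""
    kind = EVENT_KINDS.get((source, fact))
    if kind is None:
        return None   # unknown facts are dropped rather than guessed at
    return TaskEvent(task_id, kind, "system", {"source": source, "url": url})


event = to_event("13", "ci", "failed", "https://example.invalid/runs/1")
print(event)
```

The point of the explicit table is that a CI failure becomes a structured `ci_failed` event the reconciler can act on, rather than a comment someone has to read.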
OpenClaw shell + Ark Viz
Operator UX
Human-friendly shell around the deterministic core.
  • Parent issue board
  • Task timeline and replay
  • Fleet metrics and stuck-task queue
  • Chat approvals, nudges, interventions, and debug views
Legend: human planning and projection · deterministic control plane · agent judgment plane · durable internal truth · external execution systems

Projection model, what goes to Linear vs what stays internal

Keep Linear coarse and intentional. Keep OpenClaw detailed and forensic.

Concern                          | Linear             | OpenClaw internal
Human-visible task status        | yes                | Projected from current internal state
Run attempts and retries         | summary only       | Full attempt ledger with start, stop, exit reason
Prompt bundles and worker config | no                 | Artifact records linked to each run
PR and CI milestones             | milestone comments | Structured event entries with URLs and classification
Failure taxonomy                 | human summary      | tests_failed, spec_ambiguous, review_rejected, etc.
Exact timeline replay            | no                 | Append-only event stream
Task
logical
One normalized work item derived from Linear issue data.
Task run
attempt
One worker attempt against one task, with session and model metadata.
Artifact
evidence
Prompt, diff summary, PR URL, CI run, review digest, or final resolution.
Design rule: Linear comments should be narrative summaries, not storage for raw execution data. If a run cannot be replayed without reading a comment thread, the architecture is under-instrumented.
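One way to check the design rule is to confirm that a task's current picture can be rebuilt from the event stream alone. The sketch below folds a list of events into a summary. The event dictionaries, the `to` and `classification` keys inside `data`, and the timestamps are all invented for the example.

```python
def replay(events):
    """Rebuild a task summary by folding over its events in timestamp order."""
    state, runs, failures = None, 0, []
    for event in sorted(events, key=lambda e: e["ts"]):
        kind = event["kind"]
        if kind == "state_transition":
            state = event["data"]["to"]            # assumed payload key
        elif kind == "run_spawned":
            runs += 1
        elif kind == "ci_failed":
            failures.append(event["data"]["classification"])
    return {"state": state, "runs": runs, "failures": failures}


# Hypothetical event stream for a single task.
events = [
    {"ts": "2024-01-01T18:12:00Z", "kind": "state_transition", "data": {"to": "ready"}},
    {"ts": "2024-01-01T18:14:00Z", "kind": "run_spawned", "data": {}},
    {"ts": "2024-01-01T18:14:05Z", "kind": "state_transition", "data": {"to": "coding"}},
    {"ts": "2024-01-01T18:29:00Z", "kind": "ci_failed", "data": {"classification": "tests_failed"}},
    {"ts": "2024-01-01T18:31:00Z", "kind": "state_transition", "data": {"to": "revising"}},
    {"ts": "2024-01-01T18:32:00Z", "kind": "run_spawned", "data": {}},
]
print(replay(events))
```

If a summary like this cannot be produced without parsing Linear comments, that is the under-instrumentation the rule warns about.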

Task state machine

This is the richer internal lifecycle that gets projected into a much simpler Linear workflow. It separates execution truth from PM-facing status.

Discovered
Imported from Linear, normalized, not yet reconciled against dependencies or policy.
Blocked
Cannot proceed because dependencies, permissions, or missing inputs are unresolved.
Ready
Dependencies satisfied and eligible for dispatch under concurrency and budget rules.
Transition labels: reconcile · deps clear · policy ok
Queued
Selected for execution, waiting for worker capacity or worktree allocation.
Coding
Active scoper or coder run is executing against the task.
Awaiting review / CI
PR exists, tests and review outcomes are being collected and classified.
Transition labels: spawn · PR / checks · human gate
Completed
Success criteria met, artifacts captured, projection updated back to Linear.
Revising
A follow-up repair run is fixing review or CI failures.
Awaiting human / Failed
Confidence too low, retry budget exhausted, or explicit operator decision required.

Important nuance: task state is different from run state. One task can have multiple task runs over time.
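The lifecycle above can be enforced with an explicit transition table. The state names come from the diagram, but the exact edge set below is only one plausible reading of it and should be treated as an assumption. `transition` and `IllegalTransition` are hypothetical names.

```python
# Assumed edge set: one plausible reading of the state diagram, not a specification.
ALLOWED = {
    "discovered": {"blocked", "ready"},
    "blocked": {"ready", "awaiting_human"},
    "ready": {"queued"},
    "queued": {"coding"},
    "coding": {"awaiting_review", "awaiting_human", "failed"},
    "awaiting_review": {"completed", "revising", "awaiting_human"},
    "revising": {"awaiting_review", "awaiting_human", "failed"},
    "awaiting_human": {"ready", "failed"},
    "completed": set(),
    "failed": set(),
}


class IllegalTransition(ValueError):
    """Raised when policy code asks for a move the table does not allow."""


def transition(current, target):
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED.get(current, set()):
        raise IllegalTransition(f"{current} -> {target}")
    return target


state = "discovered"
for nxt in ("ready", "queued", "coding", "awaiting_review", "completed"):
    state = transition(state, nxt)
print(state)
```

This table governs task state only. Run status (active, succeeded, failed, timed_out) would be tracked separately on each task run, which is how one task can accumulate several runs without its state machine being reset.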

Internal schema, visually

This is the minimum shape I’d want for implementation. It’s enough for replay, debugging, and dashboards without becoming over-designed too early.

tasks
logical item
  • id: stable task key
  • external_source: Linear issue reference
  • parent_id: hierarchy
  • orchestration_state: runtime status
  • priority / owner / labels: normalized planning fields
  • last_projected_at: projection hygiene
task_dependencies
graph edge
  • task_id: dependent task
  • depends_on_task_id: blocking task
  • kind: hard, soft, review gate
task_runs
attempt ledger
  • id / task_id / attempt: identity
  • worker_type: scoper, coder, reviewer, repairer
  • agent_id / session_key / model: execution metadata
  • status: active, succeeded, failed, timed_out
  • started_at / finished_at: timing
  • exit_reason: classified failure or completion
task_events
append-only
  • id / task_id / run_id: linkage
  • ts: ordered timeline
  • kind: state_transition, pr_opened, ci_failed
  • actor_type / actor_id: human, agent, system
  • data_json: structured evidence payload
artifacts
evidence store
  • id / task_id / run_id: linkage
  • kind: prompt, diff, PR, CI, summary
  • uri: durable pointer
  • metadata_json: shape varies by artifact type
projection cache
optional
  • task_id: projection target
  • linear_status: last pushed state
  • last_comment_hash: avoid noisy duplicate comments
  • projected_at: sync timestamp
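As a starting point, the core tables could be expressed as SQLite DDL. This is a minimal sketch under stated assumptions: the column names follow the field lists above, but the column types, constraints, the `review_gate` spelling, the append-only triggers, and the `open_store` helper are all choices made for illustration. The optional projection cache is left out.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tasks (
    id                  TEXT PRIMARY KEY,
    external_source     TEXT NOT NULL,
    parent_id           TEXT REFERENCES tasks(id),
    orchestration_state TEXT NOT NULL,
    priority            INTEGER,
    owner               TEXT,
    labels              TEXT,
    last_projected_at   TEXT
);
CREATE TABLE task_dependencies (
    task_id            TEXT NOT NULL REFERENCES tasks(id),
    depends_on_task_id TEXT NOT NULL REFERENCES tasks(id),
    kind               TEXT NOT NULL CHECK (kind IN ('hard', 'soft', 'review_gate')),
    PRIMARY KEY (task_id, depends_on_task_id)
);
CREATE TABLE task_runs (
    id          INTEGER PRIMARY KEY,
    task_id     TEXT NOT NULL REFERENCES tasks(id),
    attempt     INTEGER NOT NULL,
    worker_type TEXT NOT NULL,
    agent_id    TEXT,
    session_key TEXT,
    model       TEXT,
    status      TEXT NOT NULL,
    started_at  TEXT,
    finished_at TEXT,
    exit_reason TEXT,
    UNIQUE (task_id, attempt)
);
CREATE TABLE task_events (
    id         INTEGER PRIMARY KEY,
    task_id    TEXT NOT NULL REFERENCES tasks(id),
    run_id     INTEGER REFERENCES task_runs(id),
    ts         TEXT NOT NULL,
    kind       TEXT NOT NULL,
    actor_type TEXT NOT NULL,
    actor_id   TEXT,
    data_json  TEXT NOT NULL DEFAULT '{}'
);
CREATE TABLE artifacts (
    id            INTEGER PRIMARY KEY,
    task_id       TEXT NOT NULL REFERENCES tasks(id),
    run_id        INTEGER REFERENCES task_runs(id),
    kind          TEXT NOT NULL,
    uri           TEXT NOT NULL,
    metadata_json TEXT NOT NULL DEFAULT '{}'
);
-- Assumed enforcement of "append-only": reject edits and deletes on the timeline.
CREATE TRIGGER task_events_no_update BEFORE UPDATE ON task_events
BEGIN SELECT RAISE(ABORT, 'task_events is append-only'); END;
CREATE TRIGGER task_events_no_delete BEFORE DELETE ON task_events
BEGIN SELECT RAISE(ABORT, 'task_events is append-only'); END;
"""


def open_store(path=":memory:"):
    """Open a database connection and create the tables (hypothetical helper)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


conn = open_store()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

SQLite is used here only because it ships with Python. The same shape would carry over to any relational store.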

Dashboard wireframe, parent task / subtask observability

This is the view I’d want while running a parent initiative. Left for the frontier, center for exact task history, right for rescue and heat signals.

Initiative overview
Children 12 · Active 3 · Blocked 2 · Done 7
#12 Fix local setup
done
PR #88 merged · 1 attempt · 43 min
#13 Add payments scaffold
coding
codex worker active · attempt 2 · 21 min in state
#14 Deploy preview auth
blocked
waiting on #13 · human input not needed
#15 Admin analytics card
awaiting review
PR #91 · CI green · review requested
Selected task timeline, #13 Add payments scaffold
18:12  state_transition: discovered → ready, dependencies resolved after #12 completion.
18:14  run_spawned: coder attempt 1 launched with worktree wt-13-a1.
18:29  ci_failed: lint + typecheck broke, classified as tests_failed.
18:32  run_spawned: repairer attempt 2 launched with failure context and diff summary attached.
18:48  pr_opened: PR #93 created, checks pending, Linear projected to In Progress.
Rescue and heat signals
Longest in state: #14 Deploy preview auth, blocked 5h 22m
Retry pressure: #13 Add payments scaffold, 2 runs, still healthy
Human gate queue: 1 task needs a merge decision, no spec clarifications pending
Failure mix: Mostly CI and review churn, no auth/systemic failures detected
What this dashboard should answer instantly
  • Which subtasks are actually runnable right now?
  • Why is a blocked task blocked?
  • What happened in the last failed run, exactly?
  • Which tasks are consuming retries or human attention?
  • What should the operator look at next?
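Two of the heat signals, "longest in state" and "retry pressure", reduce to small aggregations over task and run records. The sketch below works on in-memory data. The record shapes, the `entered_state_at` field, the threshold of two runs, and the timestamps are assumptions made for the example.

```python
from datetime import datetime, timedelta


def longest_in_state(tasks, now):
    """Return (key, state, duration) for the open task that has sat longest."""
    open_tasks = [t for t in tasks if t["state"] != "completed"]
    if not open_tasks:
        return None
    oldest = min(open_tasks, key=lambda t: t["entered_state_at"])
    return oldest["key"], oldest["state"], now - oldest["entered_state_at"]


def retry_pressure(run_task_keys, threshold=2):
    """Return the task keys that have at least `threshold` runs."""
    counts = {}
    for key in run_task_keys:
        counts[key] = counts.get(key, 0) + 1
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical snapshot.
now = datetime(2024, 1, 1, 19, 0)
tasks = [
    {"key": "12", "state": "completed", "entered_state_at": now - timedelta(hours=9)},
    {"key": "13", "state": "coding", "entered_state_at": now - timedelta(minutes=21)},
    {"key": "14", "state": "blocked", "entered_state_at": now - timedelta(hours=5, minutes=22)},
]
runs = ["12", "13", "13"]   # one entry per task run, keyed by task

print(longest_in_state(tasks, now))
print(retry_pressure(runs))
```

In a real deployment these would more likely be queries against the runtime state store, but the logic stays this simple as long as state entry times and runs are recorded as facts.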

Budget-aware packaging plan

To keep the OpenClaw workspace healthy, large architecture and orchestration material should live in artifacts, not in always-loaded prompt files. The goal is to keep root prompt context tiny and move operational complexity into skills, docs, and implementation repos.

Prompt root
Keep: tiny pointers, rules, defaults, one-line reminders
Examples: MEMORY.md, AGENTS.md, SOUL.md
Never: store full architecture docs, schemas, or long SOPs
knowledge/
Best for: durable design docs and distilled decisions
Examples: solution writeups, conventions, architecture notes
Rule: reference from root, don’t inline into root
skills/
Best for: reusable workflows, SOPs, scripts, operational logic
Examples: dispatcher flows, Linear sync routines, review loops
Rule: if it’s procedural and repeatable, it probably belongs here
Implementation repo
Best for: real specs, schema migrations, code, tests, docs
Examples: docs/, db/, src/
Rule: buildable systems should eventually leave the workspace
Low risk (current viz approach): Large HTML in viz/ is fine. It is not part of the always-loaded root prompt files.
Medium risk: Letting architecture decisions sprawl across chat, scratch notes, and root memory without consolidation.
High risk: Copying long specs into MEMORY.md or bloating AGENTS.md with implementation detail.
Recommended path: Keep this viz as the visual explainer, add one distilled solution doc in knowledge/solutions/, and move real implementation work into the target repo or a dedicated skill.
Practical rule of thumb
  • If humans need to read it once, publish or document it.
  • If agents need to repeatedly follow it, make it a skill.
  • If the runtime depends on it, put it in the implementation repo.
  • If it must live in root prompt context, compress it to a pointer and one sentence.

Deterministic control loop

This is the operational heartbeat of the system, and it should be code, not vibes.

1. Ingest
pull + webhook
Read Linear changes, GitHub facts, and worker completions. Convert everything into canonical internal events.
2. Reconcile
deterministic
Compute next state for every task. Decide whether to dispatch, retry, block, escalate, or complete using deterministic policy.
3. Project
human-facing
Update Linear status, add a compact summary comment, and refresh Ark Viz/dashboard views without asking an LLM to manage routine state.
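The reconcile step in particular lends itself to a pure decision function. The sketch below picks one of the five actions named in step 2 for each task. The input fields (`deps_clear`, `last_exit`, `attempts`), the `succeeded` exit reason, and the retry budget of three are assumptions for illustration, not policy taken from this sketch's design.

```python
def decide(deps_clear, last_exit, attempts, max_attempts=3):
    """Deterministic policy for one task: returns the next action name."""
    if not deps_clear:
        return "block"
    if last_exit is None:
        return "dispatch"        # no run has finished yet
    if last_exit == "succeeded":
        return "complete"
    if last_exit == "spec_ambiguous":
        return "escalate"        # a retry cannot resolve a spec question
    if attempts < max_attempts:
        return "retry"
    return "escalate"            # retry budget exhausted


def tick(observations):
    """One reconcile pass over all tasks, in a stable order."""
    return {key: decide(**observations[key]) for key in sorted(observations)}


# Hypothetical observations gathered during the ingest step.
observations = {
    "13": {"deps_clear": True, "last_exit": "tests_failed", "attempts": 1},
    "14": {"deps_clear": False, "last_exit": None, "attempts": 0},
    "15": {"deps_clear": True, "last_exit": "succeeded", "attempts": 1},
    "16": {"deps_clear": True, "last_exit": "review_rejected", "attempts": 3},
}
print(tick(observations))
```

Ingest would feed `observations` from canonical events, and the project step would consume the resulting actions. Neither needs an LLM for the routine path.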
Suggested coarse Linear statuses mapped from richer internal state
  • todo ← discovered, blocked-but-unstarted
  • in_progress ← queued, coding, revising, awaiting_ci
  • blocked ← blocked, awaiting_human, auth_or_permissions
  • in_review ← awaiting_review, pr_open, awaiting_decision
  • done ← completed
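The mapping above, combined with a comment-hash check to avoid noisy duplicate comments, can be sketched as follows. The table mirrors the list, but treating "blocked-but-unstarted" as "blocked with no runs yet" is an assumption, as are the function names and the choice of SHA-256.

```python
import hashlib

# Internal state -> coarse Linear status, following the list above.
LINEAR_STATUS = {
    "discovered": "todo",
    "queued": "in_progress",
    "coding": "in_progress",
    "revising": "in_progress",
    "awaiting_ci": "in_progress",
    "blocked": "blocked",
    "awaiting_human": "blocked",
    "auth_or_permissions": "blocked",
    "awaiting_review": "in_review",
    "pr_open": "in_review",
    "awaiting_decision": "in_review",
    "completed": "done",
}


def project_status(state, has_runs):
    """Map internal state to a Linear status."""
    # Assumption: a blocked task with no runs yet counts as blocked-but-unstarted.
    if state == "blocked" and not has_runs:
        return "todo"
    return LINEAR_STATUS[state]


def comment_hash(body):
    """Stable fingerprint of a summary comment, ignoring surrounding whitespace."""
    return hashlib.sha256(body.strip().encode("utf-8")).hexdigest()


def should_post(body, last_comment_hash):
    """Post only when the summary differs from the last one pushed."""
    return comment_hash(body) != last_comment_hash


summary = "PR opened, checks pending."
previous = comment_hash(summary)
print(project_status("revising", has_runs=True))
print(should_post(summary + "\n", previous))
```

Projection stays one-way in this sketch: the Linear status is always derived from internal state and is never read back as the source of truth.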